.. _raster.bag:

================================================================================
BAG -- Bathymetry Attributed Grid
================================================================================

.. shortname:: BAG

.. build_dependencies:: libhdf5

This driver provides read-only and creation support for bathymetry data in the
BAG format. BAG files are actually
a specific product profile in an HDF5 file, but a custom driver exists
to present the data in a more convenient manner than is available
through the generic HDF5 driver.

BAG files have two image bands representing Elevation (band 1) and
Uncertainty (band 2) values for each cell in a raster grid area.
Nominal Elevation values may also be present, and starting with GDAL 3.2, any
2D array of numeric data type and with the same dimensions as Elevation present
under BAG_root will be reported as a GDAL band.

The geotransform and coordinate system are extracted from the internal
XML metadata provided with the dataset. However, some products may use
coordinate system formats that are not supported, in particular when the
spatial reference system is not encoded as WKT.

The full XML metadata is available in the "xml:BAG" metadata domain.

Nodata, minimum and maximum values for each band are also reported.
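
As a minimal sketch of how this information might be read through the GDAL
Python bindings (assuming they are installed): ``summarize_bands`` and
``read_bag_summary`` are hypothetical helper names, not part of GDAL, and the
path passed to them is a placeholder.

.. code-block:: python

   def summarize_bands(ds):
       """Return one dict per band with its description, nodata, min and max.

       ``ds`` is expected to behave like an ``osgeo.gdal.Dataset``.
       """
       summary = []
       for index in range(1, ds.RasterCount + 1):  # GDAL bands are 1-based
           band = ds.GetRasterBand(index)
           summary.append(
               {
                   "band": index,
                   "description": band.GetDescription(),
                   "nodata": band.GetNoDataValue(),
                   "minimum": band.GetMinimum(),
                   "maximum": band.GetMaximum(),
               }
           )
       return summary


   def read_bag_summary(path):
       """Open ``path`` and return (band summary, BAG XML metadata or None)."""
       from osgeo import gdal  # imported lazily; requires the GDAL bindings

       gdal.UseExceptions()
       ds = gdal.Open(path)
       # Metadata domains starting with "xml:" come back as a list of strings.
       xml_items = ds.GetMetadata("xml:BAG")
       return summarize_bands(ds), (xml_items[0] if xml_items else None)

For example, ``read_bag_summary("data/test_vr.bag")`` would be expected to
return one entry for the elevation band and one for the uncertainty band.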

Driver capabilities
-------------------

.. supports_createcopy::

.. supports_create::

    This driver supports the :cpp:func:`GDALDriver::Create` operation

    .. versionadded:: 3.2

.. supports_georeferencing::

.. supports_virtualio::

Open options
------------

|about-open-options|
For open options specific to variable resolution grids, see the next section.

Other open options are:

- .. oo:: REPORT_VERTCRS
     :choices: YES, NO
     :default: YES
     :since: 3.2

     Whether to report
     the vertical CRS from BAG XML metadata as the vertical component of a
     compound CRS. If set to NO, only the horizontal part will be reported.
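
The following sketch shows one way such an open option could be passed from
Python, assuming the GDAL Python bindings are available.
``format_open_options`` and ``open_bag`` are hypothetical helpers introduced
for illustration only.

.. code-block:: python

   def format_open_options(options):
       """Turn a mapping into the ``KEY=VALUE`` strings expected by GDAL."""
       formatted = []
       for key, value in options.items():
           if isinstance(value, bool):  # map Python booleans to YES/NO
               value = "YES" if value else "NO"
           formatted.append(f"{key}={value}")
       return formatted


   def open_bag(path, **options):
       """Open a BAG file as a raster with the given open options."""
       from osgeo import gdal  # imported lazily; requires the GDAL bindings

       gdal.UseExceptions()
       return gdal.OpenEx(
           path, gdal.OF_RASTER, open_options=format_open_options(options)
       )

With ``ds = open_bag("my.bag", REPORT_VERTCRS=False)``, the CRS returned by
``ds.GetSpatialRef()`` should then contain only the horizontal part.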

Variable resolution (VR) grid support
-------------------------------------

GDAL can handle BAG files with `variable resolution
grids <https://bitbucket.org/ccomjhc/openns/raw/master/docs/VariableResolution/2017-08-10_VariableResolution.docx>`__.
Such datasets are made of a low-resolution grid, which is the one
presented by default by the driver, and for each of those low-resolution
cells, a higher resolution grid can be present in the file. Such higher
resolution grids are dubbed "supergrids" in GDAL.

The driver has different modes of working which can be controlled by the
MODE open option:

.. oo:: MODE
   :choices: LOW_RES_GRID, LIST_SUPERGRIDS, RESAMPLED_GRID, INTERPOLATED, AUTO
   :default: AUTO

   Driver mode.

   -  MODE=LOW_RES_GRID: this is the default mode (unless opening a supergrid). The driver will expose
      the low resolution grid, and indicate in the dataset metadata if the
      dataset has supergrids (HAS_SUPERGRIDS=TRUE), as well as the minimum
      and maximum resolution of those grids.

   -  MODE=LIST_SUPERGRIDS: in this mode, the driver will report the
      various supergrids in the subdataset list. Additional open options
      can be applied in this mode to restrict the search:

      -  .. oo:: SUPERGRIDS_INDICES
            :choices: <(y1\,x1)\,(y2\,x2)\,...>

            Tuple, or list of tuples,
            of supergrids described by their y,x indices (starting from 0, y
            from the south of the grid, x from the west of the grid).

      -  MINX=value: Minimum georeferenced X value to use as a filter for
         the supergrids to list.
      -  MINY=value: Minimum georeferenced Y value to use as a filter for
         the supergrids to list.
      -  MAXX=value: Maximum georeferenced X value to use as a filter for
         the supergrids to list.
      -  MAXY=value: Maximum georeferenced Y value to use as a filter for
         the supergrids to list.
      -  RES_FILTER_MIN=value: Minimum resolution of supergrids to take
         into account (excluded bound)
      -  RES_FILTER_MAX=value: Maximum resolution of supergrids to take
         into account (included bound)

   -  Opening a supergrid. This mode is triggered by using as a dataset
      name a string formatted like BAG:my.bag:supergrid:{y}:{x}, which is
      the value of the SUBDATASET_x_NAME metadata items reported by the
      above described mode. {y} is the index (starting from 0, from the
      south of the grid), and {x} is the index (starting from 0, from the
      west of the grid) of the supergrid to open.

   -  MODE=RESAMPLED_GRID: in this mode, the user specifies the extent and
      resolution of a target grid, and for each cell of this target grid,
      the driver will find the nodes of the supergrids that fall into that
      cell. By default, it will select the node with the maximum elevation
      value to populate the cell value. If no node of any supergrid is
      found, the cell value will be set to the nodata value.
      Overviews are reported. Note that those
      overviews correspond to resampled grids computed with different
      values of the RESX and RESY parameters, but using the same value
      population rules (and not nearest neighbour resampling of the full
      resolution resampled grid). The last overview level is the
      low-resolution grid (if SUPERGRIDS_MASK=NO).

      The available open options in this mode are:

      -  MINX=value: Minimum georeferenced X value for the resampled grid.
         By default, the corresponding value of the low resolution grid.
      -  MINY=value: Minimum georeferenced Y value for the resampled grid.
         By default, the corresponding value of the low resolution grid.
      -  MAXX=value: Maximum georeferenced X value for the resampled grid.
         By default, the corresponding value of the low resolution grid.
      -  MAXY=value: Maximum georeferenced Y value for the resampled grid.
         By default, the corresponding value of the low resolution grid.
      -  RESX=value: Horizontal resolution. By default, and if RES_STRATEGY
         is set to AUTO, this will be the minimum resolution among all the
         supergrids.
      -  RESY=value: Vertical resolution (positive value). By default, and
         if RES_STRATEGY is set to AUTO, this will be the minimum
         resolution among all the supergrids.
      -  RES_STRATEGY=AUTO/MIN/MAX/MEAN: Which strategy to apply to set the
         resampled grid resolution. By default, if none of RESX, RESY,
         RES_FILTER_MIN and RES_FILTER_MAX is specified, the AUTO strategy
         will correspond to the MIN strategy: that is the minimum
         resolution among all the supergrids is used. If MAX is specified,
         the maximum resolution among all the supergrids is used. If MEAN
         is specified, the mean resolution among all the supergrids is
         used. RESX and RESY, if defined, will override the resolution
         determined by RES_STRATEGY.
      -  RES_FILTER_MIN=value: Minimum resolution of supergrids to take
         into account (excluded bound, except if it is the minimum
         resolution of supergrids). By default, the minimum resolution of
         supergrids available. If this value is specified and none of
         RES_STRATEGY, RES_FILTER_MAX, RESX or RESY is specified, the
         maximum resolution among all the supergrids will be used as the
         resolution for the resampled grid.
      -  RES_FILTER_MAX=value: Maximum resolution of supergrids to take
         into account (included bound). By default, the maximum resolution
         of supergrids available. If this value is specified and none of
         RES_STRATEGY, RESX or RESY is specified, this will also be used as
         the resolution for the resampled grid.
      -  VALUE_POPULATION=MIN/MAX/MEAN/COUNT: Which value population strategy to
         apply to compute the resampled cell values. This defaults to MAX:
         the elevation value of a target cell is the maximum elevation of
         all supergrid nodes (potentially filtered with RES_FILTER_MIN
         and/or RES_FILTER_MAX) that fall into this cell; the corresponding
         uncertainty will be the uncertainty of the source node where this
         maximum elevation is reached. If no supergrid node falls into the
         target cell, the nodata value is set. The MIN strategy is similar,
         except that this is the minimum elevation value among intersecting
         nodes that is selected. The MEAN strategy uses the mean value of
         the elevation of intersecting nodes, and the maximum uncertainty
         of those nodes.
         The COUNT strategy (GDAL >= 3.2) exposes one single UInt32 band where
         each target cell contains the count of supergrid nodes that fall into it.
      -  SUPERGRIDS_MASK=YES/NO. Defaults to NO. If set to YES, instead of
         the elevation and uncertainty bands, the dataset contains a single
         Byte band which is boolean valued. For a target cell, if at least
         one supergrid node (potentially filtered with RES_FILTER_MIN
         and/or RES_FILTER_MAX) falls into the cell, the cell value is set
         to 255. Otherwise it is set to 0. This can be used to distinguish
         whether elevation values at nodata are due to no source supergrid
         node falling into them, or to that/those supergrid nodes being
         themselves at the nodata value.
      -  NODATA_VALUE=value. Override the default value, which is usually
         1000000.

   -  MODE=INTERPOLATED: (GDAL >= 3.8) in this mode, the user specifies the extent and
      resolution of a target grid, and for each cell of this target grid,
      the driver will interpolate the value from the surrounding nodes of the
      supergrid(s) that apply to the cell, using in priority bilinear
      interpolation (for a node that falls within a supergrid and when none
      of the 4 surrounding nodes of the supergrid is at the nodata value), and
      falling back to barycentric interpolation or inverse distance weighting in
      other situations.

      Overviews are reported. Note that those overviews correspond to resampled
      grids computed with different values of the RESX and RESY parameters, but
      for performance reasons, the interpolation is still limited to the immediate
      neighbours of the target grid in the supergrid(s), which results in
      the equivalent of nearest-neighbour downsampling of the full resolution
      raster. The last overview level is the low-resolution grid.

      The available open options in this mode are MINX, MINY, MAXX, MAXY,
      RESX, RESY, RES_STRATEGY, RES_FILTER_MIN, RES_FILTER_MAX and NODATA_VALUE
      (see their description above for MODE=RESAMPLED_GRID).
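
The VALUE_POPULATION rules of MODE=RESAMPLED_GRID can be illustrated with a
small pure-Python model. This is only a sketch of the rules as described
above, not the driver's implementation: ``populate_cell`` is a hypothetical
function, nodes are modelled as ``(elevation, uncertainty)`` pairs, source
nodes that are themselves at nodata are not handled, and setting the
uncertainty to nodata for an empty cell is an assumption.

.. code-block:: python

   from statistics import mean

   NODATA = 1000000.0  # usual BAG nodata value (see NODATA_VALUE)


   def populate_cell(nodes, strategy="MAX", nodata=NODATA):
       """Compute the value of one target cell.

       ``nodes`` lists the (elevation, uncertainty) pairs of the supergrid
       nodes falling into the cell. Returns a node count for COUNT, and an
       (elevation, uncertainty) pair for the other strategies.
       """
       if strategy == "COUNT":
           return len(nodes)
       if not nodes:
           # Assumption: both bands are set to nodata for an empty cell.
           return (nodata, nodata)
       if strategy == "MAX":
           # Uncertainty is taken from the node where the maximum is reached.
           return max(nodes, key=lambda node: node[0])
       if strategy == "MIN":
           return min(nodes, key=lambda node: node[0])
       if strategy == "MEAN":
           # Mean elevation, paired with the maximum uncertainty.
           elevations = [elevation for elevation, _ in nodes]
           uncertainties = [uncertainty for _, uncertainty in nodes]
           return (mean(elevations), max(uncertainties))
       raise ValueError(f"unknown strategy: {strategy}")

For three nodes ``[(-10.0, 0.5), (-4.0, 0.2), (-7.0, 0.9)]``, MAX keeps the
shallowest node together with its own uncertainty, whereas MEAN pairs the mean
elevation with the largest uncertainty of the three.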

Spatial metadata support
------------------------

Starting with GDAL 3.2, GDAL can expose BAG files with `spatial metadata
<https://github.com/OpenNavigationSurface/BAG/issues/2>`__.

When such spatial metadata is present, the subdataset list will include
names of the form 'BAG:"{filename}":georef_metadata:{name_of_layer}',
where ``name_of_layer`` is the name of a subgroup under ``/BAG_root/Georef_metadata``.

The values of the ``keys`` dataset under each metadata layer are used as the
GDAL raster value, and the corresponding ``values`` dataset is exposed as a
GDAL Raster Attribute Table associated with the GDAL raster band. If ``keys``
is absent, record 1 of ``values`` is assumed to apply to each elevation point
that does not match the nodata value of the elevation band.

When variable resolution grids are present, the MODE=LIST_SUPERGRIDS open option
will cause subdatasets with names of the form 'BAG:"{filename}":georef_metadata:{name_of_layer}:{y}:{x}'
to be reported. When opening such a subdataset, the ``varres_keys`` dataset will
be used to populate the GDAL raster value.
If ``varres_keys`` is absent, record 1 of ``values`` is assumed to apply to
each elevation point that does not match the nodata value of the variable resolution
elevation band.
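
The subdataset names described in this section, like the supergrid names of
the previous one, follow a regular pattern that can be split into its
components. The sketch below is an illustration only: ``parse_bag_subdataset``
and ``list_supergrids`` are hypothetical helpers, only the quoted-filename form
is handled, and filenames are assumed not to contain the ``":`` sequence.

.. code-block:: python

   def parse_bag_subdataset(name):
       """Split a BAG subdataset name into filename, kind, layer, y and x."""
       prefix = 'BAG:"'
       if not name.startswith(prefix):
           raise ValueError(f"not a quoted BAG subdataset name: {name}")
       filename, separator, rest = name[len(prefix):].partition('":')
       if not separator:
           raise ValueError(f"missing closing quote in: {name}")
       parts = rest.split(":")
       result = {"filename": filename, "kind": parts[0],
                 "layer": None, "y": None, "x": None}
       if parts[0] == "supergrid" and len(parts) == 3:
           result["y"], result["x"] = int(parts[1]), int(parts[2])
       elif parts[0] == "georef_metadata" and len(parts) == 2:
           result["layer"] = parts[1]
       elif parts[0] == "georef_metadata" and len(parts) == 4:
           result["layer"] = parts[1]
           result["y"], result["x"] = int(parts[2]), int(parts[3])
       else:
           raise ValueError(f"unrecognized BAG subdataset name: {name}")
       return result


   def list_supergrids(path):
       """Parse the subdataset names reported with MODE=LIST_SUPERGRIDS."""
       from osgeo import gdal  # imported lazily; requires the GDAL bindings

       gdal.UseExceptions()
       ds = gdal.OpenEx(path, gdal.OF_RASTER,
                        open_options=["MODE=LIST_SUPERGRIDS"])
       # GetSubDatasets() returns (name, description) tuples.
       return [parse_bag_subdataset(name) for name, _ in ds.GetSubDatasets()]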

Tracking list support
---------------------

When the dataset is opened in vector mode (ogrinfo, ogr2ogr, etc.), the tracking_list
dataset will be reported as an OGR vector layer.
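
A minimal sketch of reading that layer from Python, assuming the GDAL Python
bindings are installed. ``tracking_list_records`` and ``read_tracking_list``
are hypothetical helpers, and taking the first layer of the dataset as the
tracking list is an assumption of this sketch.

.. code-block:: python

   def tracking_list_records(layer):
       """Return every feature of an OGR layer as a dict of its attributes."""
       return [feature.items() for feature in layer]


   def read_tracking_list(path):
       """Open a BAG file in vector mode and return its tracking list rows."""
       from osgeo import gdal  # imported lazily; requires the GDAL bindings

       gdal.UseExceptions()
       ds = gdal.OpenEx(path, gdal.OF_VECTOR)
       # Assumption: the tracking list is the first (and only) vector layer.
       layer = ds.GetLayer(0)
       return tracking_list_records(layer)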

Creation support
----------------

The driver can create a BAG dataset (without
variable resolution extension) with the elevation and uncertainty bands
from a source dataset. The source dataset must be georeferenced, and
have one or two bands. The first band is assumed to be the elevation
band, and the second band the uncertainty band. If the second band is
missing, the uncertainty will be set to nodata.

The driver will instantiate the BAG XML metadata by using a template
file, which is by default
`bag_template.xml <https://raw.githubusercontent.com/OSGeo/gdal/master/data/bag_template.xml>`__,
found in the GDAL data definition files. This template contains
variables, present as ${KEYNAME} or ${KEYNAME:default_value} in the XML
file, that can be substituted by providing a creation option whose name
is the key name prefixed with VAR\_.
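
The substitution rule can be modelled in a few lines of Python. This is a
sketch of the behaviour described above rather than the driver's code:
``substitute_template`` is a hypothetical function, values are inserted
without any XML escaping, and raising an error for a variable with neither a
value nor a default is a choice made for this sketch.

.. code-block:: python

   import re

   # Matches ${KEYNAME} and ${KEYNAME:default_value}
   _VARIABLE_RE = re.compile(
       r"\$\{(?P<key>[A-Za-z0-9_]+)(?::(?P<default>[^}]*))?\}"
   )


   def substitute_template(template, creation_options):
       """Replace template variables using ``VAR_<KEYNAME>`` options."""

       def replace(match):
           option_name = "VAR_" + match.group("key")
           if option_name in creation_options:
               return creation_options[option_name]
           if match.group("default") is not None:
               return match.group("default")
           raise KeyError(f"no value or default for {match.group('key')}")

       return _VARIABLE_RE.sub(replace, template)

For instance, a template fragment ``<a>${ABSTRACT:}</a>`` combined with the
option ``VAR_ABSTRACT=My abstract`` would become ``<a>My abstract</a>``.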

|about-creation-options|
The following creation options are supported:

-  .. co:: VAR_INDIVIDUAL_NAME
      :default: unknown

      String to fill contact/CI_ResponsibleParty/individualName.

-  .. co:: VAR_ORGANISATION_NAME
      :default: unknown

      String to fill contact/CI_ResponsibleParty/organisationName.

-  .. co:: VAR_POSITION_NAME
      :default: unknown

      String to fill contact/CI_ResponsibleParty/positionName.

-  .. co:: VAR_DATE
      :choices: <YYYY-MM-DD>
      :default: current date

      Value to fill dateStamp/Date.

-  .. co:: VAR_VERT_WKT

      WKT string to fill
      referenceSystemInfo/MD_ReferenceSystem/referenceSystemIdentifier/RS_Identifier/code
      for the vertical coordinate reference system. If not provided, and if
      the input CRS is not a compound CRS, defaults to VERT_CS["unknown",
      VERT_DATUM["unknown", 2000]].

-  .. co:: VAR_ABSTRACT
      :default: <empty string>

      String to fill identificationInfo/abstract.

-  .. co:: VAR_PROCESS_STEP_DESCRIPTION
      :default: Generated by GDAL x.y.z

      String to fill dataQualityInfo/lineage/LI_Lineage/processStep/LI_ProcessStep/description.

-  .. co:: VAR_DATETIME
      :choices: <YYYY-MM-DDTHH:MM:SS>
      :default: current datetime

      Value to fill
      dataQualityInfo/lineage/LI_Lineage/processStep/LI_ProcessStep/dateTime/DateTime.

-  .. co:: VAR_RESTRICTION_CODE
      :choices: <enumerated_value>
      :default: otherRestrictions

      Value to fill
      metadataConstraints/MD_LegalConstraints/useConstraints/MD_RestrictionCode.

-  .. co:: VAR_OTHER_CONSTRAINTS
      :default: unknown

      String to fill metadataConstraints/MD_LegalConstraints/otherConstraints.

-  .. co:: VAR_CLASSIFICATION
      :choices: <enumerated_value>
      :default: unclassified

      Value to fill
      metadataConstraints/MD_SecurityConstraints/classification/MD_ClassificationCode.

-  .. co:: VAR_SECURITY_USER_NOTE
      :default: none

      String to fill metadataConstraints/MD_SecurityConstraints/userNote.

Other required variables found in the template, such as RES, RESX, RESY,
RES_UNIT, HEIGHT, WIDTH, CORNER_POINTS and HORIZ_WKT will be
automatically filled from the input dataset metadata.

The following additional creation options are available:

-  .. co:: TEMPLATE
      :choices: <filename>

      Path to an XML file that can serve as a template.
      This will typically be a customized version of the base
      bag_template.xml file. The file can contain substitutable
      variables other than the ones mentioned above by using a similar syntax.

-  .. co:: VAR_xxxx

      Substitute variable ${xxxx} in the template XML
      with the provided value.

-  .. co:: BAG_VERSION
      :default: 1.6.2

      Value to write in the /BAG_root/BAG Version attribute.

-  .. co:: COMPRESS
      :choices: NONE, DEFLATE
      :default: DEFLATE

      Compression for elevation and uncertainty grids.

-  .. co:: ZLEVEL
      :choices: 1-9
      :default: 6

      Deflate compression level.

-  .. co:: BLOCK_SIZE
      :choices: <integer>

      Chunking size of the HDF5 arrays. Defaults
      to 100, or the maximum dimension of the raster if smaller than 100.
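
As a sketch of how these creation options might be assembled and passed from
Python (assuming the GDAL Python bindings are installed):
``bag_creation_options`` and ``translate_to_bag`` are hypothetical helpers,
and the validation they perform only mirrors the choices listed above.

.. code-block:: python

   def bag_creation_options(variables=None, compress="DEFLATE", zlevel=6,
                            block_size=None):
       """Build a list of ``KEY=VALUE`` creation options for the BAG driver.

       ``variables`` maps template key names (without the VAR_ prefix) to
       their values.
       """
       if compress not in ("NONE", "DEFLATE"):
           raise ValueError("COMPRESS must be NONE or DEFLATE")
       options = [f"VAR_{key}={value}"
                  for key, value in (variables or {}).items()]
       options.append(f"COMPRESS={compress}")
       if compress == "DEFLATE":
           if not 1 <= zlevel <= 9:
               raise ValueError("ZLEVEL must be between 1 and 9")
           options.append(f"ZLEVEL={zlevel}")
       if block_size is not None:
           options.append(f"BLOCK_SIZE={block_size}")
       return options


   def translate_to_bag(src_path, dst_path, **kwargs):
       """Convert a georeferenced raster to BAG with the given options."""
       from osgeo import gdal  # imported lazily; requires the GDAL bindings

       gdal.UseExceptions()
       return gdal.Translate(
           dst_path, src_path, format="BAG",
           creationOptions=bag_creation_options(**kwargs),
       )

A call such as ``translate_to_bag("in.tif", "out.bag",
variables={"ABSTRACT": "My abstract"})`` is intended to match the
``gdal_translate`` command with ``-co "VAR_ABSTRACT=My abstract"``.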

Usage examples
--------------

-  Opening in low resolution mode:

   ::

      $ gdalinfo data/test_vr.bag

      [...]
      Size is 6, 4
      [...]
        HAS_SUPERGRIDS=TRUE
        MAX_RESOLUTION_X=29.900000
        MAX_RESOLUTION_Y=31.900000
        MIN_RESOLUTION_X=4.983333
        MIN_RESOLUTION_Y=5.316667
      [...]

-  Displaying available supergrids:

   ::

      $ gdalinfo data/test_vr.bag -oo MODE=LIST_SUPERGRIDS

      [...]
      Subdatasets:
        SUBDATASET_1_NAME=BAG:"data/test_vr.bag":supergrid:0:0
        SUBDATASET_1_DESC=Supergrid (y=0, x=0) from (x=70.100000,y=499968.100000) to (x=129.900000,y=500031.900000), resolution (x=29.900000,y=31.900000)
        SUBDATASET_2_NAME=BAG:"data/test_vr.bag":supergrid:0:1
        SUBDATASET_2_DESC=Supergrid (y=0, x=1) from (x=107.575000,y=499976.075000) to (x=152.424999,y=500023.924999), resolution (x=14.950000,y=15.950000)
      [...]
        SUBDATASET_24_NAME=BAG:"data/test_vr.bag":supergrid:3:5
        SUBDATASET_24_DESC=Supergrid (y=3, x=5) from (x=232.558335,y=500077.391667) to (x=267.441666,y=500114.608334), resolution (x=4.983333,y=5.316667)
      [...]

-  Opening a particular supergrid:

   ::

      $ gdalinfo BAG:"data/test_vr.bag":supergrid:3:5

-  Converting a BAG in resampling mode with default parameters (use of
   minimum resolution of supergrids, MAX value population rule):

   ::

      $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID out.tif

-  Converting a BAG in resampling mode with a particular grid origin and
   resolution:

   ::

      $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo MINX=80 -oo MINY=500000 -oo RESX=16 -oo RESY=16 out.tif

-  Converting a BAG in resampling mode, with a mask indicating where
   supergrid nodes intersect the cells of the resampled dataset:

   ::

      $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo SUPERGRIDS_MASK=YES out.tif

-  Converting a BAG in resampling mode, by filtering on supergrid
   resolutions (the resampled grid will use 8 meter resolution by
   default, that is, the value of RES_FILTER_MAX):

   ::

      $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo RES_FILTER_MIN=4 -oo RES_FILTER_MAX=8 out.tif

-  Converting a GeoTIFF file to a BAG dataset, providing a custom
   value for the ABSTRACT substitutable variable:

   ::

      $ gdal_translate in.tif out.bag -co "VAR_ABSTRACT=My abstract"

-  Converting a (VR) BAG in resampling mode with a particular grid
   resolution (5m) to a BAG dataset (without variable resolution
   extension), providing a custom value for the ABSTRACT metadata:

   ::

      $ gdal_translate data/test_vr.bag -oo MODE=RESAMPLED_GRID -oo RESX=5 -oo RESY=5 out.bag -co "VAR_ABSTRACT=My abstract"

-  Displaying the tracking list:

   ::

      $ ogrinfo -al data/my.bag

See Also
--------

-  Implemented as :source_file:`frmts/hdf5/bagdataset.cpp`.
-  `The Open Navigation Surface Project <http://www.opennavsurf.org>`__
-  `Description of Bathymetric Attributed Grid Object (BAG) Version
   1.6 <https://github.com/OpenNavigationSurface/BAG/raw/master/docs/BAG_FSD_Release_1.6.3.doc>`__
-  `Variable resolution grid extension for BAG
   files <https://github.com/OpenNavigationSurface/BAG/raw/master/docs/VariableResolution/2017-08-10_VariableResolution.docx>`__
-  :ref:`S-102 driver <raster.s102>`
