.. _gdal_vector_pipeline:

================================================================================
``gdal vector pipeline``
================================================================================

.. versionadded:: 3.11

.. only:: html

    Process a vector dataset.

.. Index:: gdal vector pipeline

Synopsis
--------

.. program-output:: gdal vector pipeline --help-doc=main

A pipeline chains several steps, separated with the ``!`` (exclamation mark) character.
The first step must be ``read`` or ``concat``, and the last one ``write``. Each step has its
own positional or non-positional arguments. Apart from ``read``, ``concat`` and ``write``,
every step can be used several times in a pipeline.
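
Because ``!`` reaches :program:`gdal` as an ordinary argument, a long pipeline
can be laid out one step per line. The following is a minimal sketch rather than
a prescribed layout: it assumes a Bash shell, uses the illustrative file names
``in.gpkg`` and ``out.gpkg``, and assumes the input layer has a ``pop`` field.
It only prints the assembled command; remove ``echo`` to run it.

.. code-block:: bash

    # Keep each step on its own line in a Bash array.
    # '!' is quoted so that it is always taken as a literal separator.
    pipeline=(
        '!' read in.gpkg
        '!' filter --where "pop > 1e6"      # assumes a "pop" field exists
        '!' reproject --dst-crs=EPSG:32632
        '!' write out.gpkg --overwrite
    )

    # Dry run: print the command instead of executing it.
    # "${pipeline[@]}" expands each array element as a separate argument,
    # so the --where expression stays a single argument.
    echo gdal vector pipeline "${pipeline[@]}"

The ``filter`` and ``reproject`` steps sit between ``read`` and ``write``;
reordering or repeating them only requires moving lines of the array.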

The available steps are:

* read

.. program-output:: gdal vector pipeline --help-doc=read

* concat

.. program-output:: gdal vector pipeline --help-doc=concat

Details for options can be found in :ref:`gdal_vector_concat`.

* clip

.. program-output:: gdal vector pipeline --help-doc=clip

Details for options can be found in :ref:`gdal_vector_clip`.

* edit

.. program-output:: gdal vector pipeline --help-doc=edit

Details for options can be found in :ref:`gdal_vector_edit`.

* filter

.. program-output:: gdal vector pipeline --help-doc=filter

Details for options can be found in :ref:`gdal_vector_filter`.

* geom

.. program-output:: gdal vector pipeline --help-doc=geom

Details for options can be found in :ref:`gdal_vector_geom`.

* reproject

.. program-output:: gdal vector pipeline --help-doc=reproject

Details for options can be found in :ref:`gdal_vector_reproject`.

* select

.. program-output:: gdal vector pipeline --help-doc=select

Details for options can be found in :ref:`gdal_vector_select`.

* sql

.. program-output:: gdal vector pipeline --help-doc=sql

Details for options can be found in :ref:`gdal_vector_sql`.

* write

.. program-output:: gdal vector pipeline --help-doc=write

Description
-----------

:program:`gdal vector pipeline` applies a chain of processing steps to a vector
dataset (for example clipping, filtering or reprojecting it) and writes the
result to an output dataset.

GDALG output (on-the-fly / streamed dataset)
--------------------------------------------

A pipeline can be serialized as a JSON file using the ``GDALG`` output format.
The resulting file can then be opened as a vector dataset using the
:ref:`vector.gdalg` driver, which applies the specified pipeline in an
on-the-fly / streamed way.

The ``command_line`` member of the JSON file should nominally be the whole command
line without the final ``write`` step. This is what
``gdal vector pipeline ! .... ! write out.gdalg.json`` generates.

.. code-block:: json

    {
        "type": "gdal_streamed_alg",
        "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632"
    }

The final ``write`` step can be added, but if so it must explicitly specify the
``stream`` output format and a non-significant output dataset name.

.. code-block:: json

    {
        "type": "gdal_streamed_alg",
        "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write --output-format=stream streamed_dataset"
    }
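
Since a GDALG file is plain JSON, it can also be written by hand instead of
being generated by a ``write`` step. The following is a minimal sketch under
that assumption; the file names ``in.gpkg`` and ``in_epsg_32632.gdalg.json``
are illustrative only.

.. code-block:: bash

    # Write the GDALG file by hand. The quoted 'EOF' delimiter keeps the
    # shell from expanding anything inside the JSON body.
    cat > in_epsg_32632.gdalg.json <<'EOF'
    {
        "type": "gdal_streamed_alg",
        "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632"
    }
    EOF

The resulting file can then be opened like any other vector dataset, for
instance with ``gdal vector info in_epsg_32632.gdalg.json``.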


Examples
--------

.. example::
   :title: Reproject a GeoPackage file to CRS EPSG:32632 ("WGS 84 / UTM zone 32N")

   .. code-block:: bash

        $ gdal vector pipeline --progress ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write out.gpkg --overwrite

.. example::
   :title: Serialize the reprojection of a GeoPackage file to a GDALG file, and later read it

   .. code-block:: bash

        $ gdal vector pipeline --progress ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write in_epsg_32632.gdalg.json --overwrite
        $ gdal vector info in_epsg_32632.gdalg.json

.. example::
   :title: Union 2 source shapefiles (with similar structure), reproject them to EPSG:32632, keep only cities with more than 1 million inhabitants and write to a GeoPackage

   .. code-block:: bash

        $ gdal vector pipeline --progress ! concat --single --dst-crs=EPSG:32632 france.shp belgium.shp ! filter --where "pop > 1e6" ! write out.gpkg --overwrite
