[gdal-dev] inconsistent naming for the CLI "input" argument
Chris Toney
jctoney at gmail.com
Sun Apr 5 11:22:16 PDT 2026
Even,
Thanks for the comments, that helps a lot. If I understand correctly now:
"input" alone - not modified
"dataset" alone - modified / is input and output
"input" with "dataset" alias - the input may be modified
Going back through with that in mind, it looks largely consistent with
a small number of exceptions. Starting with "raster edit", it has only
"dataset" which is input and output, then...
"raster info"
-i, --dataset, --input <INPUT>
Makes sense, input may be modified, e.g., if --stats are computed they
may be set back on the input dataset.
"vector info"
-i, --dataset, --input <INPUT>
Same, input may be modified if --sql is given for UPDATE, DELETE, etc.
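To make that reading concrete, here is a rough Python sketch of the convention as I understand it. The helper name and the returned labels are made up for illustration and are not part of GDAL; it only looks at the set of long names/aliases an argument answers to.

```python
def classify_input_arg(names):
    """Classify a dataset argument purely from its long names and aliases.

    `names` holds the names without leading dashes, e.g. {"input", "dataset"}.
    Hypothetical helper for illustration only, not a GDAL function.
    """
    names = set(names)
    has_input = "input" in names
    has_dataset = "dataset" in names
    if has_input and has_dataset:
        # "input" with "dataset" alias: the input may be modified
        return "may-be-modified"
    if has_dataset:
        # "dataset" alone: modified, it is both input and output
        return "modified"
    if has_input:
        # "input" alone: not modified
        return "not-modified"
    # Neither name is present, so the convention says nothing
    return "unknown"
```

With the names shown above, `classify_input_arg({"dataset", "input"})` gives "may-be-modified" for "raster info" and "vector info", and `classify_input_arg({"dataset"})` gives "modified" for "raster edit".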
Exceptions:
"vector sql"
-i, --input <INPUT>
Should also have "dataset" alias?
>Starting with GDAL 3.12, when using --update, and without an output dataset specified, this can be used to execute statements that modify the input dataset, such as UPDATE, DELETE, etc.
"gdal info"
-i, --input <INPUT>
Add "dataset" alias to align with "raster info" and "vector info"?
(and "mdim info" also has "dataset" alias)
Then these three probably should be aligned:
"raster overview add"
-i, --dataset, --input <INPUT> Dataset (to be updated in-place, unless --external)
"raster overview delete"
--dataset <DATASET> Dataset (to be updated in-place, unless --read-only)
"raster overview refresh"
--dataset <DATASET> Dataset (to be updated in-place, unless --external)
This one has "dataset" alias but does not modify input, seems to be a
lone exception in that regard:
"raster update"
-i, --dataset, --input <INPUT>
-o, --output <OUTPUT>
I think "dataset" should be kept based on that understanding.
I don't have strong feelings on the <meta_var> names, other than that
consistency is good. The existing "inconsistencies" are pretty minor,
so I'm not sure changes are really needed.
The aim is not to be super nitpicky over arg names and their aliases.
Motivation is API usage where application code takes dataset objects
as user input to CLI algorithms. The dataset objects may carry
information that should be used to parameterize the algorithm call.
The user may have already set properties on the object and should not
have to provide those explicitly again when passing to an algorithm.
The algorithm arg names must be used in parsing in some cases to infer
meaning, i.e., we cannot always rely only on querying properties of
the AlgorithmArg object. An example is "like" / "like-layer" /
"like-sql" / "like-where". Parsing for those needs to rely on the arg
names (and any potential aliases) for meaning. So the names/aliases
really should be always consistent, which I believe they are in that
case.
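A minimal sketch of what I mean by name-based parsing, assuming a hypothetical ArgSpec record that carries only a name and its aliases. This is not the GDAL AlgorithmArg API, just an illustration of why the names and aliases need to stay consistent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgSpec:
    """Hypothetical stand-in for an algorithm argument: a name plus aliases."""
    name: str
    aliases: tuple = ()


# The family of names mentioned above whose meaning comes from the name itself.
LIKE_FAMILY = ("like", "like-layer", "like-sql", "like-where")


def find_like_args(specs):
    """Map each like-family key to the canonical name of the matching argument.

    An argument matches if its name or any of its aliases is in LIKE_FAMILY.
    """
    found = {}
    for spec in specs:
        for key in (spec.name, *spec.aliases):
            if key in LIKE_FAMILY:
                found[key] = spec.name
    return found
```

If one algorithm exposed "like-layer" under a different name or alias, a lookup like this would silently miss it, which is the reason for wanting the names to be uniform.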
"input" and "dataset" aren't quite the same since we can query the
AlgorithmArg object and determine what it's for. But the more
consistent they can be in usage the better IMO.
Chris
On Sun, Apr 5, 2026 at 4:21 AM Even Rouault <even.rouault at spatialys.com> wrote:
>
> Hi Chris,
>
> > The main issue is the occasional use of "dataset" as an alias for
> > "input". It's inconsistently available as an alias which seems not
> > ideal, but it also shows up in unexpected ways.
> >
> > "raster edit" has only --dataset with no --input or -i:
> > --dataset <DATASET>
> The rationale was that raster edit only takes a single dataset which is
> both input and output. Input could also suggest that it won't be
> modified, which is not the case here. But I see we have hesitated in
> different similar (or similar looking, but subtly different) situations
> if we needed to expose input, dataset or both.
> >
> > "raster overview add" has:
> > -i, --dataset, --input <INPUT>
> >
> > But "raster overview delete" and "raster overview refresh" have only:
> > --dataset <DATASET>
> >
> > A dataset-specific one "dataset check" doesn't use it:
> > -i, --input <INPUT>
> For dataset check, the dataset isn't modified.
> >
> > Is the "dataset" alias really worth having?
> Good question. Happy to hear others' opinions on this.
> >
> > A couple others are unique cases that may not be a problem. These just
> > stand out as different since meta_var rarely deviates from the naming
> > pattern.
> >
> > "raster calc" has:
> > -i, --input <INPUTS>
> Plural to suggest you can specify several ones
> >
> > "raster blend" has:
> > -i, --color-input, --input <COLOR-INPUT>
> The metavar is important to remind the semantics because it accepts a
> second input dataset : --overlay <OVERLAY>
> >
> > Those are the only cases I've found where the meta_var name is
> > different than the long name. Nearly all have <INPUT> for --input even
> > if there is an alias, e.g., "raster pansharpen" has `-i,
> > --panchromatic, --input <INPUT>`.
>
> Similar to raster blend:
>
> -i, --panchromatic, --input <INPUT> Input panchromatic raster dataset
> [required]
> --spectral <SPECTRAL> Input spectral band dataset [1.. values] [required]
>
> The input name helps here to remember which dataset is implicit or not
> when you use it in a pipeline context (input must not be specified as
> the result of the previous step):
>
> gdal raster pipeline read panchro.tif ! pansharpen multispectral.tif ! write out.tif
>
>
> > I checked several others that can
> > take multiple input datasets, and "raster calc" is the only one I
> > found with plural INPUTS. Maybe that's not a big deal because the
> > meta_var is only for display in the documentation? It could still be
> > worth making them consistent for readability.
> Yes we could put plural INPUTS in other situations where input accepts
> multiple files
> > Since <INPUT> is almost
> > always used for the positional input dataset(s), when I see
> > <COLOR-INPUT> it looks like possibly something other than a raster
> > dataset.
> Should the metavar of INPUT be INPUT-DATASET in general case, and
> COLOR-INPUT-DATASET for blend / PANCHRO-DATASET for pansharpen ?
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.
>