[PROJ] Switch to proj-datumgrid-geotiff for PROJ 7 ?

Greg Troxel gdt at lexort.com
Mon Jan 13 09:16:27 PST 2020


Even Rouault <even.rouault at spatialys.com> writes:

>> You say "maintenance", but would there be new releases of packages
>> derived from proj-datumgrid, for example for the benefit of proj5/6
>> users that have not upgraded?  Or do you mean "it will just sit there as
>> an archive"?
>
> That's an open question. As long as we maintain PROJ 6.x, we need at a minimum 
> to maintain proj-datumgrid. But the maintenance might be minimal. Just accept 
> fixes to currently delivered material, and new content would be for
> proj-datumgrid-geotiff. And at some point, proj-datumgrid would be frozen.

Sounds reasonable.

>> > Currently:
>> > proj-datumgrid: total size: 703 MB as 5 .zip and 1.5 GB uncompressed
>> > proj-datumgrid-geotiff: total size: 486 MB
>> 
>> Do you expect the sizes for the same data to be different?  It seems
>> obvious that every file in the old directory needs to be transformed to
>> tif and put in the new one -- but again I may be missing something.
>
> The content in both repositories is the same. proj-datumgrid-geotiff files are 
> smaller because they use the TIFF Predictor mechanism which increases the 
> efficiency of deflate compression over what .zip can do, hence 486 MB < 703 
> MB.

That's a good reason to switch, especially as the TIFF compression
remains on people's disks, whereas unzipping on distribution does not.
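
To illustrate the predictor point, here is a minimal sketch of why
differencing neighbouring samples helps deflate.  It is a simplified
integer version of horizontal differencing on made-up data, not the
actual grid files or the exact predictor they use; the grid contents and
function names are assumptions for illustration only.

```python
import math
import struct
import zlib

WIDTH, HEIGHT = 256, 64


def synthetic_grid():
    """Smooth, made-up 16-bit samples standing in for a shift grid."""
    return [
        [30000 + int(2000 * math.sin(r / 20)) + int(1500 * math.cos(c / 15))
         for c in range(WIDTH)]
        for r in range(HEIGHT)
    ]


def horizontal_difference(row):
    """Keep the first sample, then store each sample minus its left neighbour."""
    return [row[0]] + [(row[c] - row[c - 1]) % 65536 for c in range(1, len(row))]


def pack(rows):
    """Serialize rows as little-endian unsigned 16-bit samples."""
    return b"".join(struct.pack(f"<{len(r)}H", *r) for r in rows)


grid = synthetic_grid()
raw_size = len(zlib.compress(pack(grid), 9))
predicted_size = len(
    zlib.compress(pack([horizontal_difference(r) for r in grid]), 9)
)
print("deflate only:", raw_size, "bytes")
print("difference then deflate:", predicted_size, "bytes")
```

On smooth data the differences are small and repetitive, so deflate
finds far more redundancy than it does in the raw samples.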

>> So if anything, I would think the repo should be split up into more
>> archives.  The current regions seem sensible, and then there perhaps is
>> another axis of normal things vs. esoteric things.   Right now I can't
>> articulate that and I am not sure that makes sense.
>> 
>> So for now, I would advise not changing the archive split plan, until we
>> have a good basis for believing that some other plan is good.
>
> Actually, in the github repo, I've changed the organization to be based on the 
> producer, mostly to make area of responsibility clearer:
> https://github.com/osgeo/proj-datumgrid-geotiff
> But currently everything is bundled in a single archive.

I view a single archive as a breaking change, whereas a mere change of
format is not, from the packaging viewpoint.  As I said, pkgsrc is
currently just grabbing them all, but this is becoming less reasonable.
At 500 MB, though, it doesn't seem terrible.  4GB would be something
else.


>> It seems really clear to me that these sorts
>> of asked-for operations are entirely necessary for the whole system to
>> make sense.  Surely there would be download by name.  It would be really
>> nice if one could ask "show me the transform pipelines that this request
>> would invoke" and also to get from that "this is the list of grids you
>> need and don't have for one or all" and then to be able to get them.
>
> projinfo and related API already report which grids are missing.

Great, and it would be nice to have a programmatic way to fetch them
based on the output of projinfo.
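
As a rough sketch of what such a helper could look like, the following
scans captured projinfo text for missing-grid messages and collects the
grid names and URLs.  The message wording in SAMPLE_OUTPUT is an
assumption and may differ between PROJ versions; the grid name, the URL
and the function names are hypothetical.

```python
import re
import urllib.request

# Assumed message format; the real projinfo wording may differ by version.
MISSING_GRID = re.compile(
    r"Grid (\S+) needed but not found on the system\."
    r"(?: Can be obtained .*?\bat (https?://\S+))?"
)

# Hypothetical captured output, not copied from a real projinfo run.
SAMPLE_OUTPUT = """\
Operation No. 1:
Grid example_grid.tif needed but not found on the system. Can be obtained at https://example.invalid/example_grid.tif
Operation No. 2:
Grid example_grid.tif needed but not found on the system. Can be obtained at https://example.invalid/example_grid.tif
"""


def missing_grids(text):
    """Return unique (grid_name, url_or_None) pairs in order of appearance."""
    seen = []
    for match in MISSING_GRID.finditer(text):
        pair = (match.group(1), match.group(2))
        if pair not in seen:
            seen.append(pair)
    return seen


def fetch(pairs, dest_dir="."):
    """Download each grid that has a URL (defined here but not called)."""
    for name, url in pairs:
        if url is not None:
            urllib.request.urlretrieve(url, f"{dest_dir}/{name}")


print(missing_grids(SAMPLE_OUTPUT))
```

Scraping text like this is fragile; a structured report from the API
would be a better basis if one is available.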

>> So what if someone has not enabled RFC4, and asks for a transform that
>> would use a grid if it were there.  Instead of downloading dynamically,
>> what happens?  I have always wanted that to throw an exception, unless
>> the user has disabled something, but I know I'm way over on the
>> "consistent outputs" side of things.
>
> RFC4 has not changed anything on that side. We cannot impose a particular grid 
> to be used, because sometimes the guess done by PROJ is not necessarily the 
> most appropriate (because of implementation limitations, or sometimes just 
> because it needs a human brain to decide).

I am not really following this.  I get it that sometimes humans need to
choose which approach is best.

What I meant is that in a world where all the grid files are on the
disk, one can say "show me all the plausible transforms, sorted by some
error metric".  If the grid file isn't present, I'd still like that
process to be able to operate the same way, but essentially throwing an
exception if a grid file that's needed isn't there -- as opposed to
pruning pipelines that make sense but which cannot currently be done
locally.  I realize this is a discussion about the pre-RFC4 world, but to
me this is part of avoiding those not opting into RFC4 becoming
second-class.
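
To make the distinction concrete, here is a small sketch of the two
behaviours.  It does not use or describe the PROJ API; the Candidate
type, the accuracy figures and the grid name are all hypothetical.

```python
from dataclasses import dataclass


class MissingGridError(Exception):
    """Raised when the chosen pipeline needs a grid that is not on disk."""


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy_m: float      # smaller is better
    grids: tuple = ()      # grid files this pipeline needs


def missing(candidate, available):
    return [g for g in candidate.grids if g not in available]


def choose_pruning(candidates, available):
    """Silently drop pipelines whose grids are absent, then pick the best."""
    usable = [c for c in candidates if not missing(c, available)]
    if not usable:
        raise MissingGridError("no usable pipeline")
    return min(usable, key=lambda c: c.accuracy_m)


def choose_strict(candidates, available):
    """Rank as if every grid were present; fail loudly if the best is not."""
    best = min(candidates, key=lambda c: c.accuracy_m)
    absent = missing(best, available)
    if absent:
        raise MissingGridError(
            f"{best.name} needs missing grid(s): {', '.join(absent)}"
        )
    return best


# Hypothetical candidates for one transform request.
CANDIDATES = [
    Candidate("grid-based (hypothetical)", 0.1, ("example_grid.tif",)),
    Candidate("parametric fallback (hypothetical)", 5.0),
]

print(choose_pruning(CANDIDATES, available=set()).name)
```

With no grids installed, the pruning variant quietly returns the less
accurate fallback, while the strict variant raises and names the grid to
fetch.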

