[gdal-dev] Re: [Live-demo] Image for the GDAL overview

Hamish hamish_b at yahoo.com
Sun Feb 27 04:18:57 PST 2011

Even wrote:
> If I understand well what you are trying to do (try to
> evaluate the relative "popularity" of formats to present it
> graphically ?),

Yes. For the osgeo liveDVD demo disc we're trying to put together
a screenshot for the GDAL overview page. Screenshots of CLI or
libs are not very interesting, so we're being creative.

here's the current one (ignore image scaling, that's a sphinx
bug on the adhoc server; imagine it at 60% size),

> I'm afraid the method used here will lead to lots of false
> results and a biased view of the "reality"
> For example,
> * http://www.google.com/search?q=GMT shows that GMT is mainly
>  Greenwich Mean Time, Giant Magellan Telescope, Generative
> Modeling Technologies, ...

yes I know, that's why I pulled that one out as an example of
search terms which would need to be adjusted.

for GMT I think I'll change the search term to `"GMT+grid"`
and see how that does.

> * when I try "Portable Network Graphics", I get 1070000,
> and not 998000 (I've the feeling that the results given by
> Google depend somehow of your IP address) .

yes, rather frustrating. I find it often returns the answers
I already know about rather than the other uses of a term (and
I'm searching for things I don't know, not for what I do know).
that makes it hard to treat the counts as unbiased results..
anyway I run the searches while logged out of gmail, if that
helps. :)

e.g., if I do a search for "GRASS" while logged into gmail I
get "GRASS GIS" as the #2 result, after wikipedia. I could hope,
but I doubt random users see the same list. and a lot of hits

> And it will probably lead to far less hits than "PNG" which the 
> more popular term for it (but which is also Papua New
> Guinea...). PNG gives 1 010 000 000 hits !!!

For generic names I think we could add `"$NAME"+GIS` to the
search to remove some of the clutter.
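
As a rough sketch of that refinement, the snippet below wraps a
format name in quotes, appends a qualifier such as GIS, and builds
a search URL of the same form as the google.com/search?q= link
quoted above. The helper names build_query and search_url and the
default qualifier are illustrative assumptions, not an existing
script.

```python
from urllib.parse import urlencode


def build_query(name, qualifier="GIS"):
    """Return the search term in the "$NAME"+GIS form (hypothetical helper)."""
    return '"{}"+{}'.format(name, qualifier)


def search_url(name, qualifier="GIS"):
    """Build a search URL for the quoted name plus qualifier."""
    # In a URL query string '+' stands for a space, so encode the spaced form.
    terms = '"{}" {}'.format(name, qualifier)
    return "http://www.google.com/search?" + urlencode({"q": terms})


print(build_query("PNG"))
print(search_url("FITS", "raster"))
```

Swapping the qualifier (GIS, raster, grid) per format keeps the
term list easy to adjust by hand.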

> * "Shapefile" --> 2 570 000  but "ESRI Shapefile"
> only 183 000

right, more refinement needed there.

> * FIT and FITS have incredible high hits, while being quite
> obscure formats. 
> The reason for that popularity burst is that they are
> mainly English verbs... 

+GIS or +raster ?

> * KAK is not a very significant name (and none of the
> results in the first page of Google shows a link with the
> format) for the JP2KAK driver, which is itself 
> a GDAL-only codename for JPEG2000 with Kakadu library.

so perhaps reasonably accurate.

> * "PostgreSQL/PostGIS" --> 58 500, but "PostgreSQL"
> --> 9 650 000 and "PostGIS" --> 324 000 ...

probably reduce that one to just PostGIS.

> etc etc...

> What would be more interesting would be to install a spy in
> the GDAL lib which would connect to a web site and increase
> the hit count of the appropriate driver for each successful
> opening of a dataset :-)

users would love us for that.. :-)
(fwiw, see popcon package in debian/ubuntu)

> And I'm pretty sure you would see geotiff and shapefile appear
> in the top of the list.

yeah, the idea isn't to do an exhaustive scientific survey of
which formats are the most popular, just to provide some weighting
for a font-sizing algorithm in a promo image.
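
One way such a weighting could work, as a minimal sketch only: map
each hit count to a font size on a log scale, so that a term with a
billion hits doesn't dwarf everything else. The log scaling, the
10-48 pt range and the function name font_sizes are assumptions for
illustration; the hit counts are the ones quoted in this thread.

```python
import math


def font_sizes(hits, min_pt=10.0, max_pt=48.0):
    """Map hit counts to font sizes using log10 scaling (assumed scheme)."""
    # Clamp to 1 so a zero hit count doesn't break the logarithm.
    logs = {name: math.log10(max(count, 1)) for name, count in hits.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    sizes = {}
    for name, value in logs.items():
        # If every count is equal, put all names at the middle size.
        frac = 0.5 if span == 0 else (value - lo) / span
        sizes[name] = round(min_pt + frac * (max_pt - min_pt), 1)
    return sizes


# Hit counts as quoted earlier in the thread.
hits = {
    "PNG": 1010000000,
    "PostgreSQL": 9650000,
    "Shapefile": 2570000,
    "PostGIS": 324000,
}
sizes = font_sizes(hits)
for name in sorted(sizes, key=sizes.get, reverse=True):
    print(name, sizes[name])
```

With this scheme the largest count gets max_pt and the smallest
gets min_pt; everything else falls in between by order of magnitude.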

I feel my own use of GDAL is strongly skewed to the few
formats on the list that I use in my field, and doesn't necessarily
reflect what is most used by other users.

for shared editing of the search terms, I've thrown them up


