[gdal-dev] Re: [Live-demo] Image for the GDAL overview
Hamish
hamish_b at yahoo.com
Sun Feb 27 07:18:57 EST 2011
Even wrote:
> If I understand correctly what you are trying to do (trying to
> evaluate the relative "popularity" of formats to present it
> graphically?),
Yes. For the OSGeo LiveDVD demo disc we're trying to put together
a screenshot for the GDAL overview page. Screenshots of CLI tools
or libraries are not very interesting, so we're being creative.
Here's the current one (ignore the image scaling, that's a Sphinx
bug on the adhoc server; imagine it at 60% size):
http://adhoc.osgeo.osuosl.org/livedvd/docs/en/overview/gdal_overview.html
> I'm afraid the method used here will lead to lots of false
> results and a biased view of the "reality"
>
> For example,
>
> * http://www.google.com/search?q=GMT shows that GMT is mainly
> Greenwich Mean Time, Giant Magellan Telescope, Generative
> Modeling Technologies, ...
Yes, I know. That's why I pulled that one out as an example of
a search term which would need to be adjusted.
For GMT I think I'll change the search term to `"GMT+grid"`
and see how that does.
> * when I try "Portable Network Graphics", I get 1070000,
> and not 998000 (I have the feeling that the results given by
> Google depend somehow on your IP address).
Yes, rather frustrating. I find it often returns the results I
already know about rather than the other uses (and I'm searching
for things I don't know, not for what I do know), so it's hard to
treat the results as unbiased. Anyway, I do it while logged out
of gmail, if that helps. :)
E.g., if I search for "GRASS" while logged into gmail I get
"GRASS GIS" as the #2 result, after Wikipedia. I could hope,
but I doubt random users see the same list. There are also a lot
of hits for "GRASS in $HOME_COUNTRY".
> And it will probably lead to far fewer hits than "PNG", which is
> the more popular term for it (but which is also Papua New
> Guinea...). PNG gives 1 010 000 000 hits !!!
For generic names I think we could add `"$NAME"+GIS` to the
search to remove some of the clutter.
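A minimal sketch of how the adjusted search terms could be
generated, assuming the `"$NAME"+GIS` pattern above plus a small
table of per-format overrides. The names `build_query` and
`OVERRIDES` are hypothetical, the override entries are just the
adjustments floated in this thread, and the code only builds the
query string; it does not contact any search engine.

```python
# Hypothetical per-format overrides, taken from adjustments
# suggested in this thread (not a definitive list).
OVERRIDES = {
    "GMT": '"GMT+grid"',
    "PostgreSQL/PostGIS": "PostGIS",
}


def build_query(name, qualifier="GIS", overrides=OVERRIDES):
    """Return the search term to use for a format name.

    Generic names get quoted and qualified, e.g. "FITS"+GIS,
    unless an explicit override exists for that name.
    """
    if name in overrides:
        return overrides[name]
    return f'"{name}"+{qualifier}'


if __name__ == "__main__":
    for fmt in ["PNG", "FITS", "GMT", "PostgreSQL/PostGIS"]:
        print(fmt, "->", build_query(fmt))
```

Passing `qualifier="raster"` instead of the default gives the
`+raster` variant for names where `+GIS` does not help.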
> * "Shapefile" --> 2 570 000 but "ESRI Shapefile"
> only 183 000
right, more refinement needed there.
> * FIT and FITS have incredibly high hit counts, while being
> quite obscure formats.
> The reason for that popularity burst is that they are
> mainly English verbs...
+GIS or +raster ?
> * KAK is not a very significant name (and none of the
> results in the first page of Google shows a link with the
> format) for the JP2KAK driver, which is itself
> a GDAL-only codename for JPEG2000 with Kakadu library.
So perhaps that count is reasonably accurate.
> * "PostgreSQL/PostGIS" --> 58 500, but "PostgreSQL"
> --> 9 650 000 and "PostGIS" --> 324 000 ...
I'll probably reduce that one to just PostGIS.
> etc etc...
> What would be more interesting would be to install a spy in
> the GDAL lib which would connect to a web site and increase
> the hit count of the appropriate driver for each successful
> opening of a dataset :-)
Users would love us for that... :-)
(FWIW, see the popcon package in Debian/Ubuntu.)
> And I'm pretty sure you would see geotiff and shapefile appear
> in the top of the list.
Yeah, the idea isn't to do an exhaustive scientific survey of
which formats are the most popular, just to provide some weighting
for a font-sizing algorithm in a promo image.
I feel my own use of GDAL will be strongly skewed to a few
formats on the list that I use in my field, and not necessarily
what is most used by other users.
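To illustrate the kind of weighting meant here, a rough sketch
that maps hit counts to font sizes. The log scaling, the point
range, and the name `font_sizes` are all assumptions made for
illustration, not what the actual image script does; a log scale
is used only because the counts quoted in this thread span several
orders of magnitude.

```python
import math


def font_sizes(hits, min_pt=8.0, max_pt=48.0):
    """Map {format name: hit count} to {format name: font size}.

    Sizes are interpolated linearly on log10(hit count), so the
    least-hit format gets min_pt and the most-hit gets max_pt.
    """
    if not hits:
        return {}
    # Clamp to 1 so a zero-hit format does not break the log.
    logs = {name: math.log10(max(count, 1))
            for name, count in hits.items()}
    lo, hi = min(logs.values()), max(logs.values())
    span = hi - lo
    if span == 0:
        # All counts equal: use the middle of the range.
        return {name: (min_pt + max_pt) / 2 for name in logs}
    return {name: min_pt + (value - lo) / span * (max_pt - min_pt)
            for name, value in logs.items()}


if __name__ == "__main__":
    # Hit counts as quoted earlier in this thread.
    counts = {
        "Shapefile": 2570000,
        "PostGIS": 324000,
        "ESRI Shapefile": 183000,
    }
    for name, size in font_sizes(counts).items():
        print(f"{name}: {size:.1f}pt")
```

Since the counts only feed relative sizes, the exact numbers
matter less than getting the search terms roughly comparable.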
For shared editing of the search terms, I've put them up
here:
http://wiki.osgeo.org/wiki/GDAL_format_popularity
Hamish