[Benchmarking] Vector (OSM) data prep/plan

Smith, Michael ERDC-CRREL-NH michael.smith at usace.army.mil
Wed Feb 23 08:03:05 EST 2011


A further thought. There are some teams in this exercise (Oracle MapViewer
leaps to mind) that won't be pulling from PostGIS but rather from Oracle. But
the actual geometries being rendered should be the same (e.g., export the data
from PostGIS to Oracle) to keep an apples-to-apples comparison. I think Best
Effort could be the storage format that your team feels is optimal for its
use, but the data (geometry) itself should be identical (or as close as
possible). And for those teams that don't use PostGIS, would Oracle be
considered Baseline or Best Effort? I guess I'm still a little unclear (as
this email demonstrates) on where one begins and the other ends.

Mike


-- 
Michael Smith
Remote Sensing/GIS Center
US Army Corps of Engineers
 

On 2/23/11 6:04 AM, "Pirmin Kalberer" <pi_ml at sourcepole.com> wrote:

> Hi Dane,
> Since I can't attend the meeting today, some thoughts by mail. I have some
> concerns that in your "best effort" proposal we will test the data preparation
> and not the WMS performance. For me the "baseline" test is good enough and I
> support that. We already have a styling for Geofabrik shapefiles imported with
> shp2pgsql, which look similar to the Cloudmade shapefiles. As I mentioned in
> the IRC session, my favorite OSM to PostGIS mapping is OSM-in-a-box
> (http://dev.ifs.hsr.ch/redmine/projects/osminabox/wiki). Its advantages are
> customizable mappings and incremental imports. It would also allow us to
> produce an import for a defined reference date, which is important for a
> reproducible benchmark.
> Pirmin
> 
> 
> On Wednesday, 23 February 2011, at 07:27:34, Dane Springmeyer wrote:
>> For the meeting today
>> (http://wiki.osgeo.org/wiki/Benchmarking_2011#Next_IRC_Meeting) I'd like
>> to discuss the Vector data plan.
>> 
>> Short version:
>> 
>> My rough proposal is to plan for *both* a 1) "baseline" test using OSM data
>> in a postgis database imported from cloudmade shapefiles using shp2pgsql
>> (or some other commonly shared approach) and a 2) "best effort" in which
>> teams are encouraged to come up with better import and storage mechanisms
>> and the only limitation is that the styles result in visually identical
>> rendered tiles to the baseline tiles. Both the baseline and best effort
>> would be presented in Denver, but teams would only be mandated to provide
>> results for the baseline (the reason being that smaller teams may not have
>> the time or resources to complete a best effort approach).
>> 
>> I volunteer to help process OSM data for the baseline test, depending on
>> what people want to see.
>> 
>> Also, below I provide additional thoughts (the long version) on why I think
>> including a baseline test is important.
>> 
>> Cheers,
>> 
>> Dane
>> 
>> ----------------
>> 
>> As I understand from previous meetings, it was decided that OSM data for
>> Colorado would be a good test candidate. This is great.
>> 
>> I also understand that the approach would be "best effort" - meaning that
>> the method of processing the OSM data into a format suitable for rendering
>> would be up to the desires of each team.
>> 
>> This makes a lot of sense from the perspective of the data and users. OSM's
>> native format and postgis schema are not designed for rendering (nor is
>> its XML/PBF dump format, aka the "planet file") and there is a wide
>> variety of conversion and import tools for filtering it, turning
>> nodes/ways into OGC geometries, and otherwise prepping parts of it for
>> display. Advances made by benchmarking teams to think of great ways to
>> utilize and optimize OSM data for rendering will benefit all the many
>> consumers of OSM data.
>> 
>> But my concern with this plan is that we all need to recognize the time and
>> effort this approach requires.
>> 
>> I would assume that the various teams that seek to participate in this
>> exercise did not sign up to write OSM -> {some format} conversion scripts,
>> and this part of the exercise could end up taking a large proportion of
>> the effort if teams see that gains can be had by
>> filtering/simplifying/partitioning or otherwise optimizing during import
>> rather than during rendering. I saw in the notes that no "simplification"
>> would be allowed, but this is unrealistic because even the osm2pgsql tool
>> used by openstreetmap.org to import into postgis simplifies some geometries
>> and puts them in a low-zoom table called "planet_osm_roads". This is a good
>> thing of course and osm2pgsql should be doing more of it. The problem,
>> however, is how we keep our results comparable if some tools simplify more
>> than others or otherwise throw out data that other tools do not.
>> 
>> So, I worry the plan for only doing "best effort" (vs all of us deciding on
>> a shared way of processing and storing OSM data to be used for rendering)
>> is dodging a key decision of how to plan a meaningful baseline test. So, I
>> think we should do both, as I have mentioned above, and realistically only
>> once a baseline test is in place will best effort tests seem reasonable.
>> 
>> _______________________________________________
>> Benchmarking mailing list
>> Benchmarking at lists.osgeo.org
>> http://lists.osgeo.org/mailman/listinfo/benchmarking
> 


