[Java-collab] First contents in osgeom repository

Jody Garnett jody.garnett at gmail.com
Sun Aug 9 20:55:06 EDT 2009


Hi Markus:

Can we get a bit more direction on what feedback; like/dislike you
would expect at this stage? Without running code it is a bit hard for
me to give grounded feedback - and I am hesitant to comment on a lot
of things without test cases and experimentation to verify :-)

I can think of a couple of things to do in order to review these
interfaces. Check them against GML, check them against ISO19107 etc..

I have provided a few comments inline where I would I would really
like to assist.

> -- Geometry and geometry related types --
>
> The interfaces provide a faithful representation for all kinds of
> geometry elements that are allowed according to GML 3.1.1 (except for
> the two elements gml:_ImplicitGeometry + gml:Grid which are still
> missing). Note that the geometry elements in GML 3.2.1 are a subset of
> GML 3.1.1 (I think they basically deprecated gml:MultiPolygon and
> gml:MultiCurve and removed gml:Ring from the geometry substitution
> group), which is why the interface design refers to 3.1.1 rather than
> 3.2.1 (comments?).

Interesting. In general fewer classes are better then more; it may be
worth providing a factory that is willing to create a "ring" or
"multicruve" and produce the correct thing. This would give us a
chance to document how to represent these constructs.

Although not explicitly mentioned (because we are at interfaces right
now) it is important to me to separate out construction from
validation. This was one of the main lessons learned from our earlier
work.

> - 27 Geometry interfaces: A hierarchy of interfaces that corresponds to
> the whole substitution group hierarchy gml:_Geometry. Additionally a
> class that corresponds to gml:Envelope. More details below.
> - 16 CurveSegment interfaces (package
> org.osgeo.geometry.primitive.segments): Interfaces that allow to
> represent any segments that curves (i.e. gml:Curve elements) may consist
> of. Corresponds to the substitution group hierarchy gml:_CurveSegment.
> - 8 SurfacePatch interfaces (package
> org.osgeo.geometry.primitive.patches): Interfaces to represent the
> patches that surfaces (i.e. gml:Surface elements) consist of.
> Corresponds to the substitution group hierarchy gml:_SurfacePatch.
> - 1 Points interface (org.osgeo.geometry.points.Points) that abstracts a
> sequence of Point objects. The motivation for this interface is to allow
> more compact and efficient representations than using lists or arrays of
> Point objects. This is essential, as geometries are usually build from
> many points, e.g. a detailed LineString consists of tens or hundreds
> thousands of points.  Think: CoordinateSequence in JTS.

I would echo the importance of this (our second lesson learned!) -
note that this directly matches the ISO 19107 construct PointArray.

Although not mentioned in your list (and I don't have the code to
check at work here). I would like to comment on Point:
- I want to be sure we don't have getX() and getY(). The have caused
so much confusion in letting people assume x/y even when their
coordinate system is radial; or happens to be in y/x order etc... I
would like to remove these methods and just have get( int
)/getOrdiante( int ) etc...
- How to handle a Point as a Geometry? Or as a internal construct?
This is the difference between treating it as a Geometry or treating
it in the same manner as CurveSegment. We had a horrible confusion in
GeoAPI with a Point (ie the Geometry) and a DirectPosition/Position
(the internal construct which was set up to allow a Point to be shared
between different Geometry and mix and match both uses in a single
Curve). While the aim was noble - I would like to make sure we avoid
any confusion :-) With this in mind I would like to make the following
bold suggestion:
a) make Point a Geometry (it appears to be GeometricPrimitve that
extends Geometry
b) make Points the internal construct (so even a Point would have an
internal Points consisting of a single entry)
c) add methods to Points so we can test if a member is a 1st class
Point (ie shared with another Geometry)
Now the above suggestion may be terrible; it is just placed here to
get conversation going.

> - 1 CRS interface (org.osgeo.commons.crs.CRS): Placeholder interface
> that allows to represent CRS information.
> - 4 UOM related interfaces (org.osgeo.commons.uom): These are included
> so some methods (e.g. Geometry#isWithinDistance(...)) can be expressed
> in a unit-aware fashion. Comments?

Can we just decide to use JSR-275 now that it actually works?

> - 1 ObjectProperties interface (org.osgeo.commons.ObjectProperties):
> Placeholder interface for properties that can be attached to Geometry
> objects. One important use case are the GML standard properties that
> each GML geometry object allows for (e.g. multiple gml:metaDataProperty
> elements, gml:name and gml:description properties).

I assume these are going to be lazily created as needed in order to
avoid the memory overhead?

> - 1 PrecisionModel interface
> (org.osgeo.geometry.precision.PrecisionModel): Placeholder interface for
> a precision definition that can be attached to geometries. It's not 100%
> clear to me yet how this should be defined and how to interoperate with
> topological operators, spatial analysis and the exporting of geometries.

They amount to "rounding" with a bit of metadata to tell what kind of
numeric field we are working ontop of (floating point; or fixed
point). This information would really like to be stored with the CRS;
indeed if I was in a greenfield situation I would record this
information for each Axis. When you are implementing an operation you
use the precision model to round the answer as you go; when you have
performed "enough" recursion your rounded answer is the same as the
rounded answer of the previous iteration. This sanity check is used to
avoid doing more work then needed (ie why calculate to twenty decimal
places if the WFS output format is only going to print out eight
significant digits? It is also used to ensure that the resulting
geometry is actually valid at the indicated precision - it is easy to
make a geometry that is valid; and then serve it up with the correct
precision and have it be invalid. Usual example is a line string that
when served up has both the start and end point in the same location.
This concept also effects intersection; where the lines are produced.
Internally it can notice that the start and end point are the same;
and provide a Point as a result rather then a LineString.

> - 2 beans that represent parameter sets needed for the definition of
> certain curve segments:
> org.osgeo.geometry.primitive.segments.AffinePlacement and
> org.osgeo.geometry.primitive.Knot. I think that it doesn't make sense to
> allow different implementations here, and therefore these are classes,
> not interfaces.

Small comment:
- KnotType is an enum - we may need an open set of types here? Rather
then a closed set up enumerations. Do you have anything similar to a
CodeList? Basically before Java 5 Enum provided syntactic sugar it was
always possible to set up an object like an Enum; and when you do it
yourself you have the option of adding more constants over time.
- Just for consistency refactor KnotType into an internal class (ie
Knot.Type) in order not to pollute the file system with what is an
detail of the Knot class.
- Geometry.GeometryType can also be changed to Geometry.Type - and
should also be an open ended set?
- MultiGeometry.MultiGeometryType?
- Curve.CurveType? especially this one
- Primitive.PrimitiveType - this one is especially interesting as it
should be the D in your DIM-9 matrix; as such I would be tempted to
supply my own ordinal values here in order to make any relate code
much easier to read.

For many of these "types" I am wondering why they are required as they
seem to duplicate information already present in the class hierarchy?
Example Geometry.GeometryType.ENVELOPE is duplicated by the Envelope
class is it not?

> --- Notes on the geometry types ---
>
> Altough it deviates from the GML 3.1.1 geometry hierarchy, Envelope is
> currently derived from Geometry (interestingly, in GML 3.0, gml:Envelope
> was actually substitutable for gml:_Geometry). This was done for
> pragmatic reasons, as we observed that one often has to deal with a
> Geometry or an Envelope (e.g. in Filters). By making them derive a
> common interface, such cases are simpler to be implemented/used. If we
> don't have such a common type, many, many methods have to be defined and
> implemented twice (e.g. Geometry#intersects(Geometry) and
> Geometry#intersects(Envelope)). Alternatively, we could create a common
> base type above Geometry/Envelope (e.g. GeomOrEnvelope). What do you think?

- getWidth() and getHeight() shoudl be replaced with a getSpan( int )
to avoid any CRS based confusion
- having it as a Geometry is fine; may wish a toSurface() method to
help people duck out of corner cases? Actually a factory method that
will turn a Envelope into a Surface would be smarter and less of an
API Burden?

> The Geometry interface currently specifies the following methods:
>
> - The common batch of topological predicates (e.g.
> #intersects(Geometry)). IMHO, these methods should be CRS aware, so a
> geometry in EPSG:25832 may be checked for intersection against an
> EPSG:4326 geometry. Coordinate transformation should be performed
> on-the-fly. Comments?

That is tough; an ease of use vs explicit tradeoff...
- For cases where a new Geometry is produced I would recommend asking
people to do the transformation themselves? That way it is very
obvious what CRS the result is in. If not we would get order dependent
results which would drive me a bit baty?
- If we ask people to do that for getUnion() then I would recommend
asking them to do the same steps for intersects for consistency.
- The methods are not consistent: isBeyond follows java bean naming;
intersects does not. It is also odd for me to see getBuffer and
getUnion follow java bean naming conventions when these things are not
infact properties. I would prefer to see:
- use java bean naming for the actual data model ... getId(),
getCoordinateSystem()
- use raw method names for operations and calcualtions: buffer(
measure ), centroid(), intersects( geometry ), touches( geometry ),
transform( CRS )
- I would be open to a naming convention: toGeometry( CRS crs ) -
rather than transform; toCentroid(); queryTouches( Geometry ) etc...
but only if it adds something worth typing extra characters for

> - The common batch of spatial analysis operations (e.g.
> #getIntersection(Geometry) or getConvexHull()). How should one treat
> different CRS here?
> - #getGeometryType() provides a way to differentiate the type of a
> geometry object. Note that there are comparable methods on the deeper
> levels of the hierarchy (e.g. Curve#getCurveType()). This was motivated
> by the idea to ease the working with the (quite complex) hierarchy, for
> example in drawing code. I am not a big fan of using instanceof tests
> (as they are inherently unsafe), so I thought it would be nice to
> provide a way to use the more safe and convenient switch statements for
> this (safer,  as Eclipse creates warnings if you leave out any choices).
> What do you think?

Comments about this are above; I understand you are not found of
instance of tests; this would be a good use of a utility class I
guess? Even with this in mind you can make a getType() method return a
Class and use a swtich statement on the class with no loss of clarity.

> - #getId() / #setId(). As ids are very integral to GML, I provided
> methods for working with the identifier.

We have been forced to make up or own Identifier - in response to
GMLObjectIdentifier in order to handle namespace issues. Will a simple
string be sufficient here?

> - #getCoordinateSystem() / #setCoordinateSystem(...)

I will point out that getCoordinateSystem is not actually a coordinate
system (ie Cartesian, radial). Can we shorten this to getCRS()?

> - #getAttachedProperties() / #setAttachedProperties(...) for coping with
> the GML standard properties that each GML geometry object allows for
> (e.g. multiple gml:metaDataProperty elements, gml:name and
> gml:description properties).

Commented above about lazy creation; I would consider leaving this as
an implementation details (ie the creation of an internal Properties
object and/or Map) and adding methods for getName(), getDescription()
etc... although it depends on how many of these we expect to ahve?

> Eagerly awaiting your feedback. Please let me know what you like / dislike.

Hopefully the above is some feedback to start things off? I would like
to define the factory API as I find it hard to image how these
interfaces are used on their own. I notice that the gvSig
representative recommended the split between class and interface.
After years of watching JTS succeed I wonder if we should consider
forgoing this split? We could simply define classes for these data
structures; as long as we have separated out the operation
implementation we could do this with no loss of flexibility; and the
resulting Java library would be *much* easier for people to use. There
are many downsides of course - Andrea beloved Points / PointArray /
CoordianteSequence construct is very important - I believe we would
need an interface in this case.

Cheers,
Jody


More information about the Java-collab mailing list