[postgis-devel] Distance2D Review

Paul Ramsey pramsey at cleverelephant.ca
Mon Nov 9 16:03:52 PST 2009


Hi Niklas, I'm just reading through the code, and I'm going to note
things into this e-mail as I go along:

- Try and keep private structures out of the liblwgeom.h file. So far
I see BBOXDIST, W_BBOX, LISTSTRUCT which are entirely confined to
measures.c. Put them into ./liblwgeom/measures.h for now (we'll have a
go at rationalizing public and private prototypes more attractively in
the PostGIS 2.0 round).
- Follow the code formatting guidelines in ./STYLE please (you can
retroactively apply them with astyle though)
- I see no test cases in cu_measures.c for your functions, we're
trying to get as much testing as possible in the library level unit
tests as we go along, so for new functions in particular we should aim
to get coverage
- This construction is a little odd:

	DISTPTS thedl;
	DISTPTS *dl = &thedl;
	lw_dist2d_comp( lw1,lw2,dl);

  Generally people just write it this way, without the extra variable,
it has the same effect:

	lw_dist2d_comp( lw1,lw2, &thedl);

 - You've declared move of your working functions with a void return
(since the return values are actually in the *DISTPTS). I'm not sure
if you have failure cases checked deeper in your functions yet, but
one way to propagate up error cases without things like null-pointer
checks is to return LW_TRUE or LW_FALSE from the functions so that
higher level error handling code can do the "right thing" with them.
- Unfortunately, you've followed the coding style of the original
functions, which is working more-or-less directly with serialized
*uchar. So, for convenience, you've defined things like W_BBOX to have
easier access to the bounds without having to re-pull them from the
structure each time. If you de-serilized to LWGEOM early on and worked
with LWGEOM, things might be less ugly.
- *Also*, in PostGIS 2.0 I am going to be getting rid of all these
*uchar valued calculation functions and replacing them with code that
works on the LWGEOM structures. So you can future-proof your code by
setting it up that way from the start...
- For your domain-specific structs, prefixing them so they are all
commonly named would be nice: DIST2D_RESULTS, DIST2D_LIST, etc.
- In DISTPTS, you have a member, 'thedir' which should really be named
'mode' and valued with a macro or enumeration (DIST2D_MAX, DIST2D_MIN)
rather than magic numbers.
- In DISTPTS, you have a member 'd' which should be 'dist' or 'distance'.
- I'm not enamoured of the down-the-rabbit-hole style in which
functions seem to delegate to function2 variants here and there for no
clear reason. Document what happens in each section of your function,
great new function points for code that is called repeatedly or in
different contexts.
- Mind you, the existing style in measures.c is really ugly to start
with, so c'est la vie.
* - I would really like to see two SQL bindings, an ST_Distance() that
calls through your proposed stack and an ST_Distance_Old() that calls
directly into the brute force functions. That way during the testing
phase we can compare the methods for accuracy on *lots* of different
data easily. We can remove those SQL bindings before release time.
- Things might fall apart if you are fed EMPTY geometries. I see some
places where npoints is not checked for > 0 on point arrays (or, that
pointarrays aren't checked for non-null on deserialization).

The most important thing to me is (*) above: enabling the ability to
test correctness on a large range of data during the 1.5 pre-release
period.

Hopefully Mark can take a quick look at the code this week, but since
it's going to be re-written in 2.0 to some extent anyways, the
stylistic stuff is less important to me than it might otherwise be.

Paul



More information about the postgis-devel mailing list