[postgis-users] Dividing a point set into groups by distance

BladeOfLight16 bladeoflight16 at gmail.com
Sun Nov 10 17:50:24 PST 2013


On Sun, Nov 10, 2013 at 8:25 AM, Carsten Hogertz
<carsten.hogertz at gmail.com>wrote:

> Hello,
>
> I've got the following task to solve:
>
> There's a table containing a set of 100 site points spread over a city.
> These 100 points have to be visited 10 groups where each group has to
> visit 10 sites.
>
> My question is how can I divide this 100 points into 10 groups where the
> points of each group are grouped by nearest distance?
>

You're trying to find groups of 10 where you minimize distance between all
points in each group. I suspect it's an NP-hard problem. I seem to recall a
similar problem being NP-hard, although I couldn't identify it off-hand.
You might be able to glean something by looking into other NP-hard
problems. In these class of problems, something you need to determine is
what constitutes "good enough." Unless you can find an existing solution to
a similar problem, your best bet is to accept a suboptimal solution that
provides "good enough" results.

If you're willing to accept near-optimal results instead of truly optimal,
I have an idea you could try out. Take the bounding box of all the points.
Then use a binary-search-like algorithm to find a vertical or horizontal
line that splits the box into two boxes each containing 50 points. (By
"binary-search-like", I mean start with the center line, and then check how
many points each box contains, and then move your line into the half that
contains more points. Keep doing this until you find a line that works.)
Then do something similar that splits the new divided boxes into 2 more
boxes where one contains 10*CEIL((n/10)/2). (That's the multiple of 10
nearest to half the number of points the box contains.) Keep doing that
until you end up with a set of boxes that contain 10 points.

There are a couple of details to worry about here. You'll have to worry
about having a bunch of points being colinear with your dividing line. That
could pose a problem finding a good dividing line. You should also have
something in place to choose whether to use a horizontal or vertical line.
My first guess about doing that would be to use horizontal if the box is
taller than it is wide; otherwise, use vertical. That would keep you from
ending up with a bunch of really thin and long boxes. There's probably
other issues that haven't occurred to me, too.

Basically, this is gonna be tough to work out. Good luck.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.osgeo.org/pipermail/postgis-users/attachments/20131110/8422e854/attachment.html>


More information about the postgis-users mailing list