[GRASS-user] v.select incredibly slow

Moritz Lennert mlennert at club.worldonline.be
Mon Mar 29 07:21:33 PDT 2021


Hi Uwe,

On 29/03/21 15:59, Uwe Fischer wrote:
> Hi Moritz,
> 
> for the question about lines and polys: I believe that confusion may come up when one looks at a GRASS dataset in the QGIS browser panel (please see attachment): there are lines and polygons in one map! And I can load them exactly that way in QGIS. That is: when I load the lines in QGIS, I only get lines!

GRASS GIS' vector format supports mixing all types of features in one 
single map, so you can have points, lines and areas (boundary lines + 
centroids) in one single map. The shapefile format does not support 
this. AFAIK, QGIS follows the shapefile logic and does not allow 
different geometry types in one single layer. Any area map in GRASS GIS 
can actually be represented in three ways:

1) As what we generally would call polygons
2) The lines that constitute the contours of these polygons (called 
boundaries in GRASS GIS)
3) Points that lie within the polygon (and are called centroids in GRASS 
GIS, although they are not necessarily actual centroids in the 
geometrical sense)

QGIS only provides 1 & 2.

You can see the effect of this format by playing around with the 'type' 
option of d.vect, directly in GRASS GIS.

> 
> And they ARE lines. That means: when I select a linear feature that separates two areas with the mouse, I select only one feature! (attachment)
> Were they boundaries, the mouse would grab two features, one for the left and one for the right area??

Again, in GRASS GIS boundaries are lines, not polygons. And in QGIS they 
are displayed as simple lines.

Because of its topological data format GRASS GIS does not have the 
notion of polygons as such. It has the notion of areas which are the 
combination of a boundary line with a centroid. Boundary lines without 
centroids are not considered areas aka polygons.

AFAIK, QGIS does not support topological formats as such and so the 
topological vector format of GRASS GIS (with points, centroids, lines, 
boundaries, (virtual) areas) is mapped into simple features in QGIS: 
points, lines, polygons. If you want to make use of GRASS GIS' 
topological format directly, you will have to use GRASS GIS.

> 
> Maybe it has to do with the way the dataset was imported: it came via v.in.ogr using CAD data made up of lines and points using type='centroid,boundary'.
> 
> On the other hand, GRASS v.info for that same map gives me:
> 
> v.info -t map=forst_f_035980@lwk_work
> nodes=86
> points=0
> lines=0
> boundaries=109
> centroids=42
> areas=43
> islands=20
> primitives=151
> map3d=0
> 
> No line features at all !
> 
> So is the QGIS representation misleading? How can QGIS see lines while GRASS does not?
> 

Boundaries are lines which have a special status. But geometrically they 
are lines, not polygons.
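
If you want to check these counts from a script, here is a minimal sketch in plain Python. It only parses key=value text in the form that v.info -t prints. The function name parse_topo_info is made up for illustration, and the sample text is simply the output you pasted above. In a real GRASS session you would feed it the actual command output instead.

```python
def parse_topo_info(text):
    """Parse key=value lines (as printed by `v.info -t`) into a dict of ints."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or unexpected lines
        key, value = line.split("=", 1)
        counts[key.strip()] = int(value)
    return counts


# Output copied from the message above (map forst_f_035980).
SAMPLE = """\
nodes=86
points=0
lines=0
boundaries=109
centroids=42
areas=43
islands=20
primitives=151
map3d=0
"""

counts = parse_topo_info(SAMPLE)

# Boundaries are counted separately from plain lines, even though both are
# linear geometries. A viewer that only knows "lines" shows them the same way.
linear_primitives = counts["lines"] + counts["boundaries"]
print("lines:", counts["lines"])
print("boundaries:", counts["boundaries"])
print("linear primitives in total:", linear_primitives)
```

So lines=0 does not mean that there is no linear geometry in the map, only that none of it has the plain 'line' type.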


> On the other hand, I learned from your example that v.select can use boundaries as linear features. I checked the same for v.buffer and found that it works: for type=boundary, v.buffer will put out tubular buffers around linear features (it will not buffer the areas as a whole!) I did not expect that, because I thought that buffering boundaries gives me area buffers, since the boundaries are the area borders.

Creating buffers around "areas" will give you area buffers, creating 
buffers around boundaries will give you exactly that: buffers around the 
lines that are defined as boundary lines.

Have you had the opportunity to read through 
https://grass.osgeo.org/grass78/manuals/vectorintro.html#vector-model-and-topology 
?

> 
> And for the data conversion tasks: such tasks come up in my projects from time to time.
> You are right, there are only lines or only polygons in a shapefile.
> But I sometimes need to perform typical polygon tasks first (like selecting by attributes or dissolving or buffering) and then line tasks on the same dataset (like retrieving the clean lines, broken up and without duplicates or other errors for exporting and further processing in CAD). The second part cannot be done with boundaries, right? That is why I was looking for a good way to deal with both.

Most GRASS GIS vector commands allow you to choose which aspect of the 
vector map you want to work with, generally through a 'type' parameter. 
This allows you to do what I showed you with v.select, but also with 
v.buffer, v.clean, etc. When you change boundaries, however, you have to 
be aware that you might change them in a way that breaks topology. So 
some additional care might be necessary.
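
To illustrate, here is a rough Python sketch of how the same area map can be addressed either as areas or as boundary lines, just by changing the type. The map names are the NC demo data from my earlier mail. The helper names, the output map names and the buffer distance of 100 map units are made up for the example. The first function only builds the argument sets, so it runs without GRASS. run_all() needs an active GRASS session and is not called here.

```python
def type_restricted_calls(area_map, other_map, distance):
    """Return (module, flags, params) tuples; nothing is executed here."""
    return [
        # Areas (boundary + centroid) of area_map within areas of other_map.
        ("v.select", "", dict(ainput=area_map, atype="area",
                              binput=other_map, btype="area",
                              output=area_map + "_areas",
                              operator="within")),
        # Only the boundary lines of area_map. The -c flag is needed because
        # boundaries normally carry no category values.
        ("v.select", "c", dict(ainput=area_map, atype="boundary",
                               binput=other_map, btype="area",
                               output=area_map + "_bnd",
                               operator="within")),
        # Buffers around the boundary lines, not around the areas as a whole.
        ("v.buffer", "", dict(input=area_map, type="boundary",
                              output=area_map + "_bndbuf",
                              distance=distance)),
    ]


def run_all(calls):
    """Execute the calls inside a GRASS session (requires grass.script)."""
    import grass.script as gs  # imported lazily so the sketch loads without GRASS
    for module, flags, params in calls:
        gs.run_command(module, flags=flags, overwrite=True, **params)


calls = type_restricted_calls("census_wake2000", "boundary_municp", 100)
for module, flags, params in calls:
    print(module, "-" + flags if flags else "", params)
```

Note that all three calls only read the input map and write new output maps, so the topology of the original stays untouched.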


Moritz

> 
> 
> -----Original Message-----
> From: Moritz Lennert [mailto:mlennert at club.worldonline.be]
> Sent: Monday, 29 March 2021 12:01
> To: Uwe Fischer <gisfisch at t-online.de>; grass-user at lists.osgeo.org
> Subject: Re: [GRASS-user] v.select incredibly slow
> 
> Hi Uwe,
> 
> On 29/03/21 09:53, Uwe Fischer wrote:
>> But it led me again to some kind of misunderstanding that I cannot figure out:
>>
>> My data are imported from polygon shapefiles.
>>
>> First question: using v.in.ogr, what does the "type=" parameter mean exactly? In the manual, it reads: "Optionally change default input type". But imho, the input is the input. You cannot change it. What can be changed is the output or the way you process the input. This question prevents me from really understanding what v.in.ogr does with my polygons.
> 
> You probably do not need to use this parameter. It allows you to transform specific data to another type. A classical example would be to import area centroids as points, not centroids:
> 
> v.in.ogr census_wake2000.gpkg out=cw_noType
> v.in.ogr census_wake2000.gpkg out=cw_point type=point
> 
> v.info -t cw_noType
> nodes=192
> points=0
> lines=0
> boundaries=296
> centroids=105
> areas=105
> islands=1
> primitives=401
> map3d=0
> 
> v.info -t cw_point
> nodes=192
> points=105
> lines=0
> boundaries=296
> centroids=0
> areas=105
> islands=1
> primitives=401
> map3d=0
> 
> Even though v.info indicates a certain number of areas, as centroids=0 in cw_point, you will not have complete areas as in GRASS GIS an area is defined as the combination of boundaries and centroids.
> 
> But as mentioned, this is for very specific uses.
> 
>> Second question: I thought a GRASS map is able to hold areas and lines together in one map at the same time. How can I achieve such a mixed map using v.in.ogr from my polygon Shapes?
> 
> As far as I know, a shapefile cannot contain both lines and polygons, so are you sure you want to import both from the same file ? Are the lines you want to import the boundaries of the polygons ?
> 
>> When I use it with "type=line", it will produce lines only, some of
>> which are holding former area attributes (which makes no sense for
>> lines)
> 
> Attribute data is imported into a table, but AFAICT, there is no link between the lines and the attribute data. I guess it was decided to not lose the attribute data and so import it into a table. If you don't want to import the attribute table, you can use the -t flag.
> 
> An important aspect is that lines that are imported this way do not have category values. So when you run v.select on these lines you will have to indicate that you do not want v.select to skip features without categories (-c flag).
> 
>> When I use it with "type=line, boundary", it will also produce lines only.
>> Using "type=centroid, boundary" makes no sense because the input polygon shapefile has only polygons, but no centroids.
> 
> You have to think in GRASS GIS terms, and its topological data model, to understand this. I suggest reading https://grass.osgeo.org/grass78/manuals/vectorintro.html#vector-model-and-topology
> to get an overview.
> 
>> Maybe I have to go another way?
> 
> You could probably not worry about the question of lines vs polygons at the moment of import. Just import all polygons as polygons. You can then decide to check for boundary lines at the v.select stage. Here's an example using the NC demo data set:
> 
> # select polygons of layer A that are within polygons of layer B
> v.select ainput=census_wake2000@PERMANENT binput=boundary_municp@PERMANENT output=census_select operator=within
> 
> # select boundary lines of layer A that are within polygons of layer B
> v.select -c ainput=census_wake2000@PERMANENT atype=boundary binput=boundary_municp@PERMANENT output=census_select_lines operator=within
> 
> [Note the use of -c as boundary lines normally do not have category values as these are attached to the centroids for areas.]
> 
> Attached you can see a quick map showing the results: as you can see the red lines selected go beyond the yellow polygons selected.
> 
> Moritz
> 
>>
>> -----Original Message-----
>> From: Moritz Lennert [mailto:mlennert at club.worldonline.be]
>> Sent: Sunday, 28 March 2021 13:50
>> To: grass-user at lists.osgeo.org; Uwe Fischer <gisfisch at t-online.de>
>> Subject: Re: [GRASS-user] v.select incredibly slow
>>
>> Hi Uwe,
>>
>> On 27 March 2021 14:58:01 CET, Uwe Fischer <gisfisch at t-online.de> wrote:
>>> Hello list,
>>>
>>> I have trouble selecting line features using their location compared
>>> to a polygon layer using v.select. The line features I want to select
>>> from are parcel borders, and the polygon layer is made up of
>>> tubular-shaped buffers around municipal borders. I need to find the
>>> parcel borders which are inside this buffer.
>>>
>>> The command line in a Python script I use is:
>>>
>>> grass.run_command('v.select', ainput='temp5', atype='line',
>>> binput='buff', blayer=1, btype='area', output='grenz',
>>> operator='within', overwrite=True)
>>>
>>> The process starts, but it runs incredibly slowly (> 15 min) and it
>>> does not bring the desired result (but trash data). When I start it in
>>> the GRASS UI, it also runs very, very slowly.
>>>
>>> I have only about 2000 parcel borders, so it cannot be a problem of
>>> too many features. Furthermore, the exact same selection task is
>>> processed in QGIS 3 in a second with perfect results.
>>>
>>> I used v.build on both maps before v.select, but it does not help.
>>>
>>> I would like to perform it in GRASS because it is part of a bigger
>>> data preparation script which makes my work a lot easier. So I need
>>> to integrate it here rather than selecting in QGIS manually.
>>
>>
>> First of all: which version of GRASS GIS are you using ?
>>
>> I filed a bug about this same issue a few years ago [1] and Markus Metz reorganized the code at the time to speed things up. I don't remember which version was the first to include the fixes (7.6 ?). However, even though it was slow, results were ok which doesn't seem to be the case for you, so that is a bit worrying.
>>
>> Can you reproduce the same issue with the example given in that bug report ? If not can you provide a reproducible example, including relevant data ? Ideally as a GitHub issue ?
>>
>> As a workaround you could try either the alternative provided in the bug report, or you could try to reduce the number of line candidates first using v.select operator=overlap and then use operator=within only on those selected in the first call.
>>
>> Moritz
>>
>> [1] https://trac.osgeo.org/grass/ticket/3361
>>
>> _______________________________________________
>> grass-user mailing list
>> grass-user at lists.osgeo.org
>> https://lists.osgeo.org/mailman/listinfo/grass-user
>>
> 


