<div dir="ltr">Warning - Long post<div><br></div><div>I have created a function that finds the minimum spanning tree from a MultiLineString per the description at:</div><div><a href="https://docs.google.com/presentation/d/1iqTLwt95rEBISBcPoaA31_clqQBAJlaMegHNfNYHuuI/edit#slide=id.g6c6fcfd43d_0_85">https://docs.google.com/presentation/d/1iqTLwt95rEBISBcPoaA31_clqQBAJlaMegHNfNYHuuI/edit#slide=id.g6c6fcfd43d_0_85</a><br></div><div>I have it working in two parts but I can't merge them together into one function.</div><div><br></div><div>First the data. I created a table wiki with just a geometry column. Then I grabbed the SVG coordinates from the Wiki link from above, tweaked it and got the following MultiLineString:</div><div><br></div><div>MULTILINESTRING((12.927 149.924,50.261 222.857),(12.927 149.924,122.094 70.04),(12.927 149.924,49.247 169.665),(49.26 169.665,178.334 165.312),(50.01 222.79,49.26 169.665),(64.208 197.562,173.254 269.374),(64.355 197.598,50.01 222.455),(64.355 197.598,178.499 165.263),(64.602 197.598,49.26 169.665),(122.093 70.04,49.26 169.665),(173.254 269.374,50.01 222.79),(173.254 269.374,232.314 286.445),(173.427 269.374,242.527 269.556),(178.427 165.263,242.527 269.556),(178.499 165.263,122.093 70.04),(178.499 165.263,173.254 269.374),(232.314 286.445,242.527 269.556),(232.314 286.437,285.906 259.373),(242.527 269.556,285.906 259.236),(285.906 259.373,122.026 70.04),(285.906 259.373,178.427 165.263))<br></div><div><br></div><div>I then inserted it into the table wiki. Note the vertices do not perfectly line up..</div><div><br></div><div>Step one - I need to create a table that defines the geometry,starting and ending points of each line segment and the line length. This is easy:</div><div><br></div><div>create table mst_tree as (<br> WITH t1 as (SELECT (ST_Dump(geom)).geom AS geom from wiki),<br> t2 as(select st_snaptogrid(geom,10) geom from t1)<br> select geom, st_startpoint(geom) as src, st_endpoint(geom) as dest, st_length(geom) as weight from t2<br>)<br></div><div><br></div><div>Note the value of 10 in the st_snaptogrid - this assures all the close nodes snap to each other and will need to be a parameter to the function.</div><div><br></div><div>Next is the function that finds the Minimum Spanning Tree. I used the Kruskal method - see:</div><div><a href="https://www.geeksforgeeks.org/kruskals-minimum-spanning-tree-algorithm-greedy-algo-2/">https://www.geeksforgeeks.org/kruskals-minimum-spanning-tree-algorithm-greedy-algo-2/</a><br></div><div><br></div><div>The full function is:</div><div>---------------------------------------------------------------</div><div><br></div><div>-- FUNCTION: public.mst()<br><br>-- DROP FUNCTION public.mst();<br><br>CREATE OR REPLACE FUNCTION public.mst(<br> -- tree geometry,<br> -- tolerance float<br> )<br> RETURNS TABLE(mst_geom geometry, mst_weight double precision) <br> LANGUAGE 'plpgsql'<br> COST 100<br> VOLATILE PARALLEL UNSAFE<br> ROWS 1000<br><br>AS $BODY$<br>DECLARE<br> edge RECORD;<br> e integer;<br> i integer := 0 ;<br>BEGIN<br> EXECUTE 'drop table if exists mst_edges';<br> EXECUTE 'drop table if exists mst_visited';<br><br>-- I am stuck here<br> --EXECUTE 'drop table if exists mst_tree';<br> --EXECUTE 'create table mst_tree as (WITH t1 as (SELECT (ST_Dump(tree)) AS geom ),<br> --t2 as(select st_snaptogrid(geom,tolerance) geom from t1)<br> --select geom, st_startpoint(geom) as src, st_endpoint(geom) as dest, st_length(geom) as weight from t2)';<br> <br> EXECUTE 'create table mst_visited as (select src as node, 1 as branch from mst_tree limit 0)';<br> EXECUTE 'create table mst_edges as (select geom, weight, src, dest, 1 as mst from mst_tree limit 0)';<br><br> with nodes as (<br> select src as node from mst_edges<br> union all <br> select dest as node from mst_edges<br> ),<br> d_nodes as (<br> select distinct node from nodes)<br> SELECT COUNT(node) from d_nodes into e;<br><br>for edge in execute 'SELECT geom, weight, src, dest FROM mst_tree ORDER by weight asc' <br> LOOP<br> EXIT WHEN e = 1;<br><br> WITH next_nodes as (<br> select edge.src as node<br> union all <br> select edge.dest as node<br> ), -- put nodes in a single column<br><br> new_nodes as (<br> select node from next_nodes where node not in (select node from mst_visited)<br> ), -- list of nodes not yet visited<br><br> existing_nodes as (<br> select next_nodes.node, branch from next_nodes, mst_visited<br> where next_nodes.node = mst_visited.node<br> ), -- list of nodes already visited<br><br> cycle_test as (<br> select <br> case<br> when count(node) = 2 and (max(branch) = min(branch)) then 2 -- forms a loop, do not add to MST<br> else 1 end as mst, -- not a loop, include edge in MST<br> max(branch) as max_branch, min(branch) as min_branch<br> from existing_nodes<br> ),<br> <br> v (nodes) as (<br> insert into mst_visited (node,branch) <br> (select node,<br> case<br> when (select count(node) from new_nodes) = 2 then e --next_branch.n_b -- two new nodes, new branch<br> else (select branch from existing_nodes) -- add node to existing branch<br> end<br> as branch<br> from new_nodes--,next_branch<br> )<br> returning *<br> ),<br> <br> v_c(node) as (<br> update mst_visited set branch = d.min_branch<br> from (select * from cycle_test) d<br> where mst_visited.branch = d.max_branch<br> returning *<br> ), -- if two branches, combine branches (no effect if on same branch) This is the critical step to loop detection<br><br>-- update edge table, set mst to 1 or 2 <br> u (node) as (<br> insert into mst_edges (geom,weight,src,dest,mst) <br> (select edge.geom,edge.weight, edge.src, edge.dest, mst from cycle_test<br> where mst = 1)<br> returning *<br> )<br><br> select mst from cycle_test into i;<br><br> IF (i = 1) then<br> e = e - 1;<br> END IF;<br> END LOOP;<br><br> EXECUTE 'drop table mst_visited';<br> RETURN QUERY SELECT geom,weight from mst_edges;<br><br>END;<br>$BODY$;<br><br>ALTER FUNCTION public.mst()<br> OWNER TO postgres;<br></div><div>-----------------------------------------------</div><div><br></div><div>If you create the table mst_tree from the first part and then run:</div><div><br></div><div>SELECT * from mst()</div><div><br></div><div> you will get a table of each segment in the MST and the associated weight (length) . If you want to get the result in a MultiLine segment:</div><div><br></div><div>SELECT ST_Collect(ARRAY(SELECT mst_geom FROM mst6()));<br></div><div><br></div><div>Now I am stuck. How do I pass the geometry and tolerance into the function If I uncomment the lines I get the error:</div><div><br></div><div><span style="font-family:"Source Code Pro",SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace;font-size:12.95px;white-space:pre-wrap">ERROR: column "geom" does not exist
LINE 1: ...te table mst_tree as (WITH t1 as (SELECT (ST_Dump(geom)).geo...
</span><span style="font-family:"Source Code Pro",SFMono-Regular,Menlo,Monaco,Consolas,"Liberation Mono","Courier New",monospace;font-size:12.95px;white-space:pre-wrap">QUERY: create table mst_tree as (WITH t1 as (SELECT (ST_Dump(geom)).geom AS geom),
t2 as(select st_snaptogrid(geom,$2) geom from t1)
select geom, st_startpoint(geom) as src, st_endpoint(geom) as dest, st_length(geom) as weight from t2)
</span><br class="gmail-Apple-interchange-newline"></div><div>Any help will be appreciated</div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div><div><br></div></div>