<div dir="ltr">Hello,<div><br></div><div>I am using the PDAL Python bindings to read point clouds into geopandas GeoDataFrames and then write them back to disk.</div><div><br></div><div>The approach below works, but it is slow. Is there a faster way of doing this?</div><div><br></div><div>The slowest part is converting the GeoDataFrame back to a structured numpy array.</div><div><br></div><div>Code below.</div><div><br></div><div>Thanks,</div><div><br></div><div>James</div><div><br></div><div><font face="monospace">import pdal<br>import numpy as np<br>import pandas as pd<br>import geopandas as gpd<br><br>input_point_cloud_filepath = "..."<br><br>output_point_cloud_filepath = "..."<br><br>crs = "..."<br><br>#################### Read PC into Memory and convert to GeoDataFrame ####################<br><br>pipeline_stages = [<br>&nbsp;&nbsp;&nbsp;&nbsp;pdal.Reader.copc(input_point_cloud_filepath),<br>&nbsp;&nbsp;&nbsp;&nbsp;pdal.Filter.hag_nn()<br>]<br><br>pipeline = pdal.Pipeline(pipeline_stages)<br><br>_ = pipeline.execute()<br><br>pointcloud_dtype = pipeline.arrays[0].dtype<br><br>pointcloud_df = pipeline.get_dataframe(0)<br><br>pointcloud_gdf = gpd.GeoDataFrame(<br>&nbsp;&nbsp;&nbsp;&nbsp;pointcloud_df,<br>&nbsp;&nbsp;&nbsp;&nbsp;geometry=gpd.points_from_xy(pointcloud_df["X"], pointcloud_df["Y"], pointcloud_df["Z"]),<br>&nbsp;&nbsp;&nbsp;&nbsp;crs=</font>
<font face="monospace">crs,<br>)<br><br>_ = pointcloud_gdf.sindex<br><br>########################## Manipulate PC via geopandas & numpy ##########################<br><br># Do stuff here<br><br><br>########################### GeoDataFrame -> numpy -> pipeline ###########################<br><br>pointcloud_arr = np.array(<br>&nbsp;&nbsp;&nbsp;&nbsp;(<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;pointcloud_gdf<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.drop(columns=["geometry"])<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;.apply(tuple, axis=1)<br>&nbsp;&nbsp;&nbsp;&nbsp;),<br>&nbsp;&nbsp;&nbsp;&nbsp;dtype=pointcloud_dtype<br>)<br><br>output_pipeline = pdal.Filter.stats().pipeline(pointcloud_arr)<br><br>output_pipeline |= pdal.Writer.copc(<br>&nbsp;&nbsp;&nbsp;&nbsp;output_point_cloud_filepath,<br>&nbsp;&nbsp;&nbsp;&nbsp;forward="all",<br>&nbsp;&nbsp;&nbsp;&nbsp;a_srs=crs,<br>)<br><br>_ = output_pipeline.execute()</font><br><br></div></div>
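<div>For comparison with the row-wise apply(tuple, axis=1) conversion above, here is a minimal sketch of one possible column-wise alternative. It has not been benchmarked against PDAL output. It assumes that, after any manipulation, the GeoDataFrame still holds one column for every field name in the PDAL dtype. The helper name columns_to_structured and the small example data are hypothetical and only there for illustration.</div>

```python
import numpy as np


def columns_to_structured(columns, dtype):
    """Build a structured array by copying one whole column at a time.

    `columns` is anything indexable by field name that returns equal-length
    array-likes (a dict of lists/arrays, or a pandas/geopandas DataFrame).
    Columns not named in `dtype` (for example "geometry") are ignored.
    """
    dtype = np.dtype(dtype)
    n_rows = len(columns[dtype.names[0]])
    out = np.empty(n_rows, dtype=dtype)
    for name in dtype.names:
        # Vectorised assignment; numpy casts each column to the field's type.
        out[name] = np.asarray(columns[name])
    return out


# Tiny stand-in for a point cloud; these field names are illustrative only.
example_dtype = np.dtype(
    [("X", "<f8"), ("Y", "<f8"), ("Z", "<f8"), ("Classification", "u1")]
)
example_columns = {
    "X": [1.0, 2.0, 3.0],
    "Y": [4.0, 5.0, 6.0],
    "Z": [7.0, 8.0, 9.0],
    "Classification": [2, 2, 5],
    "geometry": [None, None, None],  # extra column, skipped by the helper
}
example_arr = columns_to_structured(example_columns, example_dtype)
```

<div>With the variables from the code above, the call would presumably be columns_to_structured(pointcloud_gdf, pointcloud_dtype), since only the fields named in the dtype are read. Whether this is actually faster for a given point cloud would need to be measured.</div>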