pyfor package


pyfor.clip_funcs module

pyfor.clip_funcs.poly_clip(cloud, geometry)
pyfor.clip_funcs.ray_trace(x, y, poly)

A numba implementation of the ray tracing algorithm.

  • x – A 1D numpy array of x coordinates.
  • y – A 1D numpy array of y coordinates.
  • poly – The coordinates of a polygon as a numpy array (e.g. from geo_json['coordinates']).
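
The ray casting test can be sketched in plain numpy as below. This is an illustrative even-odd implementation, not pyfor's numba code: the function name points_in_polygon and the M x 2 vertex layout assumed for poly are inventions for this example.

```python
import numpy as np

def points_in_polygon(x, y, poly):
    """Even-odd ray casting: True where (x[i], y[i]) lies inside poly.

    poly is assumed to be an M x 2 array of vertices (hypothetical layout).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    poly = np.asarray(poly, dtype=float)
    inside = np.zeros(x.shape, dtype=bool)
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]  # wrap around to close the ring
        if y1 == y2:
            continue  # a horizontal edge never crosses a horizontal ray
        # Does this edge straddle the horizontal line through each point?
        straddles = (y1 > y) != (y2 > y)
        # x coordinate where the edge meets that horizontal line
        x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
        # Toggle the flag for every crossing to the right of the point
        inside ^= straddles & (x < x_cross)
    return inside

square = np.array([[0, 0], [4, 0], [4, 4], [0, 4]])
mask = points_in_polygon(np.array([2.0, 5.0, -1.0]),
                         np.array([2.0, 2.0, 2.0]),
                         square)
```

A point is inside when a ray cast to its right crosses the boundary an odd number of times, which is why each crossing flips the flag rather than setting it.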

pyfor.clip_funcs.square_clip(cloud, bounds)

Clips a square from a tuple describing the position of the square.

  • las_xy – An N x 2 numpy array of x and y coordinates, with x in column 0.
  • bounds – A tuple of length 4, describing the min x, max x, min y and max y coordinates of the square.

A boolean mask, true where the point is within the square.
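
A minimal sketch of such a mask, assuming the bounds tuple is ordered (min x, max x, min y, max y) as described above and that points lying exactly on an edge count as inside (that inclusiveness is an assumption, as is the name square_mask):

```python
import numpy as np

def square_mask(las_xy, bounds):
    """Boolean mask of the rows of las_xy that fall inside bounds."""
    min_x, max_x, min_y, max_y = bounds
    x, y = las_xy[:, 0], las_xy[:, 1]
    return (x >= min_x) & (x <= max_x) & (y >= min_y) & (y <= max_y)

xy = np.array([[1.0, 1.0], [5.0, 5.0], [11.0, 3.0]])
mask = square_mask(xy, (0, 10, 0, 10))
clipped = xy[mask]  # keep only the points inside the square
```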


Bases: object


A dataframe representation of a point cloud, with some useful functions for manipulating and displaying.

Parameters:las – A path to a las file, a laspy.file.File object, or a CloudFrame object
chm(cell_size, interp_method=None, pit_filter=None, kernel_size=3)

Returns a Raster object of the maximum z value in each cell.

  • cell_size – The cell size for the returned raster in the same units as the parent Cloud or las file.
  • interp_method – The interpolation method to fill in NA values of the produced canopy height model, one of either “nearest”, “cubic”, or “linear”
  • pit_filter – If “median”, passes a median filter over the produced canopy height model.
  • kernel_size – The kernel size of the median filter, must be an odd integer.

A Raster object of the canopy height model.
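
The idea of taking the maximum z value in each cell can be sketched with numpy alone. The function max_z_raster and its row/column orientation (row 0 at the largest y) are assumptions for illustration and do not reproduce pyfor's Raster object; empty cells are left as NaN, which is what an interpolation step would later fill.

```python
import numpy as np

def max_z_raster(x, y, z, cell_size):
    """2D array holding the highest z value found in each cell."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    # Column index grows with x; row index grows downward from the max y
    col = np.floor((x - x.min()) / cell_size).astype(int)
    row = np.floor((y.max() - y) / cell_size).astype(int)
    raster = np.full((row.max() + 1, col.max() + 1), -np.inf)
    # Unbuffered maximum, so repeated (row, col) pairs are all considered
    np.maximum.at(raster, (row, col), z)
    raster[np.isneginf(raster)] = np.nan  # cells that received no points
    return raster

raster = max_z_raster(x=[0.2, 0.8, 1.5, 1.7],
                      y=[0.1, 0.3, 1.9, 1.2],
                      z=[3.0, 7.0, 2.0, 5.0],
                      cell_size=1.0)
```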


Clips the point cloud to the provided geometry (see below for compatible types) using a ray casting algorithm.

Parameters:geometry – Either a tuple of bounding box coordinates (square clip), an OGR geometry (polygon clip), or a tuple of a point and radius (circle clip).
Returns:A new Cloud object clipped to the provided geometry.
filter(min, max, dim)

Filters a cloud object for a given dimension in place.

  • min – Minimum value of the dimension to retain.
  • max – Maximum value of the dimension to retain.
  • dim – The dimension of interest as a string. For example “z”. This corresponds to a column label in self.las.points dataframe.

Generates a Grid object for this Cloud given a cell size. See the documentation for Grid for more information.

Parameters:cell_size – The resolution of the plot in the same units as the input file.
Returns:A Grid object.
iplot3d(max_points=30000, point_size=0.5, dim='z', colorscale='Viridis')

Plots the 3D point cloud in a form compatible with Jupyter notebooks, using Plotly as a backend. By default, if the point cloud exceeds 30,000 points, it is downsampled using a uniform random distribution. This limit can be changed using the max_points argument.

  • max_points – The maximum number of points to render.
  • point_size – The point size of the rendered point cloud.
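
Uniform random downsampling to a point budget can be sketched as follows. The helper downsample and its seed argument are hypothetical; how pyfor draws its sample internally is not shown in this reference.

```python
import numpy as np

def downsample(points, max_points, seed=None):
    """Return at most max_points rows of points, chosen uniformly at random."""
    n = len(points)
    if n <= max_points:
        return points  # already within budget, nothing to do
    rng = np.random.default_rng(seed)
    keep = rng.choice(n, size=max_points, replace=False)
    return points[np.sort(keep)]  # sorting preserves the original row order

cloud_xyz = np.random.default_rng(0).random((100_000, 3))
thinned = downsample(cloud_xyz, 30_000, seed=1)
```

Sampling without replacement guarantees that no point is drawn twice, so the rendered cloud never contains duplicates.
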
normalize(cell_size, num_windows=7, dh_max=2.5, dh_0=1, interp_method='nearest')

Normalizes this cloud object in place by generating a DEM using the default filtering algorithm and subtracting the underlying ground elevation. This uses a grid-based progressive morphological filter developed in Zhang et al. (2003).

This algorithm is actually implemented on a raster of the minimum Z value in each cell, but is included in the Cloud object as a convenience wrapper. Its implementation involves creating a bare earth model and then subtracting the underlying ground from each point’s elevation value.

If you would like to create a bare earth model, look instead toward Grid.ground_filter.

Note that this current implementation is best suited for larger tiles. Best practices suggest creating a BEM at the largest scale possible first, and using that to normalize plot-level point clouds in a production setting.

  • cell_size – The cell_size at which to rasterize the point cloud into bins, in the same units as the input point cloud.
  • num_windows – The number of windows to consider.
  • dh_max – The maximum height threshold.
  • dh_0 – The null height threshold.
  • interp_method – The interpolation method used to fill in missing values after the ground filtering takes place. One of “nearest”, “linear”, or “cubic”.
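
The subtraction step described above, in which each point loses the ground elevation of the cell beneath it, might look like this once a bare earth array exists. The function normalize_heights, its arguments, and the assumption that row 0 of the array sits at the y_max edge are all illustrative.

```python
import numpy as np

def normalize_heights(x, y, z, dem, x_min, y_max, cell_size):
    """Subtract the ground elevation of each point's cell from its z value.

    dem is assumed to be a 2D array with row 0 at the top (y_max) edge.
    """
    col = np.floor((x - x_min) / cell_size).astype(int)
    row = np.floor((y_max - y) / cell_size).astype(int)
    # Points on the far edges fall one past the last index, so clamp them
    row = np.clip(row, 0, dem.shape[0] - 1)
    col = np.clip(col, 0, dem.shape[1] - 1)
    return z - dem[row, col]

dem = np.array([[100.0, 101.0],
                [102.0, 103.0]])
x = np.array([0.5, 1.5, 0.5])
y = np.array([1.5, 1.5, 0.5])
z = np.array([110.0, 101.0, 120.0])
heights = normalize_heights(x, y, z, dem, x_min=0.0, y_max=2.0, cell_size=1.0)
```
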
plot(cell_size=1, cmap='viridis', return_plot=False)

Plots a basic canopy height model of the Cloud object. This is mainly a convenience function for rasterizer.Grid.plot; see that method's docstring for more information and more robust usage cases.

  • cell_size – The resolution of the plot in the same units as the input file.
  • return_plot – If true, returns a matplotlib plt object.

If return_plot == True, returns matplotlib plt object.

plot3d(point_size=1, cmap='Spectral_r', max_points=500000.0)

Plots the three dimensional point cloud using a method suitable for non-Jupyter use (i.e. via the Python console). By default, if the point cloud exceeds 5e5 points, then it is downsampled using a uniform random distribution of 5e5 points. This is for performance purposes.

  • point_size – The size of the rendered points.
  • cmap – The matplotlib color map used to color the height distribution.
  • max_points – The maximum number of points to render.
class, header)

Bases: object

A simple class composed of a numpy array of points and a laspy header, meant for internal use. This is basically a way to load data from the las file into memory.

__init__(points, header)

Initialize self. See help(type(self)) for accurate signature.


Writes the points and header to a .las file.

Parameters:path – The path of the .las file to write to.

pyfor.filter module


Calculates the maximum height difference for an elevation array.

pyfor.filter.dht(elev_array, w_k, w_k_1, dh_0, dh_max, c)

Calculates dh_t.

  • elev_array – A 1D array of elevation values
  • w_k – An integer representing the window size
  • w_k_1 – An integer representing the previous window size
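
For orientation, the elevation difference threshold rule as published in Zhang et al. (2003) can be sketched as below. Here the slope coefficient s is passed in directly rather than derived from elev_array, and whether pyfor's dht follows this rule term for term is an assumption; the name height_threshold is hypothetical.

```python
def height_threshold(s, w_k, w_k_1, dh_0, dh_max, c):
    """Elevation difference threshold for window size w_k.

    s is the slope coefficient, c the cell size, w_k_1 the previous window.
    """
    if w_k <= 3:
        return dh_0  # the smallest windows use the initial threshold
    dh_t = s * (w_k - w_k_1) * c + dh_0
    return min(dh_t, dh_max)  # never exceed the maximum threshold

# Thresholds for a growing sequence of windows (values are illustrative)
windows = [3, 5, 9, 17]
thresholds = [
    height_threshold(0.3, w, w_prev, dh_0=1.0, dh_max=2.5, c=1.0)
    for w_prev, w in zip([0] + windows[:-1], windows)
]
```

The threshold grows with the jump in window size, so larger windows tolerate larger elevation differences before a cell is rejected as non-ground, up to the dh_max cap.
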
pyfor.filter.slope(elev_array, w_k, w_k_1)

Calculates the slope coefficient.

Returns the slope coefficient s for a given elev_array and w_k.

pyfor.filter.zhang(array, number_of_windows, dh_max, dh_0, c, grid, interp_method='nearest')

Implements Zhang et al. (2003), a progressive morphological ground filter. This returns a matrix of Z values for each grid cell that have been determined to be actual ground cells.

  • array – The array to interpolate on, usually an aggregate of the minimum Z value.
  • number_of_windows – Not described (the source carries a TODO to change this to the maximum window size).
  • dh_max – The maximum height threshold.
  • dh_0 – The starting null height threshold.
  • c – The cell size used to construct the array.
  • grid – The grid object used to construct the array.

An array corresponding to the filtered points, which can be used to construct a DEM via the Raster class.

pyfor.gisexport module

pyfor.gisexport.array_to_polygons(array, affine)

Returns a geopandas dataframe of polygons as deduced from an array.

  • array – The 2D numpy array to polygonize.
  • affine – The affine transformation.

pyfor.gisexport.array_to_raster(array, pixel_size, x_min, y_max, wkt, path)

Writes a GeoTIFF raster from a numpy array.

  • array – 2D numpy array of cell values
  • pixel_size – Desired resolution of the output raster, in the same units as the wkt projection.
  • x_min – Minimum x coordinate (top left corner of raster)
  • y_max – Maximum y coordinate
  • wkt – The wkt string with desired projection
  • path – The output path of the GeoTIFF.
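
The x_min, y_max and pixel_size arguments together fix where every cell sits on the ground. The sketch below expresses that relationship in the six-coefficient form used by GDAL for a north-up raster; the helper names are hypothetical, and whether pyfor builds exactly this tuple internally is an assumption.

```python
def geotransform(x_min, y_max, pixel_size):
    """GDAL-style affine coefficients for a north-up raster."""
    # (origin x, pixel width, row rotation, origin y, column rotation, pixel height)
    return (x_min, pixel_size, 0.0, y_max, 0.0, -pixel_size)

def cell_corner(gt, row, col):
    """World coordinates of the top-left corner of cell (row, col)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

gt = geotransform(x_min=500000.0, y_max=4100000.0, pixel_size=0.5)
corner = cell_corner(gt, row=2, col=4)
```

The pixel height is negative because row indices increase downward while y coordinates increase northward, which is why the raster is anchored at its top left corner.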

pyfor.plot module

pyfor.plot.iplot3d(las, max_points, point_size, dim, colorscale)

Plots the 3D point cloud in a form compatible with Jupyter notebooks. TODO: refactor to a name that isn’t silly.

pyfor.plot.iplot3d_surface(array, colorscale)

pyfor.rasterizer module

class pyfor.rasterizer.Grid(cloud, cell_size)

Bases: object

The Grid object is a representation of a point cloud that has been sorted into X and Y dimensional bins. It is not quite a raster yet. A raster has only one value per cell, whereas the Grid object merely sorts all points into their respective cells.

__init__(cloud, cell_size)

Sorts the point cloud into a gridded form such that every point in the las file is assigned a cell coordinate with a resolution equal to cell_size

  • cloud – The “parent” cloud object.
  • cell_size – The size of the cell for sorting in the units of the input cloud object.

Returns a dataframe with sorted x and y, with the associated bins in new columns.
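
Assigning each point a cell coordinate amounts to integer division of its offset from the cloud's minimum corner by cell_size. A minimal numpy sketch, with the function assign_bins and the output names bins_x and bins_y chosen for illustration only:

```python
import numpy as np

def assign_bins(x, y, cell_size):
    """Return integer (bins_x, bins_y) cell indices for each point."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Offset from the minimum corner, measured in whole cells
    bins_x = np.floor((x - x.min()) / cell_size).astype(int)
    bins_y = np.floor((y - y.min()) / cell_size).astype(int)
    return bins_x, bins_y

bins_x, bins_y = assign_bins([10.0, 10.4, 12.9], [5.0, 7.5, 5.2], cell_size=1.0)
```

Every point keeps its own row; only the two index columns are added, which is why the result is not yet a raster.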

boolean_summary(func, dim)

Calculates a boolean column that marks whether each point is the one corresponding to the function passed. For example, this can be used to create a boolean mask of points that are the minimum z point in their respective cell.

  • func – The function to calculate on each group.
  • dim – The dimension of the point cloud as a string (x, y or z)
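
The minimum z example mentioned above can be sketched without a dataframe by reducing per cell and comparing each point against its cell's result. The function is_cell_minimum and its bin index inputs are assumptions made for this example.

```python
import numpy as np

def is_cell_minimum(bins_x, bins_y, z):
    """True for each point whose z equals the minimum z of its cell."""
    bins_x = np.asarray(bins_x)
    bins_y = np.asarray(bins_y)
    z = np.asarray(z, dtype=float)
    # Collapse the two bin indices into a single cell id per point
    cell_id = bins_y * (bins_x.max() + 1) + bins_x
    cell_min = np.full(cell_id.max() + 1, np.inf)
    np.minimum.at(cell_min, cell_id, z)  # per-cell minimum, handles repeats
    return z == cell_min[cell_id]

mask = is_cell_minimum(bins_x=[0, 0, 1, 1],
                       bins_y=[0, 0, 0, 0],
                       z=[3.0, 1.5, 2.0, 4.0])
```

If two points in a cell tie for the minimum, both are flagged by this sketch.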

Retrieves the cells with no returns in

Returns:An N x 2 numpy array where each row corresponds to the [y x] coordinate of the empty cell.

ground_filter(num_windows, dh_max, dh_0, interp_method='nearest')

Wrapper call for filter.zhang with convenient defaults.

Returns a Raster object corresponding to the filtered ground DEM of this particular grid.

interpolate(func, dim, interp_method='nearest')

Interpolates missing cells in the grid. This function uses scipy.interpolate.griddata as a backend. Please see the documentation for that function for more details.

  • func – The function (or function string) to calculate an array on the gridded data.
  • dim – The dimension (i.e. column name of self.cells) to cast func onto.
  • interp_method – The interpolation method passed to scipy.interpolate.griddata, one of “nearest”, “cubic”, or “linear”.

An interpolated array.
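
Filling empty cells with scipy.interpolate.griddata can be sketched as below, treating cells that hold a value as the known samples and every cell as a query location. The wrapper fill_missing is hypothetical and works on a bare array rather than a Grid.

```python
import numpy as np
from scipy.interpolate import griddata

def fill_missing(array, method="nearest"):
    """Fill NaN cells of a 2D array from the cells that hold values."""
    rows, cols = np.indices(array.shape)
    known = ~np.isnan(array)
    return griddata(
        (rows[known], cols[known]),  # coordinates of cells with data
        array[known],                # the values at those cells
        (rows, cols),                # coordinates to evaluate: every cell
        method=method,
    )

chm = np.array([[1.0, np.nan, 3.0],
                [4.0, 5.0, np.nan],
                [np.nan, 8.0, 9.0]])
filled = fill_missing(chm)
```

With “linear” or “cubic”, griddata returns NaN for cells outside the convex hull of the known samples, so “nearest” is the only one of the three that is guaranteed to fill every cell.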


Calculates summary statistics for each grid cell in the Grid.

Parameters:func_dict – A dictionary containing keys corresponding to the columns of and values that correspond to the functions to be called on those columns.
Returns:A pandas dataframe with the aggregated metrics.
normalize(num_windows, dh_max, dh_0, interp_method='nearest')

Returns a new, normalized Grid object.

plot(func, cmap='viridis', dim='z', return_plot=False)

Plots a 2 dimensional canopy height model using the maximum z value in each cell. This is intended for visual checking and not for analysis purposes. See the rasterizer.Grid class for analysis.

  • func – The function to aggregate the points in the cell.
  • cmap – A matplotlib color map string.
  • return_plot – If true, returns a matplotlib plt object.

If return_plot == True, returns matplotlib plt object.


Not yet implemented.

raster(func, dim)

Generates an m x n matrix with values as calculated for each cell in func. This is a raw array without missing cells interpolated. See self.interpolate for interpolation methods.

  • func – A function string, i.e. “max” or a function itself, i.e. np.max. This function must be able to take a 1D array of the given dimension as an input and produce a single value as an output. This single value will become the value of each cell in the array.
  • dim – The dimension to calculate on as a string, see the column names of for a full list of options

A 2D numpy array where the value of each cell is the result of the passed function.

class pyfor.rasterizer.Raster(array, grid)

Bases: object

__init__(array, grid)

Initialize self. See help(type(self)) for accurate signature.


Plots the raster as a surface using Plotly.


Filters pits in the raster. Intended for use with canopy height models (e.g. grid(0.5).interpolate(“max”, “z”)). This function modifies the raster array in place.

Parameters:kernel_size – The size of the kernel window to pass over the array. For example 3 -> 3x3 kernel window.
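
What a kernel_size x kernel_size median pass does to a pit can be sketched in numpy alone. The function median_filter is illustrative: it returns a new array instead of modifying one in place, and repeating the border cells outward is a choice made here that may differ from pyfor's edge handling.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter(array, kernel_size=3):
    """Median of each kernel_size x kernel_size neighbourhood (odd sizes only)."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    pad = kernel_size // 2
    padded = np.pad(array, pad, mode="edge")  # repeat border cells outward
    windows = sliding_window_view(padded, (kernel_size, kernel_size))
    return np.median(windows, axis=(-2, -1))

chm = np.full((5, 5), 10.0)
chm[2, 2] = 0.0  # a single-cell pit in an otherwise flat canopy
smoothed = median_filter(chm, kernel_size=3)
```

A lone pit is outvoted by its eight neighbours, so the median removes it while leaving the surrounding canopy heights unchanged.
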
plot(cmap='viridis', return_plot=False)

Default plotting method for the Raster object.

watershed_seg(min_distance=2, threshold_abs=2, classify=False)

Returns the watershed segmentation of the Raster as a geopandas dataframe.

  • min_distance – The minimum distance between local height maxima in the same units as the input point cloud.
  • threshold_abs – The minimum threshold needed to be called a peak in peak_local_max.
  • classify – If true, sets the user data of the original point cloud data to the segment ID. The segment ID is an arbitrary identification number generated by the labels function. This can be useful for plotting point clouds where each segment color is unique.

A geopandas data frame, each record is a crown segment.


Writes the raster to a geotiff. Requires the attribute to be filled by a projection string (ideally wkt or proj4).

Parameters:path – The path to write to.

pyfor.test module

Module contents