Filtering ground noise from LiDAR returns produces inferior models of forest aboveground biomass in heterogenous landscapes

I have a new open-access article, “Filtering ground noise from LiDAR returns produces inferior models of forest aboveground biomass in heterogenous landscapes”, out today in GIScience & Remote Sensing.

A very common pre-processing step when deriving predictors for machine learning models from LiDAR point clouds is to filter out “ground returns” using a simple height threshold, removing all returns below a certain height. This practice comes from the early days of LiDAR, when researchers were attempting to measure the heights of individual trees, and it makes a lot of sense in that context; if a return didn’t hit a tree, it probably can’t tell you much about how tall that tree is. But the practice stuck around as we moved from modeling trees to modeling forests, where the fact that a return was able to find a gap in the canopy and reach the ground might reflect important characteristics of stand openness, and therefore structure.
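To make the threshold step concrete, here is a minimal sketch in Python. It is a toy illustration, not the code or data from the paper: the function name `height_predictors`, the 2 m threshold, and the two made-up stands are all assumptions chosen for the example. Both stands have the same canopy returns and differ only in how many returns reached the ground.

```python
import numpy as np

def height_predictors(z, threshold=None):
    """Summarize return heights (metres above ground) for one plot.

    If `threshold` is given, returns below it are dropped first,
    mimicking a simple height-threshold "ground return" filter.
    (Hypothetical helper, for illustration only.)
    """
    z = np.asarray(z, dtype=float)
    if threshold is not None:
        z = z[z >= threshold]  # keep only returns at or above the threshold
    return {
        "n_returns": int(z.size),
        "mean_height": float(z.mean()),
        "median_height": float(np.median(z)),
    }

# Made-up data: identical canopy returns, different numbers of ground returns.
canopy = np.array([18.0, 20.0, 22.0, 24.0])
closed_stand = np.concatenate([canopy, [0.1]])                          # few gaps
open_stand = np.concatenate([canopy, [0.1, 0.2, 0.0, 0.3, 0.1, 0.2]])   # many gaps

for name, z in [("closed", closed_stand), ("open", open_stand)]:
    print(name, "unfiltered:", height_predictors(z))
    print(name, "filtered:  ", height_predictors(z, threshold=2.0))
```

In this toy case the unfiltered predictors differ between the two stands, while the filtered predictors are identical: once the low returns are gone, nothing in the summary statistics distinguishes the open stand from the closed one.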

We show that filtering out “ground returns” from LiDAR based on height thresholds biases the derived predictors, and that this bias can make predictive models of forest structure (in this case, aboveground biomass, or AGB) worse. Particularly for models attempting to represent mixed landscape types, where gaps in the canopy are more common, filtering ground returns might remove valuable information and result in your model making worse predictions.

Or as the cliche machine might put it: As we move to focus on seeing forests more than trees, it’s time for us to see these data as signal, not just noise.

This was a very fun project, with an awesome team (Lucas Johnson, Eddie Bevilacqua, and Colin Beier). I’m very excited to see this one in press!