Legibility and seats at the table

Reflections on FEMC 2022.

Data science
earth science

December 16, 2022


There’s a bit of forestry lore that’s become really well-known in political science, popularized by James C. Scott’s “Seeing Like a State”, but which is, as best I can tell, mostly unknown in English-speaking forest science.1 It goes a little something like this:

Prussia in the 1800s was an industrializing nation with an extreme demand for more of every resource it could get its hands on. As part of the industrialization process, the Prussians found themselves needing to approach their natural resources in a more standardized and predictable way; if a factory was expected to make a standard number of chairs each day, it required a standard amount of wood to work with. The normal rhythm of growing seasons and the boom-and-bust cycles of natural systems were antithetical to the increasingly regimented and systematized industry the government was aiming to encourage.

And as such, the government sought to systematize its forests. The 1800s were the beginnings of forest science as a discipline, of allometrics and quantification and active management planning on scales which had not been attempted by, or feasible for, earlier societies. The objective of the state was not just to increase productivity – although that was an objective – but to standardize it, to move towards a wood products industry that could support the demands of manufacturing irrespective of forest conditions. Measurements were standardized, growth tables developed, tools and techniques invented, all in the name of progress. And in the name of standardization and control, the state encouraged active management to enforce even-aged monocultural stands which could be easily accounted for, and whose expected outputs and timing could be predicted and planned upon in order to ensure the engine of industry stayed alive.2

But of course, standardizing a landscape causes quite a disturbance, and a forest is made up of more than trees. Removing undesirable vegetation removes the habitat that game and other animals require, upsetting historical hunting patterns as well as pollination and seed distribution; clearing the understory and removing slash removes soil nutrients and prevents unassisted regeneration. The start of systematization saw wood production peak at never-before-seen heights; a century later, production was down 30% from the pre-standardized baseline. In some situations, management had made it so no trees would regenerate at all. The Prussians introduced one more innovation, a word to describe such places: Waldsterben. Forest death.

Scott tells this story as part of a bigger pattern, of states attempting to standardize complex systems in order to make them “legible” and manageable by remote administrators, at the cost of localized expertise and nuance. No matter how good your measurements and allometrics, there’s really no better way of knowing a forest than a walk in the woods.

Eye in the sky

I was lucky enough to be in Burlington, Vermont for this year’s Forest Ecosystem Monitoring Cooperative (FEMC) conference. It’s among my favorite conferences, not only because Burlington is a beautiful town, but also because the audience is largely made up of land managers from governments and industry. These are people who are in the business of solving problems, and are interested in your work to the extent that it helps them solve problems. It’s a nice corrective to some of the tail-chasing tendencies of academia.

Madeleine Desrochers gave an absolutely fantastic talk about her work, using models trained on satellite imagery to detect harvests in the Adirondack forest. In particular, Madeleine was focusing on how we can assess the accuracy of these models, especially working as we do in an area where most harvests aren’t clearcuts and as such can be a bit hard to see from space. The work is fascinating, and the presentation was great, but my favorite bit in particular was when Madeleine spoke about the motivations behind her work.

To paraphrase: this stuff is terrifying. We are not, as a society, used to every action we take on the surface of this Earth being watched by a silent observer in space, or picked up by a model run on a laptop in Syracuse. We are, by and large, neither expecting nor accepting that our actions will be so legible to anyone who cares to look.

But, one way or another, this stuff is the future. There are VC-backed startups promising change detection algorithms to any government agency that will give them the time of day, and these firms do not operate with particularly high standards for accuracy or validation. And it’s a bad outcome for everyone if spotty models are used to help the state make decisions, to identify people violating tax codes or taking more trees than they’re entitled to.

The best-case scenario is that we publicly investigate these algorithms to figure out how, when, where, and why they work. And even more importantly, we need domain experts at the table when these algorithms are being considered, to help inform decisions about what it will mean for these things to become legible, who will be helped and who will be harmed, and how foresters and forests will react in years to come. The engineers are going to be implementing these tools no matter what; our best way through it is to make sure that domain experts and local knowledge are well represented as they do so, to make outcomes better for us all.

The data sciencification of everything

Also at FEMC, Jarlath O’Neil-Dunne spoke on a panel about his fear that we’re watching the “data sciencification of everything.”3 As noted by others on the panel, forest science has moved from a period of bespoke tooling – clinometers, densiometers, Biltmore sticks, D-tapes and the rest of their 19th-century companions – towards a place where we’re adapting the tooling of other disciplines for forest problems. This is perhaps not a new pattern, given the amount of farming equipment running on any forest operation, but the rapid shift in the field to a focus on complex models, remote sensing, and “big data” has created something of a gap between leading research and the problems that folks in the field actually need solved.4

Part of the problem is that, as we add more and more programming and data management skills to ecology and forestry curricula, we’re possibly not paying enough attention to what gets cut. Now, I think this problem is typically overstated – there was more than a bit of overlap in my undergrad ecology, natural resources ecology, wetlands ecology and management, forest ecology, forest ecology and management, and natural resources silviculture courses – but it’s not wrong. Learning to code takes time, and that’s time not spent on a walk in the woods, building the sort of domain knowledge that’s essential to understanding how these complex systems actually work.

But we can’t just not teach students how to code. Not only would that be educational malpractice – as Richard Hamming put it, teachers should prepare the student for the student’s future, not for the teacher’s past – but also, these tools and techniques aren’t going away, and the world isn’t slowing down. We can’t simply cede this ground to engineers and data scientists; being able to apply knowledge of the underlying system to the problem at hand remains essential for producing not only accurate models but also actionable insights into the world at large. If we want forest science to benefit from recent gains in processing power, in the ability of humans to think about, understand, and manage large-scale complex systems, then we need forest science to have a seat at the table when deciding how these tools are used for our field. We need folks in forest science to have a say in how our world is made legible, before those decisions are made for us.


  1. I personally heard this for the first time via Bret Devereaux’s ACOUP, a blog which has both made me smarter and also turned me on to Paradox games, at the cost of hundreds of hours of my life.

  2. In a very real way, a lot of our forest management is still in this paradigm, though plenty of folks are making valiant efforts to push for more modern management methodologies. Scott makes the point that Pinchot was trained at a school following “a German-style curriculum”; the foundations of forestry as a science are fundamentally German.↩︎

  3. I think I’m mischaracterizing his comments here, for what it’s worth; Jarlath was talking about the importance of other ways of knowing, beyond simple raw processing of quantitative information. The themes here were part of the panel discussion overall, though.↩︎

  4. My personal bugbear here: most global models of ecological attributes – say, biomass – are pointless. Global models trade accuracy for scope, which makes them practically useless for informing decision-making at a regional level. Given that most organizations which produce global models aren’t making decisions at a global level, and don’t typically have the ear of those who do, their stakeholders would be better served by a smaller geographic focus and, in turn, higher accuracy. That said, if you’re in the business of selling models, it is much more appealing to your P&L if you only need to make one model which you’ll sell to anyone who will buy it; the incentives here put the modeler in direct conflict with stakeholder interests. There are also so, so many other problems with global models, but there’s been a trend recently of papers claiming such issues are surmountable; I think the incentives problem is somewhat more fundamental.