<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Mike Mahoney</title>
<link>https://mm218.dev/blog.html</link>
<atom:link href="https://mm218.dev/blog.xml" rel="self" type="application/rss+xml"/>
<description>Technology and the Environment</description>
<generator>quarto-1.8.26</generator>
<lastBuildDate>Fri, 19 Dec 2025 00:00:00 GMT</lastBuildDate>
<item>
  <title>Ten short thoughts about AGU25</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2025-12-19-agu/</link>
  <description><![CDATA[ 





<p>This was my first time at AGU in New Orleans – and my first time in New Orleans in general. My review is that I really enjoy New Orleans, and I really enjoy AGU, and I <em>really</em> hate the New Orleans convention center.<sup>1</sup> Mostly, I hate that the convention center is about a mile long from end to end, and talks were in rooms on either side of a single hallway spanning the entire mile – meaning I often had to pass all 36,000 attendees in that single mile-long hall as I went from Global Change (at ~0.2 miles deep) to Informatics sessions (at ~1.0).</p>
<hr>
<p>I found a third wave coffee place (<a href="https://www.fourthwallnola.com/">Fourth Wall</a>) via Reddit and went there each morning. The line at 7:45am got notably longer every single day – including Friday, when I’d assume a decent chunk of folks had already gone home. I wonder if the human geographers have insights into how conferences, with their odd, massive groups of tourists with correlated commutes and schedules, interact with their host cities over the span of the event. Feels like it’d be a great AGU talk.</p>
<hr>
<p>I’m always surprised at how much better the “science and society” talks are than the average session. Then I’m surprised that I’m surprised professional science communicators give good talks. We should fund more science communication. I’m pretty sure I don’t just think this because I work for our “Web Communications Branch”.</p>
<hr>
<p>Why were only random doors unlocked? It seemed like doors marked “Enter Here” had a 50% chance of being open, while the rest of the doors had a 10% chance.</p>
<hr>
<p>We all agree ten-minute talks are bad, right? Maybe we don’t need ten talks per session?</p>
<hr>
<p>I’ve always been a skeptic when it comes to automated data harmonization/interoperability. The idea is nice – augment your data with some magic set of metadata fields and <em>boom</em> it can automatically be <code>rbind()</code>’d with any other data that does the same – but real data is usually too messy for this to be practical. My favorite example is that, in 2021, a court in New York State <a href="https://www.adirondackdailyenterprise.com/news/local-news/2021/05/its-not-yet-clear-how-court-ruling-will-change-tree-cutting-on-state-land/">idiotically changed the definition of a “tree” for some purposes</a> from anything above 3 inches diameter to anything above 1 inch. So if you’re measuring trees, now you need to add whether you’re using the standard from before this case or after it – or perhaps the federal standard, which starts at 5 inches. This is before we even get to the fact that “tree” isn’t a well-defined category; rhododendrons usually aren’t included in the category, for instance, even though they can get to 33 feet!</p>
<p>All that to say, resolving methodology differences down to a level where data can be automatically harmonized (or rejected) seems to me a tall task for metadata alone, and I don’t see the AI era solving this problem any time soon.</p>
<hr>
<p>The coffee stations this year all used compostable coffee cups, despite the conference center not having a single compost waste bin. That has to be higher impact than the alternative, right? Don’t get me wrong – I flew to Louisiana, this isn’t a meaningful source of my<sup>2</sup> emissions for the week. It’s just a little odd seeing this sort of green-washing at a conference like this.</p>
<hr>
<p>For a conference of geospatial scientists, not a lot of spatial awareness in the crowd.</p>
<hr>
<p>I think we’d all be better off if we were more honest about the role of money in shaping studies. I saw several talks where the sampling region seemed to be defined as roughly “where I could get on half a tank of gas”, and depending on what you’re doing and what conclusions you’re drawing, that’s basically fine! We don’t need to pretend you chose the state your university is in because it’s a unique understudied ecotype that the rest of the researchers at your university have somehow overlooked.</p>
<hr>
<p>It seems like a problem that we basically can’t run meaningful and low-stakes experiments with data sharing and discoverability anymore.<sup>3</sup> I was in a few sessions where NASA ESDIS were discussing their project of moving 190 petabytes of imagery into S3, and the solutions they’ve built out to deal with it. And I think the tech looks great – at least, I love STAC and I hear nice enough things about CMR, as someone who’s never touched it. But it seems like a shame that it’s basically impossible for non-NASA people to try building other access patterns, to see if there are places for improvement that we’ve missed. I mean, where would you even start? You don’t have 190 petabytes of data to test with – and you certainly don’t have the money to pay for it, if you did.</p>
<p>Of course, it’s not like folks were able to build open-sourced versions of the Apollo missions, either. Maybe it’s not a bad thing to have smart, dedicated people collaborating on a single solution for a hard problem. I like the tech they’ve got so far!</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>Currently, I’m annoyed enough that I’m tempted to skip the conference the next time we’re in NOLA, but still grab a hotel nearby and do my usual handshakes and meetings. Realistically, I know myself well enough to know I’ll have completely forgotten by January.↩︎</p></li>
<li id="fn2"><p>Exercise left for the reader: how many coffee cups would you need to go through before it was meaningful? And could you do it before the staff stopped you?↩︎</p></li>
<li id="fn3"><p>Emphasis on meaningful – I specifically mean that independent researchers basically can’t contribute to modern data repository infrastructure, at least not for repositories at operational scales.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>AGU</category>
  <guid>https://mm218.dev/posts/2025-12-19-agu/</guid>
  <pubDate>Fri, 19 Dec 2025 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2025-12-19-agu/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Converting New York’s Forest Carbon Assessment to Tidymodels</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2024-07-19-tidymodels/</link>
  <description><![CDATA[ 





<p>We’ve been working this summer on version 2.0 of <a href="https://cafri-ny.org/new-york-forest-carbon-assessment/">the New York Forest Carbon Assessment</a>, incorporating changes in <a href="https://arxiv.org/abs/2405.04507">how the US estimates its forest carbon resources</a> and building on our past work using <a href="https://doi.org/10.1016/j.foreco.2023.121348">satellite imagery</a> and <a href="https://doi.org/10.1016/j.jag.2022.103059">LiDAR patchworks</a> to create wall-to-wall estimates of forest carbon stocks across New York for the last 30 years.</p>
<p>One of the big themes of version 2.0 is operationalization. We’re working to automate as much of our modeling and mapping pipelines as possible, to streamline generating annual maps as time goes on and to make it easier to experiment with new models and data sources in the future. In practice, this means I’ve been spending a lot of time merging individual scripts into <a href="https://books.ropensci.org/targets/">targets</a> pipelines, and generally looking at ways to streamline our processes.</p>
<p>The biggest change to come from this rewriting is that, while version 1.0 of our modeling pipeline used an internal library for tuning and <a href="https://www.mm218.dev/posts/2021/01/model-averaging/">creating stacked ensembles</a>, version 2.0 shifts these tasks over to the <a href="https://www.tidymodels.org/">tidymodels framework</a>.<sup>1</sup> Where we had previously only been using the <a href="https://rsample.tidymodels.org/">rsample</a> and <a href="https://yardstick.tidymodels.org/">yardstick</a> packages from tidymodels, we’ve now adopted the framework more-or-less wholesale in order to take advantage of a few advanced features.<sup>2</sup></p>
<p>In this post, I want to walk through what that workflow looks like for us – how we get from our field measurements to prepped data for models, then through hyperparameter tuning, into our final ensemble model construction. And then I want to talk about a few rough edges we ran into through the process, and why we’re still using tidymodels despite those stubbed toes.</p>
<p>So, without further ado, let’s start walking through how we’re building our ensemble models using tidymodels.</p>
<section id="building-ensemble-models-using-tidymodels" class="level2">
<h2 class="anchored" data-anchor-id="building-ensemble-models-using-tidymodels">Building ensemble models using tidymodels</h2>
<section id="what-are-we-doing-here" class="level3">
<h3 class="anchored" data-anchor-id="what-are-we-doing-here">What are we doing here?</h3>
<p>My intent with this section is to give a pretty straightforward, if fast-paced, walkthrough of how we’re using tidymodels to build our ensemble models. In order to do that, I need to load quite a few packages:<sup>3</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Used to download our predictor data</span></span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rsi)</span>
<span id="cb1-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Contains our "field data", used for cross-validation</span></span>
<span id="cb1-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(spatialsample)</span>
<span id="cb1-5"></span>
<span id="cb1-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Used to wrangle our predictor data</span></span>
<span id="cb1-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(sf)</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(terra)</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>terra 1.7.71</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(exactextractr)</span>
<span id="cb5-2"></span>
<span id="cb5-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Tidymodels libraries used to actually fit and tune models</span></span>
<span id="cb5-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidymodels)</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>── Attaching packages ────────────────────────────────────── tidymodels 1.2.0 ──</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>✔ broom        1.0.5      ✔ recipes      1.0.10
✔ dials        1.2.1      ✔ rsample      1.2.1 
✔ dplyr        1.1.4      ✔ tibble       3.2.1 
✔ ggplot2      3.5.1      ✔ tidyr        1.3.1 
✔ infer        1.0.7      ✔ tune         1.2.1 
✔ modeldata    1.3.0      ✔ workflows    1.1.4 
✔ parsnip      1.2.1      ✔ workflowsets 1.1.0 
✔ purrr        1.0.2      ✔ yardstick    1.3.1 </code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
✖ purrr::discard()  masks scales::discard()
✖ tidyr::extract()  masks terra::extract()
✖ dplyr::filter()   masks stats::filter()
✖ dplyr::lag()      masks stats::lag()
✖ recipes::step()   masks stats::step()
✖ recipes::update() masks terra::update(), stats::update()
• Use tidymodels_prefer() to resolve common conflicts.</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(hardhat)</span>
<span id="cb9-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(bonsai)</span>
<span id="cb9-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(finetune)</span>
<span id="cb9-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(stacks)</span></code></pre></div></div>
</div>
<p>In order to fit a stacked ensemble model, we first need to fit a number of “component” models. For our purposes today, we’ll fit a <a href="https://bradleyboehmke.github.io/HOML/mars.html">MARS model</a> using the <a href="https://cran.r-project.org/package=earth">earth package</a> and a <a href="https://bradleyboehmke.github.io/HOML/gbm.html">gradient boosting model</a> using <a href="https://cran.r-project.org/package=lightgbm">LightGBM</a>. While we won’t call functions from these packages directly, we do need them to be installed so that tidymodels packages can find them:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">check_installed</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"earth"</span>)</span>
<span id="cb10-2">rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">check_installed</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"lightgbm"</span>)</span></code></pre></div></div>
</div>
<p>We also need some data worth modeling. While the actual field data we use for our models is non-public,<sup>4</sup> we can demonstrate the workflow using pretty much any example data. For our purposes, we’ll build our models to predict the amount of tree canopy in Boston in 2019, using data shipped with the spatialsample package:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb11-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(spatialsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>boston_canopy, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> canopy_percentage_2019)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb11-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb11-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_fill_distiller</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">palette =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Greens"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">direction =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-07-19-tidymodels/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This will be our outcome variable for our models.</p>
</section>
<section id="data-preprocessing" class="level3">
<h3 class="anchored" data-anchor-id="data-preprocessing">Data preprocessing</h3>
<p>In order to actually fit the models, we’re going to need predictor data too! We’ll derive our predictors from Landsat imagery taken in the summer of 2019, when all the trees in Boston were at their greenest.</p>
<p>To actually download this data, we’ll use the <a href="https://permian-global-research.github.io/rsi/index.html">rsi package</a>. In order to make the download go faster, we’ll first use <a href="https://future.futureverse.org/">future</a> to register a parallel backend, letting us download multiple pieces of data at the same time:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(future)</span>
<span id="cb12-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plan</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"multisession"</span>)</span></code></pre></div></div>
</div>
<p>And we can then actually download our Landsat imagery using <code>rsi::get_landsat_imagery()</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1">landsat_imagery <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rsi<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_landsat_imagery</span>(</span>
<span id="cb13-2">    spatialsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>boston_canopy,</span>
<span id="cb13-3">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2019-06-01"</span>,</span>
<span id="cb13-4">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2019-09-01"</span>,</span>
<span id="cb13-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">platforms =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"landsat-8"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"landsat-9"</span>),</span>
<span id="cb13-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb13-7">)</span>
<span id="cb13-8"></span>
<span id="cb13-9">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat_imagery))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-07-19-tidymodels/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This winds up generating a median-composite image, made up of all of the “good” data available for this time period.</p>
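<p>As a rough illustration of what “median composite” means, and not a description of how rsi builds the image internally, here is a toy base R sketch. The <code>scenes</code> array and its values are made up: three 2x2 “scenes” of a single band, with <code>NA</code> standing in for pixels that were masked out as bad data.</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r"># Hypothetical toy data, laid out as rows x columns x scenes
scenes &lt;- array(
    c(
        0.10, 0.20, 0.30, 0.40, # scene 1
        0.12,   NA, 0.90, 0.42, # scene 2: one masked pixel, one outlier
        0.11, 0.22, 0.31,   NA  # scene 3: one masked pixel
    ),
    dim = c(2, 2, 3)
)

# Per-pixel median across the scene dimension, ignoring masked values
composite &lt;- apply(scenes, c(1, 2), median, na.rm = TRUE)
composite</code></pre>
</div>
<p>Note that the outlying 0.90 in the second scene doesn’t drag its pixel’s composite value upward the way a mean would, which is a large part of the appeal of median composites.</p>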
<p>We normally don’t fit our models using these raw reflectance values, but rather use them to calculate a number of <a href="https://github.com/awesome-spectral-indices/awesome-spectral-indices">spectral indices</a> that we then use as predictors. We can automate this process using rsi as well; for instance, we can use this code to calculate every spectral index possible using the bands in the raster we downloaded:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1">landsat_indices <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rsi<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb14-2">    landsat_imagery,</span>
<span id="cb14-3">    rsi<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter_bands</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bands =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat_imagery))),</span>
<span id="cb14-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb14-5">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
|---------|---------|---------|---------|
=========================================
                                          </code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb16-1">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat_indices))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-07-19-tidymodels/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
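<p>For a sense of what a single spectral index looks like under the hood, NDVI is probably the most familiar: the difference between near-infrared and red reflectance, divided by their sum. This is only a sketch with made-up reflectance values; the <code>ndvi()</code> helper is hypothetical and isn’t part of rsi, which computes its indices from the awesome-spectral-indices formulas linked above.</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r"># Hypothetical helper: normalized difference vegetation index
ndvi &lt;- function(nir, red) (nir - red) / (nir + red)

# Made-up surface reflectance values for three pixels
nir &lt;- c(0.45, 0.30, 0.05)
red &lt;- c(0.05, 0.10, 0.04)

ndvi_values &lt;- ndvi(nir, red)
ndvi_values</code></pre>
</div>
<p>Higher values generally indicate denser green vegetation, which is why indices like this make reasonable predictors for canopy cover.</p>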
<p>Now we’ve got a raster with a ton of spectral indices that we can use as predictors. We now need to associate those with our canopy percentage measurements from spatialsample. Generally we accomplish this using <a href="https://github.com/isciences/exactextractr">exactextractr</a> to calculate the mean of all of these indices within our sample locations:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb17-1">cell_values <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> exactextractr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exact_extract</span>(</span>
<span id="cb17-2">    terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat_indices),</span>
<span id="cb17-3">    spatialsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>boston_canopy,</span>
<span id="cb17-4">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mean"</span>,</span>
<span id="cb17-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">progress =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>,</span>
<span id="cb17-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colname_fun =</span> \(values, ...) values</span>
<span id="cb17-7">)</span>
<span id="cb17-8"></span>
<span id="cb17-9">cell_values <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cbind</span>(</span>
<span id="cb17-10">    spatialsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>boston_canopy[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"canopy_percentage_2019"</span>], </span>
<span id="cb17-11">    cell_values</span>
<span id="cb17-12">    )</span>
<span id="cb17-13"></span>
<span id="cb17-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(cell_values)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>]</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Simple feature collection with 6 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 749383.6 ymin: 2913059 xmax: 801174.4 ymax: 2965741
Projected CRS: NAD83 / Massachusetts Mainland (ftUS)
  canopy_percentage_2019  AFRI1600  AFRI2100      ANDWI       AVI     AWEInsh
1              8.6786929 0.2429339 0.4686573 -0.2066887 0.2127504  0.09513274
2             32.9273863 0.4255704 0.6931654 -0.5248111 0.3976697 -0.20890425
3             12.2586142 0.4492964 0.7004566 -0.5057031 0.4135522 -0.17314616
4             15.5646636 0.2985486 0.5540271 -0.3153332 0.2875552 -0.03299006
5              0.4253207 0.1876189 0.3655822 -0.1460433 0.1548586  0.20539446
6             43.3838631 0.4718267 0.7542459 -0.5869194 0.4660867 -0.31844634
                        geometry
1 MULTIPOLYGON (((781922.7 29...
2 MULTIPOLYGON (((752945.6 29...
3 MULTIPOLYGON (((800842.8 29...
4 MULTIPOLYGON (((751419.1 29...
5 MULTIPOLYGON (((772252.2 29...
6 MULTIPOLYGON (((763631.7 29...</code></pre>
</div>
</div>
<p>We’ve now got index values at each of our sampled polygons!</p>
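<p>The <code>"mean"</code> operation here weights each raster cell by the fraction of the cell that falls inside the polygon, so cells that only clip the polygon’s edge count for less. A toy version of that calculation, using made-up cell values and coverage fractions, looks something like this:</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r"># Made-up index values for four raster cells touching one polygon
values &lt;- c(0.2, 0.4, 0.6, 0.8)
# Made-up fraction of each cell's area that falls inside the polygon
coverage_fraction &lt;- c(1, 1, 0.5, 0.25)

# Coverage-weighted mean for the polygon
polygon_mean &lt;- weighted.mean(values, coverage_fraction)
polygon_mean</code></pre>
</div>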
<p>Before we can start using these to predict canopy coverage, we need to do a bit more pre-processing. For instance, for some reason a few of our polygons are missing values in at least one spectral index. While in the real world we’d probably want to debug exactly why this is happening, for our purposes today it’s a lot easier to just drop those rows from our data frame:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb19-1">cell_values <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> cell_values[</span>
<span id="cb19-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">complete.cases</span>(sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_drop_geometry</span>(cell_values)), </span>
<span id="cb19-3">]</span></code></pre></div></div>
</div>
<p>Similarly, a few indices have infinite values in at least one row, and we’ll want to go ahead and drop those columns as well:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb20" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb20-1">infinite_cols <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply</span>(</span>
<span id="cb20-2">    sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_drop_geometry</span>(cell_values), </span>
<span id="cb20-3">    \(x) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">any</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is.infinite</span>(x)),</span>
<span id="cb20-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">logical</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb20-5">)</span>
<span id="cb20-6">cell_values[infinite_cols] <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span></span></code></pre></div></div>
</div>
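<p>As a minimal sketch of the same pattern on a toy data frame (the column names and values here are made up for illustration, not taken from <code>cell_values</code>): <code>is.infinite()</code> flags both <code>Inf</code> and <code>-Inf</code> but not <code>NA</code> or <code>NaN</code>, and assigning <code>NULL</code> through a logical index removes the flagged columns.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Toy stand-in for the attribute table (hypothetical values)
toy &lt;- data.frame(
    agb = c(1.5, 2.0, 3.2),
    ndvi = c(0.2, 0.4, 0.6),
    ratio = c(1.1, Inf, 0.9)
)

# One TRUE/FALSE per column: does it contain any infinite value?
infinite_cols &lt;- vapply(toy, \(x) any(is.infinite(x)), logical(1))

# Assigning NULL through the logical index drops the flagged columns
toy[infinite_cols] &lt;- NULL
names(toy)</code></pre></div></div>
</div>
<p>Only the <code>ratio</code> column is flagged here, so the other two columns survive.</p>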
</section>
<section id="model-fitting" class="level3">
<h3 class="anchored" data-anchor-id="model-fitting">Model fitting</h3>
<p>With that, our <code>cell_values</code> object is ready for modeling workflows, and we can start the actual model-fitting process.</p>
<p>Our first step is to define the formula we’re using to fit our models. Because I’ve got the <code>cell_values</code> dataframe structured with our outcome column in front, followed by all the predictor columns, I can create this formula using the neat <code>DF2formula()</code> utility:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb21" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb21-1">formula <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">DF2formula</span>(sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_drop_geometry</span>(cell_values))</span></code></pre></div></div>
</div>
<p>Did you know that’s a function in base R? What a wild language.</p>
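<p>For a sense of what it does, here is a small sketch on a toy data frame (the column names are hypothetical): the first column becomes the response and every remaining column becomes a predictor.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Toy data frame with the outcome column first (hypothetical names)
toy &lt;- data.frame(
    agb = c(1, 2),
    ndvi = c(0.2, 0.4),
    evi = c(0.1, 0.3)
)

# DF2formula() lives in the stats package, which is attached by default
toy_formula &lt;- DF2formula(toy)

# Variables in order: response first, then the predictors
all.vars(toy_formula)</code></pre></div></div>
</div>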
<p>Anyway, with our formula built, we’re ready to start actually using tidymodels packages. Our first step is to create a recipe that uses our formula, and will pass that information on to the other tidymodels packages:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb22" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb22-1">recipe <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> recipes<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">recipe</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">formula =</span> formula, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> cell_values)</span></code></pre></div></div>
</div>
<p>Up next, we’ll create the model specifications that define what models we want to fit, and pass <code>hardhat::tune()</code> to every hyperparameter that we want to be tuned.<sup>5</sup> We’ll also set the “engine” that we want to use (the package for fitting the models), and the mode of the model (here, regression):</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb23-1">lightgbm_spec <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> parsnip<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">boost_tree</span>(</span>
<span id="cb23-2">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">tree_depth =</span> hardhat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>(),</span>
<span id="cb23-3">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">trees =</span> hardhat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>(),</span>
<span id="cb23-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">learn_rate =</span> hardhat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>(),</span>
<span id="cb23-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mtry =</span> hardhat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>(),</span>
<span id="cb23-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">min_n =</span> hardhat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>(),</span>
<span id="cb23-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">loss_reduction =</span> hardhat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>()</span>
<span id="cb23-8">  ) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb23-9">    parsnip<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_engine</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">engine =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"lightgbm"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb23-10">    parsnip<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_mode</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mode =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"regression"</span>)</span>
<span id="cb23-11"></span>
<span id="cb23-12">mars_spec <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> parsnip<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mars</span>(</span>
<span id="cb23-13">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">num_terms =</span> hardhat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>(),</span>
<span id="cb23-14">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prod_degree =</span> hardhat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tune</span>()</span>
<span id="cb23-15">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb23-16">    parsnip<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_engine</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"earth"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb23-17">    parsnip<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">set_mode</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"regression"</span>)</span></code></pre></div></div>
</div>
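<p>We’ve now got our recipe and our model specifications created, meaning we’ve defined all the preprocessing and modeling we’re asking tidymodels to perform. The next step is to combine them into a workflow set, which pairs the recipe with each model specification:</p>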
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb24" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb24-1">workflow_set <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> workflowsets<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">workflow_set</span>(</span>
<span id="cb24-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(recipe),</span>
<span id="cb24-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(</span>
<span id="cb24-4">        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lightgbm =</span> lightgbm_spec,</span>
<span id="cb24-5">        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mars =</span> mars_spec</span>
<span id="cb24-6">    )</span>
<span id="cb24-7">)</span>
<span id="cb24-8">workflow_set</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A workflow set/tibble: 2 × 4
  wflow_id        info             option    result    
  &lt;chr&gt;           &lt;list&gt;           &lt;list&gt;    &lt;list&gt;    
1 recipe_lightgbm &lt;tibble [1 × 4]&gt; &lt;opts[0]&gt; &lt;list [0]&gt;
2 recipe_mars     &lt;tibble [1 × 4]&gt; &lt;opts[0]&gt; &lt;list [0]&gt;</code></pre>
</div>
</div>
<p>I’ll talk about this a bit more later, but I think it’s worth noting how complex the thing we just did is. The <code>workflow_set()</code> function just created a “workflowset”, which is a tibble that contains multiple “workflows”<sup>6</sup>, each of which contains our “recipe” and a “model specification” that has some arguments “marked for tuning”<sup>7</sup>. The code we’ve written is pretty minimal – something like 20 lines of code to create the workflowset itself – but it winds up producing quite a number of abstractions that we need to reason around to actually understand what we’re doing.</p>
<p>But with our workflowset created, we’re now able to actually tune our component models! Before we do so, <strong>we need to set our future strategy back to sequential processing</strong>. The LightGBM implementation in bonsai uses parallel processing by default, and our tuning code will also try to run in parallel if we have a future backend registered, resulting in thread contention that keeps tuning from ever finishing. As such, we’ll run <code>plan("sequential")</code> before tuning:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb26" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb26-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plan</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sequential"</span>)</span></code></pre></div></div>
</div>
<p>And then we’ll perform our tuning. We need to tune each row of our workflowset – each unique recipe/model combination – separately.</p>
<p>Luckily, this is pretty easy using <code>workflowsets::workflow_map()</code>, which can apply a function to each workflow separately. Here, we’ll have it call <code>finetune::tune_race_anova()</code> to evaluate our models against 10 different sets of hyperparameters<sup>8</sup> – making a point of saving the predictions and workflows from each set, as we’ll need those to build our ensemble model later on.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb27" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb27-1">resamples <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> spatialsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spatial_clustering_cv</span>(cell_values, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb27-2">tuning <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> workflowsets<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">workflow_map</span>(</span>
<span id="cb27-3">    workflow_set,</span>
<span id="cb27-4">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tune_race_anova"</span>,</span>
<span id="cb27-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">resamples =</span> resamples,</span>
<span id="cb27-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">grid =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>,</span>
<span id="cb27-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metrics =</span> yardstick<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">metric_set</span>(yardstick<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>rmse),</span>
<span id="cb27-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">control =</span> finetune<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">control_race</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">save_pred =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">save_workflow =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb27-9">)</span></code></pre></div></div>
</div>
<p>Our <code>tuning</code> object now contains the outputs from the tuning process. We could use this to, for example, find the best evaluated set of hyperparameters for each of our models:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb28" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb28-1">tuning<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>result <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb28-2">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(tune<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>select_best, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metric =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rmse"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[[1]]
# A tibble: 1 × 7
   mtry trees min_n tree_depth learn_rate loss_reduction .config              
  &lt;int&gt; &lt;int&gt; &lt;int&gt;      &lt;int&gt;      &lt;dbl&gt;          &lt;dbl&gt; &lt;chr&gt;                
1   114   562    12          8    0.00372           18.6 Preprocessor1_Model09

[[2]]
# A tibble: 1 × 3
  num_terms prod_degree .config             
      &lt;int&gt;       &lt;int&gt; &lt;chr&gt;               
1         3           2 Preprocessor1_Model5</code></pre>
</div>
</div>
<p>But we aren’t actually all that interested in just-the-best. Because we’re going to be combining these models into ensembles, we can actually use all the models fit using <em>good enough</em> hyperparameter sets. That process can happen more or less automatically thanks to <a href="https://stacks.tidymodels.org/">stacks</a>, which will fit a lasso regression to determine which of the candidate models are better than useless and combine them into a final prediction:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb30" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb30-1">ensemble <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> stacks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stacks</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb30-2">  stacks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">add_candidates</span>(tuning) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb30-3">  stacks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">blend_predictions</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb30-4">  stacks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fit_members</span>()</span>
<span id="cb30-5"></span>
<span id="cb30-6">ensemble</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>── A stacked ensemble model ─────────────────────────────────────


Out of 8 possible candidate members, the ensemble retained 4.

Penalty: 0.1.

Mixture: 1.


The 4 highest weighted members are:</code></pre>
</div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 4 × 3
  member               type       weight
  &lt;chr&gt;                &lt;chr&gt;       &lt;dbl&gt;
1 recipe_lightgbm_1_09 boost_tree  0.418
2 recipe_mars_1_4      mars        0.315
3 recipe_lightgbm_1_05 boost_tree  0.169
4 recipe_mars_1_6      mars        0.132</code></pre>
</div>
</div>
<p>Now, the best thing about stacks<sup>9</sup> is that predicting with this model is dead simple. Even though predicting with our ensemble requires predicting with each of the selected earth and LightGBM models, and then predicting with the final lasso regression, from our perspective as users we can just call a single, typical <code>predict()</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb33" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb33-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">predict</span>(ensemble, cell_values) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 6 × 1
  .pred
  &lt;dbl&gt;
1  7.81
2 33.6 
3 27.2 
4 16.6 
5  1.25
6 43.8 </code></pre>
</div>
</div>
<p>For our carbon mapping workflows, however, we wind up generating predictions using the same 30 meter raster we grabbed our original predictor values from. Ensembles from stacks also work pretty well with this workflow:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb35" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb35-1">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">predict</span>(</span>
<span id="cb35-2">    terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat_indices),</span>
<span id="cb35-3">    ensemble</span>
<span id="cb35-4">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb35-5">    terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-07-19-tidymodels/index_files/figure-html/unnamed-chunk-20-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>We’ve also been using <a href="https://rstudio.github.io/bundle/">bundle</a> to help us serialize these ensemble models and load them in new R sessions. The flexibility of bundle is another great plus of leaning into tidymodels.</p>
</section>
</section>
<section id="some-complaints" class="level2">
<h2 class="anchored" data-anchor-id="some-complaints">Some complaints</h2>
<p>That said, as I mentioned at the start, I’ve certainly had a few unpleasant experiences through this transition, and I want to talk about them now.</p>
<p>I want to be clear at the start – I’m not intending for this to be some sort of “tidymodels skeptic” post. We’re still using tidymodels, and I think our codebase is more maintainable and – with some caveats – understandable following this ground-up rewrite. These are just the things that bit us during that rewriting process.</p>
<p>Another disclaimer here is that I interned with the tidymodels team back in 2022<sup>10</sup> and deeply enjoyed my time with the team. As a result, my experience with this rewrite was pretty nonstandard; I was already pretty familiar with some corners of the ecosystem (especially the parts touching rsample), am pretty willing to push past frustration with other packages based on my trust in the people developing them, and could DM a team member if I was ever particularly stuck. So this isn’t the most objective outsider experience you could ever imagine; it’s just the one I had.</p>
<section id="learning-one-thing-requires-learning-everything" class="level3">
<h3 class="anchored" data-anchor-id="learning-one-thing-requires-learning-everything">Learning one thing requires learning everything</h3>
<p>A recurring challenge I have with the tidymodels framework is that understanding one part of it can feel like it requires knowing all of it.</p>
<p>For example: if we want to automatically tune a hyperparameter, we need to set it to <code>tune()</code> inside of our model specification. But what does <code>tune()</code> actually do – how does it determine the hyperparameter values worth searching?</p>
<p>Well, the <code>tune()</code> function itself comes from the <a href="https://hardhat.tidymodels.org/">hardhat</a> package, which describes itself as a “developer focused package designed to ease the creation of new modeling packages”. That means that if I’m trying to figure out what this <code>tune()</code> function actually does, running <code>?tune()</code> winds up getting me the help page <a href="https://hardhat.tidymodels.org/reference/tune.html">of a developer-focused package</a>, and I start wondering how much of that developer-focused documentation I need to read and learn in order to tune my models.</p>
<p>For its part, the <code>tune()</code> help page actually points you to <code>tune::tune_grid()</code> and <code>tune::tune_bayes()</code> for more information on how these parameter values are being set. I don’t see any section here that explicitly says how parameter ranges are set when using <code>tune()</code>, but the “Parameter Grids” section of <code>tune_grid()</code> and the “Parameter Ranges and Values” section of <code>tune_bayes()</code> do both discuss tuning parameter ranges. These sections don’t mention <code>tune()</code>,<sup>11</sup> but <em>do</em> mention that <code>dials::finalize()</code> “can be used to derive the data-dependent parameters”.</p>
<p>We don’t call <code>dials::finalize()</code> ourselves – we don’t actually load <code>dials</code> at all – but if we go to the dials website, we could read the <a href="https://dials.tidymodels.org/articles/dials.html">Getting Started vignette</a><sup>12</sup> which starts to explain how parameter ranges are estimated. The “Unknown Values” section sort of explains how some ranges – like the max value for <code>mtry</code> and so on – are calculated, but I’m still a bit lost on how our <code>hardhat::tune()</code> gets turned into a dials function (if it does) and when these parameter ranges are set.</p>
<p>So I haven’t entirely figured out how these grids are made, but I do have five tabs open from three different package websites. At this point, I’m starting to get a little bit buried by the different layers going on here – and I don’t know how much of hardhat, dials, or tune I need to understand before things start making a bit more sense. I also don’t have an obvious first point of attack to answer this question, as the packages all link back to each other as places to get further information.<sup>13</sup></p>
<p>A bit of speculation: I think the issue I run into here is that the boundaries between the packages – where the logical breaks between concepts are <em>for developers</em> – don’t line up with where the logical breaks between concepts are for me, as a user. There’s no one package that owns the concept of “tuning a hyperparameter” – tune owns the grid searching, hardhat owns the infrastructure, dials owns the grid construction. As a result, each package’s documentation explains the piece of tuning that package covers, but none of them explain the concept as a whole.</p>
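<p>For what it’s worth, here is a minimal sketch of two of the pieces involved, assuming hardhat and dials are installed. It doesn’t answer where in the pipeline the handoff happens; it only shows that <code>hardhat::tune()</code> returns a bare placeholder call, and that the range for something like <code>mtry</code> lives on a dials parameter object whose upper bound is unknown until <code>dials::finalize()</code> sees some predictors. The use of <code>mtcars</code> is purely for illustration.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># hardhat::tune() only returns a placeholder call; it holds no ranges itself
placeholder &lt;- hardhat::tune()
is.call(placeholder)

# The dials parameter object for mtry starts with an unknown upper bound
mtry_param &lt;- dials::mtry()
dials::has_unknowns(mtry_param)

# finalize() fills that bound in from the predictors it is given
# (illustrative data: mtcars without its first column)
predictors &lt;- mtcars[, -1]
mtry_final &lt;- dials::finalize(mtry_param, predictors)
dials::range_get(mtry_final)</code></pre></div></div>
</div>
<p>After finalizing, the upper end of the range matches the number of predictor columns, which is the kind of “data-dependent parameter” the tune help pages allude to.</p>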
</section>
<section id="hidden-options-can-cause-delayed-failures" class="level3">
<h3 class="anchored" data-anchor-id="hidden-options-can-cause-delayed-failures">Hidden options can cause delayed failures</h3>
<p>This particular issue is more likely my fault – the behavior I’m about to complain about is documented, and is a valid design for an interface to have, but I got badly bitten by it and want to whinge.</p>
<p>The default arguments to tuning functions – specifically, those in <code>control_grid()</code> – are set (I think) to create small, efficient objects. This is a sensible goal to have. The problem is that it also makes these objects less useful, in a way that you can’t fix after the fact. You’ll only learn that your pipeline is broken after already spending a lot of time running computationally intensive tuning code.</p>
<p>To be more specific: say that you tried to tune the same workflow that we used above, but didn’t know that you needed to specify that you wanted to save the workflow:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb37" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb37-1">tuning <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> workflowsets<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">workflow_map</span>(</span>
<span id="cb37-2">    workflow_set,</span>
<span id="cb37-3">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tune_race_anova"</span>,</span>
<span id="cb37-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">resamples =</span> resamples,</span>
<span id="cb37-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">grid =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>,</span>
<span id="cb37-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metrics =</span> yardstick<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">metric_set</span>(yardstick<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>rmse)</span>
<span id="cb37-7">)</span></code></pre></div></div>
</div>
<p>If we then want to fit the best model, the obvious choice would be to use <code>tune::fit_best()</code>. But if we call it on our tuning results:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb38" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb38-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(</span>
<span id="cb38-2">    tuning<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>result <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb38-3">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(tune<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>fit_best, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metric =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rmse"</span>)</span>
<span id="cb38-4">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Error in FUN(X[[i]], ...) : 
  ✖ The control option `save_workflow = TRUE` should be used when tuning.</code></pre>
</div>
</div>
<p>We get an error. Now, if we had read the documentation for <code>fit_best()</code>, we’d see that the documentation for the <code>x</code> argument ends by saying “The control option <code>save_workflow = TRUE</code> should have been used” – so this <em>is</em> documented. But if you missed that, or if you’re writing your pipeline as you go,<sup>14</sup> you’re just told – in past tense – that you did the wrong thing and need to try again, throwing out the computational time you spent tuning the first time around.<sup>15</sup></p>
<p>Now, the version of this that actually bit me came when I tried to add my tuned models to an ensemble using <code>add_candidates()</code>. This function has a similar pattern to <code>fit_best()</code>, except it wants you to have saved both your workflow and your predictions:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb40" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb40-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(</span>
<span id="cb40-2">    stacks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stacks</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb40-3">        stacks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">add_candidates</span>(tuning)</span>
<span id="cb40-4">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Error in .f(out, .x[[x_i]], .y[[y_i]], ...) : 
  The inputted `candidates` argument was not generated with the
appropriate control settings. Please see `control_stack()`
(`?stacks::control_stack()`).</code></pre>
</div>
</div>
<p>And again, this is documented, for instance in the <a href="https://stacks.tidymodels.org/articles/basics.html#define-candidate-ensemble-members">“define candidate ensemble members”</a> section of one vignette<sup>16</sup> and in <a href="https://stacks.tidymodels.org/reference/add_candidates.html">one of the argument definitions for <code>add_candidates()</code></a>. But I didn’t see it when I skimmed the stacks documentation initially, and as a result I wound up wasting about 26 hours of tuning without saving workflows or predictions. The defaults here are good for making your objects smaller, but mean that tidymodels workflows fail <em>extremely</em> slowly.</p>
<p>Part of the issue with this particular example is that the way these options get set is pretty indirect, and I get a bit lost in the documentation. My tuning here is controlled by <code>workflowsets::workflow_map()</code>, which has <a href="https://workflowsets.tidymodels.org/reference/workflow_map.html">five documented arguments</a>: the workflow set, the tuning function to run, a flag for verbosity and a random seed, and then <code>...</code>, documented as “Options to pass to the modeling function”. The word “control” doesn’t appear on this page.</p>
<p>A minor nit is that I don’t think of <code>tune_*</code> as being <em>modeling</em> functions, but rather <em>tuning</em> functions, so <code>...</code> doesn’t jump out at me as a place to set my <code>save_pred</code> and <code>save_workflow</code> arguments. But I think a bigger problem I had is that if I <em>do</em> click over into the documentation for my tuning function, in this case <a href="https://finetune.tidymodels.org/reference/tune_race_anova.html"><code>finetune::tune_race_anova</code></a>, I still don’t see obvious places to set these arguments.</p>
<p>Instead, the <code>tune_*</code> functions have a <code>control</code> argument which itself takes the output of another function. There are a surprising number of these control functions – at least <a href="https://finetune.tidymodels.org/reference/control_race.html"><code>control_race()</code></a> and <a href="https://finetune.tidymodels.org/reference/control_sim_anneal.html"><code>control_sim_anneal()</code></a> from finetune, plus <a href="https://tune.tidymodels.org/reference/control_bayes.html"><code>control_bayes()</code></a> and <a href="https://tune.tidymodels.org/reference/control_grid.html"><code>control_grid()</code></a> from tune<sup>17</sup> – which I <em>think</em> are paired 1:1 with tuning functions.</p>
<p>I’ll be honest, I don’t actually know what these functions do at a fundamental level,<sup>18</sup> or if they’re sharing some plumbing under the hood which makes this indirection much more efficient for the developers. All I know is that I need to go through a decent number of man pages and a bit of guesswork to land on this final chunk of code:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb42" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb42-1">workflowsets<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">workflow_map</span>(</span>
<span id="cb42-2">    workflow_set,</span>
<span id="cb42-3">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"tune_race_anova"</span>,</span>
<span id="cb42-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">resamples =</span> resamples,</span>
<span id="cb42-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">grid =</span> grid,</span>
<span id="cb42-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">metrics =</span> yardstick<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">metric_set</span>(yardstick<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>rmse),</span>
<span id="cb42-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">control =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">control_race</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">verbose =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">save_workflow =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">save_pred =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb42-8">  )</span></code></pre></div></div>
<p>So, in order to override the default arguments so that I can fit my models or create an ensemble after tuning, I need to pass the new values to <code>control_race()</code>, and then pass the output from <code>control_race()</code> to <code>tune_race_anova()</code> by providing it as a named argument <code>control</code> to <code>...</code>. That’s a lot of deep magic to learn – and a lot of concepts to be thinking about at the same time in order to use this function.</p>
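<p>Stripped of the tidymodels specifics, that chain looks roughly like the toy sketch below. Every name in it (<code>control_sketch()</code>, <code>tune_sketch()</code>, <code>map_sketch()</code>) is made up for illustration; it is plain base R showing how a control object can ride along through <code>...</code>, and makes no claim about how workflowsets or finetune are actually implemented.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Toy stand-ins only: none of these are tidymodels functions.

# A "control function" collects named options, with defaults, into a list
control_sketch &lt;- function(verbose = FALSE, save_pred = FALSE, save_workflow = FALSE) {
  list(verbose = verbose, save_pred = save_pred, save_workflow = save_workflow)
}

# The "tuning function" reads its options off the control object
tune_sketch &lt;- function(object, control = control_sketch()) {
  list(
    object = object,
    saved_pred = control$save_pred,
    saved_workflow = control$save_workflow
  )
}

# The "mapping function" takes the tuning function's name as a string
# and forwards everything in `...` to it, once per object
map_sketch &lt;- function(objects, fn, ...) {
  lapply(objects, function(obj) do.call(fn, list(obj, ...)))
}

res &lt;- map_sketch(
  list("model_a", "model_b"),
  "tune_sketch",
  control = control_sketch(save_pred = TRUE, save_workflow = TRUE)
)

# Without `control = ...`, the defaults quietly apply
res_default &lt;- map_sketch(list("model_a"), "tune_sketch")</code></pre></div></div>
</div>
<p>In this toy version, nothing about <code>map_sketch()</code> itself hints that <code>control</code> is an option; you only find it by reading the signature of the function named in <code>fn</code>, and then the signature of the control function it expects.</p>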
<p>In some sense, this is another flavor of “learning one thing requires learning everything”. Because a lot of functions that you’d use later in a workflow require specific non-default arguments in earlier steps, you need to know every function you intend on using when you start writing a pipeline. And as understanding the arguments to these functions generally requires understanding multiple other functions in multiple other packages, it becomes difficult to learn the framework as you go, or to think about modeling pipelines in smaller chunks.</p>
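<p>One way to make this kind of mistake fail fast instead of slowly is a small pre-flight check that runs before any tuning starts. The helper below is hypothetical and not part of tune, finetune, or stacks. It assumes the control object can be indexed like a list with <code>save_pred</code> and <code>save_workflow</code> elements, and it uses a plain list as a stand-in rather than the output of a real control function.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper: stop before tuning if options that later steps
# depend on are not switched on in the control object.
check_control &lt;- function(control, needs = c("save_pred", "save_workflow")) {
  # isTRUE() also treats a missing element (NULL) as "not switched on"
  is_on &lt;- vapply(needs, function(opt) isTRUE(control[[opt]]), logical(1))
  if (!all(is_on)) {
    stop(
      "Set these control options to TRUE before tuning: ",
      paste(needs[!is_on], collapse = ", "),
      call. = FALSE
    )
  }
  invisible(control)
}

# A plain list standing in for a control object with one option left off
ctrl &lt;- list(verbose = TRUE, save_pred = TRUE, save_workflow = FALSE)

# Capture the error message rather than halting the script
msg &lt;- tryCatch(check_control(ctrl), error = conditionMessage)</code></pre></div></div>
</div>
<p>Calling something like this on the control object at the top of a pipeline costs nothing compared to a tuning run, though it still requires knowing up front which downstream functions need which options.</p>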
</section>
</section>
<section id="were-still-using-tidymodels" class="level2">
<h2 class="anchored" data-anchor-id="were-still-using-tidymodels">We’re still using tidymodels</h2>
<p>With those complaints now vented, I wanted to close by saying that the pipeline I was building – the one that originally motivated this post – is still using tidymodels, and is going to keep using the framework for the foreseeable future. There are a few really nice features of the framework that motivated the switch:</p>
<ul>
<li>First, <a href="https://finetune.tidymodels.org/">finetune</a> is really cool. We’ve been using finetune to conduct a much more thorough grid search than would be possible if we needed to see every set of hyperparameters through to the end, and to do so faster than we could do the smaller searches we had been using.</li>
<li>Second, <a href="https://stacks.tidymodels.org/">stacks</a> is also great. A generic interface to fit ensemble models (<a href="https://www.mm218.dev/posts/2021/01/model-averaging/">something I’ve written about before</a>), which easily accepts new models and writes its own predict methods? Sign me up. Wanting to use these two packages then motivated us to use tune, and deal with the pain points I mentioned above.<sup>19</sup></li>
<li>Third, even before we adopted the full tidymodels framework, we were already using packages like <a href="https://rsample.tidymodels.org/">rsample</a> and <a href="https://yardstick.tidymodels.org/">yardstick</a> (not to mention my extensions for these packages, <a href="https://spatialsample.tidymodels.org/">spatialsample</a> and <a href="https://docs.ropensci.org/waywiser/">waywiser</a><sup>20</sup>) to assess our models. We’ve always found a number of individual tidymodels packages to be extremely useful; most of our pain points (and most of the complaints above) come when we need to hand things off between two packages.</li>
<li>And finally, I do think our final pipeline here is more readable, and will likely be more maintainable moving forward. Only time will tell, of course, but I’d rather be reading code that included a lot of <code>tune_race_anova()</code> and <code>boost_tree()</code> than the model-specific interfaces and nested <code>lapply()</code> calls that it replaced.</li>
</ul>
<p>So given all that, we’re still using the tidymodels framework. By my estimation, the benefits have been worth the growing pains so far, and with our modeling pipelines now operational we’re hoping that we’re mostly finished with the worst of the surprises. We’re just making extra sure to highlight the bandaids we’ve had to add to the pipeline, wherever we got caught on a rough edge – and hopefully this post helps flag a few of them for you as well!</p>


</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>A slightly less relevant change is that I wrote most of the pipeline while <a href="https://github.com/posit-dev/positron">beta testing Posit’s new Positron IDE</a>, which I mention mostly as a humblebrag. I generally really like Positron – this post was written using it – though I have a really hard time specifying why. It just <em>feels</em> nice to use.↩︎</p></li>
<li id="fn2"><p>By the way – if you work on a package I mention in this post, and it would be useful for you to say “Used in New York State’s Forest Carbon Assessment” somewhere, drop me a line. We’re part of New York’s <a href="https://climate.ny.gov/resources/scoping-plan">Climate Act Scoping Plan</a> and <a href="https://dec.ny.gov/nature/forests-trees/climate-change/climate-smart-farms-forests">partner with the state DEC</a> on our mapping, and your work is helping make that possible.↩︎</p></li>
<li id="fn3"><p>I’m throwing this in a footnote because it doesn’t <em>really</em> fit anywhere else: There’s a trendy sort of post where people talk about how the tidyverse or tidymodels ecosystem being made up of many packages is itself inherently bad. I think this is a bad opinion. We’re asking these ecosystems to do a lot of complicated things for us, which typically means we’re asking them to include a lot of functions and code. It’s natural – and good practice – for these functions to get broken up into logical units, each providing an interface for users and the rest of the ecosystem to build upon. As such, I think that measuring supply-chain risk based on a pure count of dependencies is the wrong approach. If you’re worried about dependencies changing their interfaces or getting archived and breaking your workflows, I personally think you should be more concerned about multitool-packages developed by non-tenured academics than about the ecosystem maintained by a small team of professional software engineers.↩︎</p></li>
<li id="fn4"><p>I’m realizing now, four-ish years into this job, that I don’t actually know the right phrasing for FIA data. It’s not “confidential”, as no one in our group has a security clearance, but we have access to plot locations under an MOU with the Forest Service. I guess “non-public” is the best I’ve got at the moment.↩︎</p></li>
<li id="fn5"><p>There’s a note in the tune documentation that some models – specifically earth models, but presumably LightGBM could do the same thing – <a href="https://tune.tidymodels.org/articles/extras/optimizations.html#sub-model-speed-ups">can evaluate more models than are fit, speeding up tuning times</a>. I don’t actually see any information on how to combine this trick with tune itself, though, so we’re just going to pretend I didn’t see that part of the docs.↩︎</p></li>
<li id="fn6"><p>Which come from a package, workflows, that we won’t ever call explicitly here.↩︎</p></li>
<li id="fn7"><p>With parameter values determined by the dials package which, again, we won’t call explicitly here↩︎</p></li>
<li id="fn8"><p>Which is a small number chosen to run quickly for this blog post; in practice we tune with a much, much larger grid.↩︎</p></li>
<li id="fn9"><p>I mean, in my opinion.↩︎</p></li>
<li id="fn10"><p>Go <a href="https://github.com/tidymodels/spatialsample">star spatialsample on GitHub</a>↩︎</p></li>
<li id="fn11"><p>In the context of grid creation – <code>tune_grid()</code> mentions <code>tune()</code> in the context of user-provided grids.↩︎</p></li>
<li id="fn12"><p>Not linked from the <code>finalize()</code> page, but in the top navigation bar.↩︎</p></li>
<li id="fn13"><p>For what it’s worth, a full explanation of how tuning works is in <a href="https://www.tmwr.org/tuning">the tidymodels book</a>, but I didn’t see that linked anywhere in this chain.↩︎</p></li>
<li id="fn14"><p>Meaning you didn’t look up the code you needed post-tuning until you were done tuning.↩︎</p></li>
<li id="fn15"><p>If you’re tuning a workflow, I think there <em>is</em> a workaround where you can use <code>select_best()</code>, <code>finalize_workflow()</code>, <code>fit()</code>, and <code>extract_workflow()</code> to get your fitted workflow without needing to retune everything here. I’m not entirely sure – I’m guessing based off of the “Details” and “See also” sections of the <code>fit_best()</code> documentation, and this doesn’t seem to work with workflowsets.↩︎</p></li>
<li id="fn16"><p>Though it isn’t spelled out in either of the other vignettes, or in the README.↩︎</p></li>
<li id="fn17"><p>I <em>think</em> it’s most fair to not include <code>control_resamples()</code> or <code>control_last_fit()</code> here, because I <em>think</em> they’re not used for tuning, but I don’t actually know. A particular oddity is that <code>control_grid()</code> is listed in the tune reference as a developer function, but is used throughout the documentation, while <code>control_last_fit()</code> is listed as a “function for tuning” and isn’t used in any examples.↩︎</p></li>
<li id="fn18"><p>Creating a list with a finite set of named elements, with some default values? The print method just says <code>&lt;x&gt; control object</code>.↩︎</p></li>
<li id="fn19"><p>We had been hand-rolling our own ensemble code, which was <em>mostly</em> painless. The hardest part initially was writing a predict method that would generate predictions from all of our component models, and then combine those into a final ensemble prediction. The function we wrote to handle this could easily extend to use multiple models from the same package – if we wanted to add another LightGBM to the mix, for instance – but adding new packages required adding new complicated chunks to the predict method. We were interested in making future extensions easier, so stacks seemed like an obvious choice – but our home-rolled grid search code was a lot more straightforward than tune, so we were initially resistant to making the jump.↩︎</p></li>
<li id="fn20"><p>The waywiser package was originally developed explicitly for the NYS carbon mapping project, as it happens; we use just about every function in waywiser as part of our assessment process.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>tidymodels</category>
  <category>R</category>
  <category>Tutorials</category>
  <category>AGB</category>
  <category>Data science</category>
  <category>Spatial</category>
  <category>geospatial data</category>
  <category>machine learning</category>
  <guid>https://mm218.dev/posts/2024-07-19-tidymodels/</guid>
  <pubDate>Fri, 19 Jul 2024 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2024-07-19-tidymodels/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>New allometric models for the USA create a step-change in forest carbon estimation, modeling, and mapping</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2024-05-08-nsvb/</link>
  <description><![CDATA[ 





<p>Late last year, the United States Department of Agriculture’s Forest Inventory and Analysis Program (FIA) quietly announced that they were changing how they calculated vegetation structural parameters – specifically the volume, biomass, and carbon content of trees across the USA – to use a <a href="https://www.fs.usda.gov/research/programs/fia/nsvb">brand-new set of allometric models</a>. These models are a great step forward. They do a much better job of accounting for the non-merchantable parts of a tree and provide a nationally consistent method for how we calculate these values, replacing the old system which had sharp deviations across regional (not ecological) boundaries.</p>
<p>But fundamentally, this change means that going forward FIA is providing a different set of estimates relative to what they have provided for years. Anyone who’s building models from FIA data is going to need to update their workflows to adapt. In <a href="https://arxiv.org/abs/2405.04507">a new preprint out today</a>, we – a team led by Lucas Johnson and with Grant Domke and Colin Beier – start exploring what that might mean for model-based carbon estimates and maps.</p>
<p>What we find is that, generally, the shift to these new allometrics is complicated! The new values aren’t pure rescalings of old estimates – while you can get <em>close</em> by using a linear model to update old estimates to the new allometrics, it’s not a perfect match, and some of the remaining variation is likely due to environmental factors and species composition across the landscape. And while model-based estimation approaches are still effective at expanding from field measurements to point-in-time carbon estimates across the landscape, we find that the new allometrics attribute a lot of forest growth to forests with full crowns, where passive remote sensing based models tend to saturate. That means that these models, even though they’re good at making point-in-time estimates, are worse at tracking growth over time in mature forested landscapes.</p>
<div id="fig-stockchange" class="quarto-float quarto-figure quarto-figure-center anchored" alt="Three maps of New York State, showing estimates of carbon stock changes between 2005 and 2019 under old allometrics and new allometrics, as well as the difference between these two maps.">
<figure class="quarto-float quarto-float-fig figure">
<div aria-describedby="fig-stockchange-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
<img src="https://mm218.dev/posts/2024-05-08-nsvb/stock_change_maps.png" class="img-fluid figure-img" alt="Three maps of New York State, showing estimates of carbon stock changes between 2005 and 2019 under old allometrics and new allometrics, as well as the difference between these two maps.">
</div>
<figcaption class="quarto-float-caption-bottom quarto-float-caption quarto-float-fig" id="fig-stockchange-caption-0ceaefa1-69ba-4598-a22c-09a6ac19f8ca">
Figure&nbsp;1: Stock-change map comparison. a) 15-year NSVB stock-change map (<img src="https://latex.codecogs.com/png.latex?%5CDelta"> NSVB; 2019 AGB - 2005 AGB). b) 15-year CRM stock-change map (<img src="https://latex.codecogs.com/png.latex?%5CDelta"> CRM; 2019 AGB - 2005 AGB). c) Stock-change difference map computed as <img src="https://latex.codecogs.com/png.latex?%5CDelta"> NSVB minus <img src="https://latex.codecogs.com/png.latex?%5CDelta"> CRM. Mapped values capped at +/- 40 <img src="https://latex.codecogs.com/png.latex?%5Coperatorname%7BMg%5C%20ha%5E%7B-1%7D%7D">.
</figcaption>
</figure>
</div>
<p>There’s plenty more information about these challenges, alongside other explorations of what the new allometrics will mean for model-based estimation workflows, <a href="https://arxiv.org/abs/2405.04507">in the preprint</a>.</p>
<p>This was a really fun paper to write – and as is tradition, we originally thought it would be a “quick hit” sort of project when we started on it back in November, when the new allometrics were released. It’s also (I think!) my first time actually “co-writing” a paper, where I wrote the Introduction and Methods section and then handed that shell to Lucas so he could actually do the real work involved. I enjoyed that process and think it produced a good result here; I’m really happy to see this preprint go out!</p>



 ]]></description>
  <category>ecology</category>
  <category>papers</category>
  <category>remote sensing</category>
  <category>machine learning</category>
  <category>earth science</category>
  <category>AGB</category>
  <guid>https://mm218.dev/posts/2024-05-08-nsvb/</guid>
  <pubDate>Wed, 08 May 2024 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2024-05-08-nsvb/banner.png" medium="image" type="image/png" height="96" width="144"/>
</item>
<item>
  <title>Test warnings faster</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2024-04-12-testing-expensive-functions/</link>
  <description><![CDATA[ 





<p>Here’s another small little note from package development corner (see also <a href="https://www.mm218.dev/posts/2023-11-07-classed-errors/">using classed errors in rlang</a>, <a href="https://www.mm218.dev/posts/2023-10-27-minimal-environments/">executing untrusted code in minimal environments</a>, and <a href="https://www.mm218.dev/posts/2023-08-29-allocations/">not pre-allocating vectors isn’t as bad as it used to be</a>). Say you’ve got some function in your package that takes a <em>long</em> time to execute:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">func <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) {</span>
<span id="cb1-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Sys.sleep</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb1-3">  x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>L</span>
<span id="cb1-4">}</span></code></pre></div></div>
</div>
<p>Maybe the function is downloading data over a network connection, maybe it’s doing a <em>ton</em> of computations, maybe it’s not written super efficiently but you’ve got other priorities right now – the point is, this function takes a long time to execute, and that’s not going to change.</p>
<p>But you still want to properly check user inputs and throw warnings/errors as appropriate. For instance, a clear issue with this function is that it will overflow to NA when given a large integer <code>x</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">func</span>(.Machine<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>integer.max)</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in x * 2L: NAs produced by integer overflow</code></pre>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] NA</code></pre>
</div>
</div>
<p>So maybe we add some code to give a friendly warning about this situation, to hopefully make the specific issue clearer for our users:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">func <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) {</span>
<span id="cb5-2">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> (.Machine<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>integer.max <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>L)) {</span>
<span id="cb5-3">    rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">warn</span>(</span>
<span id="cb5-4">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"`x` is too large, so this function will return NA"</span>,</span>
<span id="cb5-5">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"big_x"</span></span>
<span id="cb5-6">    )</span>
<span id="cb5-7">  }</span>
<span id="cb5-8">  </span>
<span id="cb5-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Sys.sleep</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb5-10">  x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>L</span>
<span id="cb5-11">}</span></code></pre></div></div>
</div>
<p>And because we’re diligent package developers, we want to test to make sure that this warning fires when we’d expect. Since we’re <a href="https://www.mm218.dev/posts/2023-11-07-classed-errors/">using a classed error</a>, we can write a test to make sure that specifically our <code>big_x</code> warning fires when we pass an <code>x</code> that’s too big:<sup>1</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(testthat)</span>
<span id="cb6-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">suppressMessages</span>(testthat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">local_edition</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>))</span>
<span id="cb6-3"></span>
<span id="cb6-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">test_that</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"large integers get a custom warning"</span>, {</span>
<span id="cb6-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expect_warning</span>(</span>
<span id="cb6-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">func</span>(.Machine<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>integer.max),</span>
<span id="cb6-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"big_x"</span></span>
<span id="cb6-8">  )</span>
<span id="cb6-9">})</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Test passed </code></pre>
</div>
</div>
<p>This is all good practice!<sup>2</sup> But it has one big downside: whenever we run this test, we need to wait for the entire function to finish before our test passes. Which means for expensive functions, these can be pretty expensive tests:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">tictoc<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tic</span>()</span>
<span id="cb8-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">test_that</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"large integers get a custom warning"</span>, {</span>
<span id="cb8-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expect_warning</span>(</span>
<span id="cb8-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">func</span>(.Machine<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>integer.max),</span>
<span id="cb8-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"big_x"</span></span>
<span id="cb8-6">  )</span>
<span id="cb8-7">})</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Test passed </code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">tictoc<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">toc</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>3.056 sec elapsed</code></pre>
</div>
</div>
<p>What we can do instead is use <code>tryCatch()</code> to promote this specific warning into an error, aborting the function (and not triggering any of the expensive code). By giving that new error its own class, and using <code>expect_error()</code> to check for an error of that class, we’re able to make sure that our warning has fired (and no other errors happened) without needing to wait:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">tictoc<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tic</span>()</span>
<span id="cb12-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">test_that</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"large integers get a custom warning"</span>, {</span>
<span id="cb12-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expect_error</span>(</span>
<span id="cb12-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tryCatch</span>(</span>
<span id="cb12-5">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">func</span>(.Machine<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>integer.max),</span>
<span id="cb12-6">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">big_x =</span> \(w) rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">abort</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"the warning fired"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"success"</span>)</span>
<span id="cb12-7">    ),</span>
<span id="cb12-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"success"</span></span>
<span id="cb12-9">  )</span>
<span id="cb12-10">})</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Test passed </code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1">tictoc<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">toc</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>0.028 sec elapsed</code></pre>
</div>
</div>
<p>Now, an obvious downside is that we’re no longer testing to make sure the function works <em>after</em> the warning gets fired. In this case, where we’re expecting that triggering this warning means this function will return <code>NA</code>, we should probably be testing to make sure that this function actually <em>does</em> return <code>NA</code> after the warning fires. But in plenty of other situations this can be a useful way to speed up your test suites while still making sure that you’re giving your users as much feedback as possible, when you’re expecting to give it.</p>
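<p>One way to cover that gap is to keep a single slower test that checks both the warning and the return value. This is only a sketch: <code>slow_func()</code> is a hypothetical stand-in for the function in this post, written so the example runs on its own, and the <code>Sys.sleep()</code> call is a placeholder for the expensive code.</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(testthat)

# Hypothetical stand-in: warns with class "big_x" on large inputs,
# then does some slow work and returns NA.
slow_func &lt;- function(x) {
  if (x &gt;= .Machine$integer.max) {
    rlang::warn("x is too large", class = "big_x")
    Sys.sleep(0.1) # placeholder for the expensive code
    return(NA)
  }
  x
}

test_that("large integers warn and then return NA", {
  # Capture the return value while checking for the classed warning
  expect_warning(
    result &lt;- slow_func(.Machine$integer.max),
    class = "big_x"
  )
  expect_true(is.na(result))
})</code></pre></div></div>
</div>
<p>You pay the full runtime once here, and can use the faster pattern for any other tests that only need to know the warning fired.</p>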




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>I’ve been bitten so many times by tests that expect <em>a</em> warning, rather than a <em>specific</em> warning. Giving functions malformed input often triggers multiple warnings, so if you aren’t checking for your specific warning message or class, you might be surprised that your custom warning never actually fires!↩︎</p></li>
<li id="fn2"><p>Well, the classed warnings and testing specifically for that warning. The function is a mess.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <category>Package development</category>
  <guid>https://mm218.dev/posts/2024-04-12-testing-expensive-functions/</guid>
  <pubDate>Fri, 12 Apr 2024 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2024-04-12-testing-expensive-functions/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>rsi 0.2.0 is now on CRAN!</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2024-03-29-rsi-020/</link>
  <description><![CDATA[ 





<p>I’m delighted to announce that version 0.2.0 of rsi, my package for handling common spatial ML data pre-processing tasks, is <a href="https://cran.r-project.org/package=rsi">now officially on CRAN</a>. rsi aims to handle downloading, masking, rescaling, and compositing data from STAC endpoints, computing spectral indices from that same data, and wrangling the outputs into bricks ready for modeling workflows – and to do so in a user-friendly and extensible way. This release adds wrappers for more data sources, makes it easier to download high-quality water data from Landsat, and fixes some bugs while simplifying the internals of the package.</p>
<p>You can install rsi from CRAN via:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install.packages</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsi"</span>)</span></code></pre></div></div>
<p>This post will walk through a few of the most user-visible changes, starting with…</p>
<section id="downloading-water-data" class="level2">
<h2 class="anchored" data-anchor-id="downloading-water-data">Downloading water data</h2>
<p>In older versions of rsi, the default <code>landsat_mask_function()</code> would mask your data so that your final files (and composites) only contained the highest quality observations over land. That meant that waterbodies (like the large area in the top left of this image) would always be empty:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rsi)</span>
<span id="cb2-2">future<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plan</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"multisession"</span>)</span>
<span id="cb2-3"></span>
<span id="cb2-4">aoi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">76.1376841</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">43.0351335</span>))</span>
<span id="cb2-5">aoi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_set_crs</span>(sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_sfc</span>(aoi), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4326</span>)</span>
<span id="cb2-6">aoi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_buffer</span>(sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_transform</span>(aoi, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5070</span>), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>)</span>
<span id="cb2-7"></span>
<span id="cb2-8">landsat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_landsat_imagery</span>(</span>
<span id="cb2-9">  aoi,</span>
<span id="cb2-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">start_date =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-06-01"</span>,</span>
<span id="cb2-11">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_date =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-08-31"</span>,</span>
<span id="cb2-12">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb2-13">)</span>
<span id="cb2-14"></span>
<span id="cb2-15">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-03-29-rsi-020/index_files/figure-html/unnamed-chunk-1-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This works great if you only care about land observations, but has an obvious flaw otherwise. Thanks to <span class="citation" data-cites="mateuszrydzik">@mateuszrydzik</span>, as of version 0.2.0 <code>landsat_mask_function()</code> has an <code>include</code> argument, which you can use to also include high-quality observations over water in your final outputs:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">landsat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_landsat_imagery</span>(</span>
<span id="cb3-2">  aoi,</span>
<span id="cb3-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">start_date =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-06-01"</span>,</span>
<span id="cb3-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_date =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-08-31"</span>,</span>
<span id="cb3-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mask_function =</span> \(r) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">landsat_mask_function</span>(r, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">include =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"both"</span>),</span>
<span id="cb3-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb3-7">)</span>
<span id="cb3-8"></span>
<span id="cb3-9">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-03-29-rsi-020/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>You can also set <code>include = "water"</code> to <em>only</em> include data over waterbodies, and exclude all data over land.</p>
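<p>As a sketch of that water-only option, the call below reuses the <code>aoi</code> and date range from the examples above and changes only the <code>include</code> value. It needs network access to run, and the name <code>landsat_water</code> is just illustrative:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Assumes library(rsi) and the `aoi` object defined earlier in this post
landsat_water &lt;- get_landsat_imagery(
  aoi,
  start_date = "2021-06-01",
  end_date = "2021-08-31",
  # Keep only high quality observations over water
  mask_function = \(r) landsat_mask_function(r, include = "water"),
  output_filename = tempfile(fileext = ".tif")
)

terra::plot(terra::rast(landsat_water))</code></pre></div></div>
</div>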
</section>
<section id="downloading-even-more-data" class="level2">
<h2 class="anchored" data-anchor-id="downloading-even-more-data">Downloading even more data</h2>
<p>Two new functions in this release provide friendly wrappers around <code>get_stac_data()</code> to help you access specific data sources.</p>
<p>First, thanks to <span class="citation" data-cites="h-a-graham">@h-a-graham</span>, the new <code>get_alos_palsar_imagery()</code> function provides a wrapper for accessing data from ALOS PALSAR:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">alos <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_alos_palsar_imagery</span>(</span>
<span id="cb4-2">  aoi,</span>
<span id="cb4-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-06-01"</span>,</span>
<span id="cb4-4">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-09-01"</span></span>
<span id="cb4-5">)</span>
<span id="cb4-6"></span>
<span id="cb4-7">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(alos))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-03-29-rsi-020/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>And separately, the new <code>get_naip_imagery()</code> function provides access to data from the National Agriculture Imagery Program across the United States:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">naip <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_naip_imagery</span>(</span>
<span id="cb5-2">  aoi,</span>
<span id="cb5-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2018-01-01"</span>,</span>
<span id="cb5-4">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-01-01"</span>,</span>
<span id="cb5-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">pixel_x_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>,</span>
<span id="cb5-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">pixel_y_size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span></span>
<span id="cb5-7">)</span>
<span id="cb5-8"></span>
<span id="cb5-9">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plotRGB</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(naip))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-03-29-rsi-020/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
</section>
<section id="new-vignette" class="level2">
<h2 class="anchored" data-anchor-id="new-vignette">New vignette</h2>
<p>One of the last significant user-facing changes is the addition of <a href="https://permian-global-research.github.io/rsi/articles/How-can-I-.html">a new vignette, called “How can I…?”</a>. This vignette is meant to collect common use-cases into a single document, providing users with a “cookbook” containing methods they might use to approach their current problems. If you’ve got a use-case that took you a moment to figure out, or a problem that you think rsi <em>should</em> be able to solve, let me know through an issue on GitHub so I can incorporate it into this vignette!</p>
<p>The other improvements in this release focus mostly on bug squashing – including a nasty bug where downloading multiple tiles using <code>composite_function = NULL</code> could fail – and simplifying the internals of <code>get_stac_data()</code> to make it more maintainable and extensible into the future.</p>
</section>
<section id="acknowledgments" class="level2">
<h2 class="anchored" data-anchor-id="acknowledgments">Acknowledgments</h2>
<p>As always, huge thanks to the folks who have been involved in testing and improving this package since our last release: <a href="https://github.com/agronomofiorentini">@agronomofiorentini</a>, <a href="https://github.com/h-a-graham">@h-a-graham</a>, and <a href="https://github.com/mateuszrydzik">@mateuszrydzik</a>. It’s extremely appreciated.</p>


</section>

 ]]></description>
  <category>R</category>
  <category>Spatial</category>
  <category>geospatial data</category>
  <category>R packages</category>
  <guid>https://mm218.dev/posts/2024-03-29-rsi-020/</guid>
  <pubDate>Fri, 29 Mar 2024 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2024-03-29-rsi-020/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>rsi is now on CRAN!</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2024-01-10-rsi-cran/</link>
  <description><![CDATA[ 





<p>I’m thrilled to announce that version 0.1.0 of rsi, a new package for handling common spatial ML data pre-processing tasks, is <a href="https://cran.r-project.org/package=rsi">now officially on CRAN</a>. rsi helps you download<sup>1</sup> data from <a href="https://stacspec.org/en">STAC APIs</a>, calculate spectral indices from that data (with an interface to the <a href="https://github.com/davemlz/awesome-spectral-indices">Awesome Spectral Indices project</a>), and efficiently stack rasters together to help build predictor bricks.</p>
<p>You can install it from CRAN via:</p>
<div class="code-copy-outer-scaffold"><div class="sourceCode" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">install.packages</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsi"</span>)</span></code></pre></div></div>
<p>rsi is motivated by my own work doing landscape-level natural resources mapping, both with <a href="https://permianglobal.com/">Permian Global Research</a> (who have been incredibly supportive of me open-sourcing this project that originated from my contract work) and with <a href="https://cafri-ny.org/">CAFRI</a> (see for instance <a href="https://doi.org/10.1016/j.foreco.2023.121348">our newest Landsat-based forest carbon maps</a>). Most of these projects have a pretty consistent data-processing workflow: I need to go get data from somewhere, calculate indices and other predictors against it, and then glue all those derived products together into a predictor brick I can actually extract my predictor values from. rsi handles every step of that process and does so efficiently, letting you spend less time on your pre-processing and more time on your actual models.</p>
<p>I’ve written about rsi a few times before, including <a href="https://www.mm218.dev/posts/2023-10-27-minimal-environments/">about how rsi does sandboxing when running code from untrusted sources</a> and <a href="https://www.mm218.dev/posts/2023-11-21-rsi-null/">how to make the most of the <code>get_stac_data()</code> family of functions</a>. This post is going to provide a more holistic introduction to rsi,<sup>2</sup> focusing on the main functions in the package and how you might use them as part of your pre-processing workflows.</p>
<section id="downloading-data-from-stac-apis" class="level2">
<h2 class="anchored" data-anchor-id="downloading-data-from-stac-apis">Downloading data from STAC APIs</h2>
<p>Perhaps the most useful piece of rsi<sup>3</sup> is the <code>get_stac_data()</code> family of functions, which help you download data from <a href="https://stacspec.org/en">STAC APIs</a>. As an example, let’s say that we’ve used sf to define some area of interest that we want to download data for:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">aoi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">76.1376841</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">43.0351335</span>))</span>
<span id="cb2-2">aoi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_set_crs</span>(sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_sfc</span>(aoi), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4326</span>)</span>
<span id="cb2-3">aoi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_buffer</span>(sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_transform</span>(aoi, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5070</span>), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>)</span></code></pre></div></div>
</div>
<p>If we wanted to get land cover data for this area from 2021, using <a href="https://planetarycomputer.microsoft.com/dataset/usgs-lcmap-conus-v13">the USGS LCMAP product from Planetary Computer</a>, we could use the <code>get_stac_data()</code> function from rsi like so:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rsi)</span>
<span id="cb3-2">future<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plan</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"multisession"</span>)</span>
<span id="cb3-3"></span>
<span id="cb3-4">lcpri <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_stac_data</span>(</span>
<span id="cb3-5">  aoi,</span>
<span id="cb3-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">start_date =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-01-01"</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Making sure we only grab data from 2021</span></span>
<span id="cb3-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_date =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-12-31"</span>,</span>
<span id="cb3-8">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">asset_names =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"lcpri"</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># The name of the primary land cover product in LCMAP</span></span>
<span id="cb3-9">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stac_source =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"https://planetarycomputer.microsoft.com/api/stac/v1/"</span>,</span>
<span id="cb3-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">collection =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"usgs-lcmap-conus-v13"</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># The name of LCMAP on Planetary Computer</span></span>
<span id="cb3-11">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb3-12">)</span>
<span id="cb3-13"></span>
<span id="cb3-14">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(lcpri))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-01-10-rsi-cran/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>If our data is split across multiple tiles, <code>get_stac_data()</code> will automatically merge them into a single composite output. There’s <a href="https://permian-global-research.github.io/rsi/articles/Downloading-data-from-STAC-APIs-using-rsi.html">an article on the rsi website</a> that gives a brief overview of the STAC family of standards, and how you can use the arguments to <code>get_stac_data()</code> to flexibly control exactly what you’re downloading and how.</p>
<p>In addition to <code>get_stac_data()</code>, rsi also provides a number of higher-level functions for interacting with popular satellite imagery data sources – specifically, Landsat, Sentinel-2, and Sentinel-1 (including the Sentinel-1 RTC product). Beyond downloading and merging tiles, these functions will also handle creating composite rasters from multiple separate images, masking out clouds and other low-quality data, and (where possible) automatically rescaling your data using offsets and scaling factors defined in the STAC metadata:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">landsat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_landsat_imagery</span>(</span>
<span id="cb4-2">  aoi,</span>
<span id="cb4-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">start_date =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-06-01"</span>,</span>
<span id="cb4-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_date =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2021-08-31"</span>,</span>
<span id="cb4-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb4-6">)</span>
<span id="cb4-7">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-01-10-rsi-cran/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Note that the inputs to these functions are normal R objects, and the outputs are the file paths your data was saved to. A key difference between rsi and other packages for downloading data from STAC endpoints is that rsi doesn’t introduce a new data model,<sup>4</sup> and instead is focused on getting data onto your local filesystem as quickly as possible. Ideally, you’ll be able to let rsi handle all the “cloud-native geospatial” stuff for you, and then use the outputs with whatever tools you’re most comfortable with.<sup>5</sup></p>
<p>For more information on the <code>get_stac_data()</code> family of functions, check out the <a href="https://permian-global-research.github.io/rsi/articles/Downloading-data-from-STAC-APIs-using-rsi.html">corresponding article on the rsi website</a>!</p>
</section>
<section id="calculate-spectral-indices" class="level2">
<h2 class="anchored" data-anchor-id="calculate-spectral-indices">Calculate spectral indices</h2>
<p>Most other functions in rsi also use the approach of accepting normal R objects as function inputs, and returning file paths to raster outputs. For instance, rsi provides a function, <code>calculate_indices()</code>, which can be used to calculate spectral indices from the band values of an input raster.</p>
<p>By default, this function is designed to work with spectral indices from the (awesome) <a href="https://github.com/awesome-spectral-indices/awesome-spectral-indices">Awesome Spectral Indices</a> project. In fact, the <code>spectral_indices()</code> function in rsi will give you a processed tibble containing the full list of indices from that project:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spectral_indices</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 231 × 9
   application_domain bands     contributor   date_of_addition formula long_name
   &lt;chr&gt;              &lt;list&gt;    &lt;chr&gt;         &lt;chr&gt;            &lt;chr&gt;   &lt;chr&gt;    
 1 vegetation         &lt;chr [2]&gt; https://gith… 2021-11-17       (N - 0… Aerosol …
 2 vegetation         &lt;chr [2]&gt; https://gith… 2021-11-17       (N - 0… Aerosol …
 3 water              &lt;chr [6]&gt; https://gith… 2022-09-22       (B + G… Augmente…
 4 vegetation         &lt;chr [2]&gt; https://gith… 2021-09-20       (1 / G… Anthocya…
 5 vegetation         &lt;chr [3]&gt; https://gith… 2022-04-08       N * ((… Anthocya…
 6 vegetation         &lt;chr [4]&gt; https://gith… 2021-05-11       (N - (… Atmosphe…
 7 vegetation         &lt;chr [4]&gt; https://gith… 2021-05-14       sla * … Adjusted…
 8 vegetation         &lt;chr [2]&gt; https://gith… 2022-04-08       (N * (… Advanced…
 9 water              &lt;chr [4]&gt; https://gith… 2021-09-18       4.0 * … Automate…
10 water              &lt;chr [5]&gt; https://gith… 2021-09-18       B + 2.… Automate…
# ℹ 221 more rows
# ℹ 3 more variables: platforms &lt;list&gt;, reference &lt;chr&gt;, short_name &lt;chr&gt;</code></pre>
</div>
</div>
<p>The first time you run <code>spectral_indices()</code>, it will attempt to download the current set of spectral indices from GitHub and store that download in a cache file. Future calls to <code>spectral_indices()</code> will then be a bit faster, as rsi will only try to update its downloaded indices if the cache is more than a day old.</p>
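<p>To make the “more than a day old” rule concrete, here is a minimal sketch of that kind of staleness check in base R. This is not rsi’s actual implementation; <code>cache_is_stale()</code> and the temporary cache file are hypothetical names used only for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper (not part of rsi): should a cached file be refreshed?
cache_is_stale &lt;- function(path, max_age_days = 1) {
  # No cache file yet means we always need to download
  if (!file.exists(path)) {
    return(TRUE)
  }
  # Age of the file, in days, based on its last-modified time
  age_days &lt;- as.numeric(
    difftime(Sys.time(), file.mtime(path), units = "days")
  )
  age_days &gt; max_age_days
}

cache_file &lt;- tempfile(fileext = ".rds")

# Before anything has been cached, the check asks for a download
missing_is_stale &lt;- cache_is_stale(cache_file)

# After writing the cache, the check says to reuse it
saveRDS(data.frame(short_name = "NDVI"), cache_file)
fresh_is_stale &lt;- cache_is_stale(cache_file)</code></pre></div>
</div>
<p>A check based on the file’s modification time means the cache needs no extra bookkeeping: the file itself records when it was last refreshed.</p>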
<p>There are also two functions, <code>filter_bands()</code> and <code>filter_platforms()</code>, which make it easy to filter the tibble of indices based on what bands or platforms you have available in your data. For instance, if we wanted to get the full set of indices that we could calculate with our downloaded Landsat data, we could pass the names of those bands to the second argument in <code>filter_bands()</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spectral_indices</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb7-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter_bands</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat)))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 128 × 9
   application_domain bands     contributor   date_of_addition formula long_name
   &lt;chr&gt;              &lt;list&gt;    &lt;chr&gt;         &lt;chr&gt;            &lt;chr&gt;   &lt;chr&gt;    
 1 vegetation         &lt;chr [2]&gt; https://gith… 2021-11-17       (N - 0… Aerosol …
 2 vegetation         &lt;chr [2]&gt; https://gith… 2021-11-17       (N - 0… Aerosol …
 3 water              &lt;chr [6]&gt; https://gith… 2022-09-22       (B + G… Augmente…
 4 vegetation         &lt;chr [2]&gt; https://gith… 2022-04-08       (N * (… Advanced…
 5 water              &lt;chr [4]&gt; https://gith… 2021-09-18       4.0 * … Automate…
 6 water              &lt;chr [5]&gt; https://gith… 2021-09-18       B + 2.… Automate…
 7 burn               &lt;chr [2]&gt; https://gith… 2021-04-07       1.0 / … Burned A…
 8 burn               &lt;chr [2]&gt; https://gith… 2022-04-20       1.0/((… Burned A…
 9 vegetation         &lt;chr [3]&gt; https://gith… 2022-01-17       B / (R… Blue Chr…
10 soil               &lt;chr [4]&gt; https://gith… 2022-04-08       ((S1 +… Bare Soi…
# ℹ 118 more rows
# ℹ 3 more variables: platforms &lt;list&gt;, reference &lt;chr&gt;, short_name &lt;chr&gt;</code></pre>
</div>
</div>
<p>And then we can pass this tibble directly to <code>calculate_indices()</code> to calculate all 128 of these indices against our downloaded Landsat image:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">indices <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb9-2">  landsat,</span>
<span id="cb9-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter_bands</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spectral_indices</span>(), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(landsat))),</span>
<span id="cb9-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb9-5">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
|---------|---------|---------|---------|
=========================================
                                          </code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(indices))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-01-10-rsi-cran/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
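<p>Conceptually, the band filtering used earlier keeps only the indices whose required bands are all present in your data. The sketch below reproduces that idea on a toy table in base R; it is an illustration of the logic rather than rsi’s own code, and <code>toy_indices</code>, <code>available</code>, and <code>usable</code> are made-up names:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># A toy stand-in for the tibble returned by spectral_indices()
toy_indices &lt;- data.frame(
  short_name = c("NDVI", "NDWI", "NDREI"),
  formula = c(
    "(N - R) / (N + R)",
    "(G - N) / (G + N)",
    "(N - RE1) / (N + RE1)"
  )
)
# Each index lists the bands its formula needs
toy_indices$bands &lt;- list(c("N", "R"), c("G", "N"), c("N", "RE1"))

# Bands we pretend are available in our raster (no red edge band)
available &lt;- c("B", "G", "R", "N", "S1", "S2")

# Keep an index only if every band it needs is available
keep &lt;- vapply(toy_indices$bands, \(b) all(b %in% available), logical(1))
usable &lt;- toy_indices[keep, ]

# Skim what we are about to calculate
usable[c("short_name", "formula")]</code></pre></div>
</div>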
<p>You should always skim the formulas you’re going to run before calling <code>calculate_indices()</code> – these are downloaded from a live GitHub URL and can change over time – but the actual execution of downloaded code happens in a sandboxed environment, which should make it <em>harder</em> for any untrustworthy code to damage your system. If you’re interested, <a href="https://www.mm218.dev/posts/2023-10-27-minimal-environments/">I wrote a post a few months ago with a bit more information about this sandboxing</a>.</p>
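<p>As a rough illustration of what evaluating a formula in a restricted environment can look like, here is a sketch in base R. It assumes a formula string like the ones in the indices table and is not the exact mechanism rsi uses; the <code>sandbox</code> environment and the band values are invented for the example:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># A formula as it might appear in the indices table
formula &lt;- "(N - R) / (N + R)"

# An environment with no parent: nothing from base R is visible by default
sandbox &lt;- new.env(parent = emptyenv())

# Explicitly allow only the arithmetic this formula needs
for (op in c("+", "-", "*", "/", "(")) {
  assign(op, get(op, envir = baseenv()), envir = sandbox)
}

# Made-up band values standing in for raster layers
sandbox$N &lt;- c(0.5, 0.6)
sandbox$R &lt;- c(0.1, 0.2)

ndvi &lt;- eval(parse(text = formula), envir = sandbox)

# Anything not on the allow list cannot be found, so this call errors
blocked &lt;- tryCatch(
  eval(parse(text = "system('echo hi')"), envir = sandbox),
  error = function(e) "blocked"
)</code></pre></div>
</div>
<p>Because the environment’s parent is <code>emptyenv()</code>, a formula can only call functions that were deliberately copied in. This narrows what downloaded code can do, though it is not a substitute for reading the formulas first.</p>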
</section>
<section id="glue-it-all-together" class="level2">
<h2 class="anchored" data-anchor-id="glue-it-all-together">Glue it all together</h2>
<p>And last, but certainly not least, rsi provides a function called <code>stack_rasters()</code> which helps you bind multiple rasters into a single predictor brick. Similar to <code>calculate_indices()</code>, the first argument is a (vector of) file paths to the rasters you want to stack together, and the output is a new file path to the raster that was created:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stack_rasters</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(lcpri, indices), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb12-2">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb12-3">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in CPL_gdalwarp(source, destination, options, oo, doo, config_options,
: GDAL Error 6: /tmp/RtmpvJTqGP/file302c19635cbe42.tif, band 1: SetColorTable()
not supported for multi-sample TIFF files.</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in CPL_gdalwarp(source, destination, options, oo, doo, config_options,
: GDAL Message 1: /tmp/RtmpvJTqGP/file302c19635cbe42.tif, band 2: Setting
nodata to nan on band 2, but band 1 has nodata at 0. The TIFFTAG_GDAL_NODATA
only support one value per dataset. This value of nan will be used for all
bands on re-opening</code></pre>
</div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-01-10-rsi-cran/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>If your output file ends in <code>vrt</code>, you can even do this without copying any data thanks to <a href="https://gdal.org/drivers/raster/vrt.html">GDAL’s virtual raster format</a>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stack_rasters</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(lcpri, indices), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".vrt"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb15-2">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb15-3">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in x@cpp$sampleRegularRaster(size): GDAL Message 6: Resampling method
not supported on paletted band. Falling back to nearest neighbour</code></pre>
</div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2024-01-10-rsi-cran/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This feels like such a small thing when you say it out loud, but I think it’s my favorite part of the package. This was a surprisingly hard thing to do with existing tools. If your rasters have different extents or resolutions, <code>stack_rasters()</code> will automatically set the extent of your output and your target resolution based on the first raster in your input vector – or you can override that default behavior using the <code>extent</code> and <code>resolution</code> arguments.</p>
<p>Crucially, <code>stack_rasters()</code> relies almost entirely on GDAL’s warper and VRT format to do this, meaning it’s able to efficiently stack together <em>much</em> larger-than-memory data sets. And, particularly when using VRT outputs, this means that <code>stack_rasters()</code> can be a lot faster than approaches that involve reading the raster into R and then writing it back out.</p>
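<p>For a sense of how overriding the defaults might look, here is a small sketch that stacks two toy rasters with different resolutions. It assumes that <code>resolution</code> accepts a pair of x and y cell sizes in the units of the data; the rasters, file names, and values are all made up for illustration, so check the function’s help page before relying on the details:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(rsi)

# Two toy rasters covering the same area at different resolutions
fine &lt;- terra::rast(
  nrows = 10, ncols = 10,
  xmin = 500000, xmax = 500100, ymin = 4000000, ymax = 4000100,
  crs = "EPSG:32618"
)
terra::values(fine) &lt;- 1:100
names(fine) &lt;- "fine"

coarse &lt;- terra::rast(
  nrows = 5, ncols = 5,
  xmin = 500000, xmax = 500100, ymin = 4000000, ymax = 4000100,
  crs = "EPSG:32618"
)
terra::values(coarse) &lt;- 1:25
names(coarse) &lt;- "coarse"

# stack_rasters() works on file paths, so write both rasters to disk
fine_file &lt;- tempfile(fileext = ".tif")
coarse_file &lt;- tempfile(fileext = ".tif")
terra::writeRaster(fine, fine_file)
terra::writeRaster(coarse, coarse_file)

# By default the first raster sets the output grid; here we ask for
# 20 x 20 cells instead (assumed order: x resolution, y resolution)
stacked &lt;- stack_rasters(
  c(fine_file, coarse_file),
  tempfile(fileext = ".vrt"),
  resolution = c(20, 20)
)

terra::rast(stacked)</code></pre></div>
</div>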
</section>
<section id="acknowledgments" class="level2">
<h2 class="anchored" data-anchor-id="acknowledgments">Acknowledgments</h2>
<p>And that’s rsi version 0.1.0! Hopefully this package can be as useful for others as it’s already been for me at simplifying my data pre-processing workflows. A huge, huge, <em>huge</em> thank you to the folks who have been involved in testing and improving the alpha release, and helping me reshape it into this first release: <a href="https://github.com/agronomofiorentini">@agronomofiorentini</a>, <a href="https://github.com/h-a-graham">@h-a-graham</a>, and <a href="https://github.com/mateuszrydzik">@mateuszrydzik</a>. It would be a worse package without you!</p>


</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>(and mask, and rescale, and composite)↩︎</p></li>
<li id="fn2"><p>Updating the first post <a href="https://www.mm218.dev/posts/2023-10-26-rsi/">introducing the first alpha development version</a>.↩︎</p></li>
<li id="fn3"><p>And most involved; <a href="https://github.com/AlDanial/cloc">cloc</a> tells me there are 1300 lines of code in the R directory of this package, with 718 of them in <code>get_stac_data.R</code>.↩︎</p></li>
<li id="fn4"><p>Kinda; the band mapping objects are – as discussed in the article linked above – a relatively complex data structure. But <em>hopefully</em> users mostly don’t need to think about those.↩︎</p></li>
<li id="fn5"><p><a href="https://ourenvironment.berkeley.edu/people/carl-boettiger">Carl Boettiger</a> at AGU this year: “Cloud native just means using HTTP range requests. We’ve been doing cloud native since the 90s”.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <category>Spatial</category>
  <category>geospatial data</category>
  <category>R packages</category>
  <guid>https://mm218.dev/posts/2024-01-10-rsi-cran/</guid>
  <pubDate>Wed, 10 Jan 2024 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2024-01-10-rsi-cran/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Why is View() capitalized, anyway?</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-12-07-View/</link>
  <description><![CDATA[ 





<p><a href="https://bsky.app/profile/davidjohnbaker.bsky.social/post/3kfxbci3ji22h">Over on BlueSky, David John Baker asks</a>:</p>
<p><img src="https://mm218.dev/posts/2023-12-07-View/post.png" class="img-fluid" alt="Post from David John Baker on BlueSky: Why is does the  `View()` function in R start with a capital V? I have no idea why. Does this have to do with functions that are not just dealing with some sort of standard output? #rstats"></p>
<p>My first thought on reading this was that <code>View()</code> in RStudio is built to mask the <code>View()</code> function from utils, and so capitalizes the V because utils capitalized the V. Things are the way they are because they are the way they are, yet again.</p>
<p>But of course, that’s not the actual question – not only did David specify he’s talking about the function in <em>R</em>, not in <em>RStudio</em>, but of course someone decided that utils should capitalize the <code>View()</code> function as well. And this is weird! There are not that many functions in base R that start with a capital letter:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">getOption</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"defaultPackages"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">setdiff</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"datasets"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply</span>(\(x) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">paste0</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"package:"</span>, x), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">character</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ls</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">strsplit</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Filter</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">f =</span> \(x) x[[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> LETTERS) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply</span>(paste0, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">character</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">collapse =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code> [1] "Rprof"               "Rprofmem"            "RShowDoc"           
 [4] "RSiteSearch"         "Rtangle"             "RtangleFinish"      
 [7] "RtangleRuncode"      "RtangleSetup"        "RtangleWritedoc"    
[10] "RweaveChunkPrefix"   "RweaveEvalWithOpt"   "RweaveLatex"        
[13] "RweaveLatexFinish"   "RweaveLatexOptions"  "RweaveLatexSetup"   
[16] "RweaveLatexWritedoc" "RweaveTryStop"       "Stangle"            
[19] "Sweave"              "SweaveHooks"         "SweaveSyntaxLatex"  
[22] "SweaveSyntaxNoweb"   "SweaveSyntConv"      "URLdecode"          
[25] "URLencode"           "View"               </code></pre>
</div>
</div>
<p>These functions all follow one of three patterns:</p>
<ul>
<li>Functions that start with <code>R</code> or <code>S</code>, capitalized because it’s the name of a programming language;</li>
<li>Functions that start with URL;</li>
<li><code>View()</code> <em>by itself</em>!</li>
</ul>
<p><strong>Update:</strong></p>
<p>A few minutes after publishing, <a href="https://fosstodon.org/@klmr@mastodon.social/111540097245999105">Konrad Rudolph points out on Mastodon</a> that there are plenty of functions in the base package <em>itself</em> that are capitalized and don’t match these rules:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ls</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"package:base"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">strsplit</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Filter</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">f =</span> \(x) x[[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%in%</span> LETTERS) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply</span>(paste0, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">character</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">collapse =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code> [1] "Arg"                     "Conj"                   
 [3] "Cstack_info"             "Encoding"               
 [5] "Encoding&lt;-"              "F"                      
 [7] "Filter"                  "Find"                   
 [9] "I"                       "Im"                     
[11] "ISOdate"                 "ISOdatetime"            
[13] "La_library"              "La_version"             
[15] "La.svd"                  "LETTERS"                
[17] "Map"                     "Math.data.frame"        
[19] "Math.Date"               "Math.difftime"          
[21] "Math.factor"             "Math.POSIXt"            
[23] "Mod"                     "NCOL"                   
[25] "Negate"                  "NextMethod"             
[27] "NROW"                    "OlsonNames"             
[29] "Ops.data.frame"          "Ops.Date"               
[31] "Ops.difftime"            "Ops.factor"             
[33] "Ops.numeric_version"     "Ops.ordered"            
[35] "Ops.POSIXt"              "Position"               
[37] "R_compiled_by"           "R_system_version"       
[39] "R.home"                  "R.version"              
[41] "R.Version"               "R.version.string"       
[43] "Re"                      "Recall"                 
[45] "Reduce"                  "RNGkind"                
[47] "RNGversion"              "Summary.data.frame"     
[49] "Summary.Date"            "Summary.difftime"       
[51] "Summary.factor"          "Summary.numeric_version"
[53] "Summary.ordered"         "Summary.POSIXct"        
[55] "Summary.POSIXlt"         "Sys.chmod"              
[57] "Sys.Date"                "Sys.getenv"             
[59] "Sys.getlocale"           "Sys.getpid"             
[61] "Sys.glob"                "Sys.info"               
[63] "Sys.localeconv"          "Sys.readlink"           
[65] "Sys.setenv"              "Sys.setFileTime"        
[67] "Sys.setLanguage"         "Sys.setlocale"          
[69] "Sys.sleep"               "Sys.time"               
[71] "Sys.timezone"            "Sys.umask"              
[73] "Sys.unsetenv"            "Sys.which"              
[75] "T"                       "UseMethod"              
[77] "Vectorize"              </code></pre>
</div>
</div>
<p>One of those is <code>Filter()</code>, which I <em>literally used in the original code chunk</em> and somehow didn’t notice started with a capital letter.</p>
<p>These functions fall into a number of other groups – functions for working with complex numbers, for functional programming and recursion, for setting and getting system information and variables, group generic functions, and a few proper nouns and acronyms like “C” and “RNG”. It’s still not <em>common</em> – 77 out of 1268 objects in the base namespace start with a capital letter – but it’s more common than in recommended packages.</p>
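<p>If you’d like to check that proportion on your own installation, a short sketch along these lines should work. The exact counts depend on your R version, so expect numbers that differ a little from the ones quoted here; the variable names are just for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Every object exported by the base package
base_objects &lt;- ls("package:base")

# TRUE where the first character is an uppercase ASCII letter
starts_upper &lt;- substr(base_objects, 1, 1) %in% LETTERS

n_capitalized &lt;- sum(starts_upper)
n_total &lt;- length(base_objects)

# Share of base objects starting with a capital letter, as a percentage
round(100 * n_capitalized / n_total, 1)</code></pre></div>
</div>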
<p>&lt; / update &gt;</p>
<p>But even with that said, <code>View()</code> is the only function <em>in recommended packages</em> that starts with a capital letter that’s not named after a proper noun.<sup>1</sup> And it’s been weird for a long time. The original version of the function was also capitalized when it was added by Professor Brian D Ripley way back in 2007:</p>
<p><img src="https://mm218.dev/posts/2023-12-07-View/commit.png" class="img-fluid" alt="Commit by Prof. Brian D Ripley on Feb 19 2007, adding the capitalized `View()` function."></p>
<p>Now here’s where we enter the world of wild speculation. Because if we look at the documentation for <code>View()</code>, we can see that this function is meant to “Invoke a Data Viewer”, with “Data Viewer” capitalized like a proper noun in both the function title and description:</p>
<p><img src="https://mm218.dev/posts/2023-12-07-View/ViewDoc.png" class="img-fluid" alt="The rd documentation file for `View()`. Notably, the Description field capitalizes the phrase Data Viewer."></p>
<p>This isn’t a universal style. If we look for instance at the documentation for <code>edit()</code>, which pre-dates <code>View()</code>,<sup>2</sup> we can see that “text editor” is capitalized in the title but not in the description:</p>
<p><img src="https://mm218.dev/posts/2023-12-07-View/EditDoc.png" class="img-fluid" alt="The rd documentation file for `edit()`. Notably, the Description field _does not_ capitalize the phrase text editor."></p>
<p>So this makes me suspect that Data Viewer here is being used as a proper noun. This wouldn’t be a unique usage; for instance, the <a href="https://docs.posit.co/ide/user/ide/guide/data/data-viewer.html">RStudio user guide also describes its viewer as a Data Viewer, capitalized</a>, and a search for “Data Viewer” suggests the phrase is capitalized something like 2/3 of the time on public websites. I have absolutely no knowledge of the tech jargon of 2007, or of the proper styling of “data viewer”,<sup>3</sup> but it does seem like it’s <em>sometimes</em> a proper noun.</p>
<p>So, to wildly speculate just a little further: is <code>View()</code> capitalized because “Viewer” was capitalized, at least back in 2007?</p>
<p>Now, here’s a decent piece of evidence <em>against</em> this conjecture: the details section of the <code>View()</code> documentation, added at the same time as the title and description,<sup>4</sup> doesn’t capitalize the phrase “data viewer”:</p>
<p><img src="https://mm218.dev/posts/2023-12-07-View/ViewDetails.png" class="img-fluid" alt="The details section of the `View()` documentation. Data viewer is not capitalized."></p>
<p>That said, this is also not unique. That same <a href="https://docs.posit.co/ide/user/ide/guide/data/data-viewer.html#scrolling">RStudio user guide also sometimes writes “data viewer” in lowercase</a>, though in <em>most</em> of the guide it’s treated as a proper noun.</p>
<p>So there’s one guess: maybe <code>View()</code> is capitalized because, like the functions starting with R and S and URL, it’s named after a proper noun. Things are the way they are because they are the way they are, yet again.</p>
<p>If anyone knows more, though – or feels like asking Professor Ripley – please drop me a line. I’d love to know the actual answer.</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p><a href="https://www.merriam-webster.com/dictionary/URL">Merriam-Webster calls “URL” a noun</a>, even if it originated as an acronym; get out of my comments.</p></li>
<li id="fn2"><p>Though I’m not sure who originally wrote it; I get a bit lost in the VCS logs from before base functions were split into sub-packages.</p></li>
<li id="fn3"><p>And I assume these two things are related!</p></li>
<li id="fn4"><p>And the function itself; all of this documentation is also still in the official help file.</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <guid>https://mm218.dev/posts/2023-12-07-View/</guid>
  <pubDate>Thu, 07 Dec 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-12-07-View/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Helping R find the right methods for unserialized complex objects.</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-11-27-objects-loading-namespaces/</link>
  <description><![CDATA[ 





<p>So here’s a problem you may have encountered. Say you’ve serialized some complicated R object using a function like <code>saveRDS()</code>, which writes a single object out to a binary file. For instance, we can take the <code>boston_canopy</code> sf object from spatialsample, which looks like this:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">spatialsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>boston_canopy <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb1-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Simple feature collection with 6 features and 18 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 749383.6 ymin: 2913059 xmax: 801174.4 ymax: 2965741
Projected CRS: NAD83 / Massachusetts Mainland (ftUS)
# A tibble: 6 × 19
  grid_id land_area canopy_gain canopy_loss canopy_no_change canopy_area_2014
  &lt;chr&gt;       &lt;dbl&gt;       &lt;dbl&gt;       &lt;dbl&gt;            &lt;dbl&gt;            &lt;dbl&gt;
1 AB-4      795045.      15323.       3126.           53676.           56802.
2 I-33      265813.       8849.      11795.           78677.           90472.
3 AO-9      270153        6187.       1184.           26930.           28114.
4 H-10     2691490.      73098.      80362.          345823.          426185.
5 V-7       107890.        219.       3612.             240.            3852.
6 Q-22     2648089.     122211.     154236.         1026632.         1180868.
# ℹ 13 more variables: canopy_area_2019 &lt;dbl&gt;, change_canopy_area &lt;dbl&gt;,
#   change_canopy_percentage &lt;dbl&gt;, canopy_percentage_2014 &lt;dbl&gt;,
#   canopy_percentage_2019 &lt;dbl&gt;, change_canopy_absolute &lt;dbl&gt;,
#   mean_temp_morning &lt;dbl&gt;, mean_temp_evening &lt;dbl&gt;, mean_temp &lt;dbl&gt;,
#   mean_heat_index_morning &lt;dbl&gt;, mean_heat_index_evening &lt;dbl&gt;,
#   mean_heat_index &lt;dbl&gt;, geometry &lt;MULTIPOLYGON [US_survey_foot]&gt;</code></pre>
</div>
</div>
<p>And save it out as an RDS file:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">bos_rds <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".rds"</span>)</span>
<span id="cb3-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">saveRDS</span>(spatialsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>boston_canopy, bos_rds)</span></code></pre></div></div>
</div>
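<p>As a rough sketch of what actually ends up in a file like that, here’s a toy example in base R. The <code>toy_class</code> class is made up for illustration and stands in for a package-defined class like sf. The serialized object carries its class attribute along with its data, but none of the code for the methods that know what to do with that class:</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># A made-up S3 class, standing in for a package-defined class like sf
obj &lt;- structure(list(x = 1:3), class = "toy_class")

# Round-trip the object through an RDS file
toy_rds &lt;- tempfile(fileext = ".rds")
saveRDS(obj, toy_rds)
restored &lt;- readRDS(toy_rds)

# The class attribute comes back along with the data...
class(restored)
# ...and the restored object can be compared against the original
identical(obj, restored)</code></pre></div>
</div>
<p>Whether R can do anything useful with that class attribute depends entirely on which methods the current session has registered.</p>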
<p>When you unserialize that file, R won’t automatically be able to find all the methods associated with your complicated object, since reading the file doesn’t load the packages that define them. For instance, in a new session, our <code>boston_canopy</code> data doesn’t print nearly as nicely:<sup>1</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">console_output <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>()</span>
<span id="cb4-2">callr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">r</span>(</span>
<span id="cb4-3">  \(bos_rds) {</span>
<span id="cb4-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readRDS</span>(bos_rds) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb4-5">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb4-6">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>()</span>
<span id="cb4-7">    <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span></span>
<span id="cb4-8">  },</span>
<span id="cb4-9">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">args =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bos_rds =</span> bos_rds),</span>
<span id="cb4-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stdout =</span> console_output</span>
<span id="cb4-11">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb4-12">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">invisible</span>()</span>
<span id="cb4-13"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readLines</span>(console_output)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code> [1] "  grid_id land_area canopy_gain canopy_loss canopy_no_change canopy_area_2014"
 [2] "1    AB-4  795044.8    15323.45    3126.004         53676.05         56802.05"
 [3] "  canopy_area_2019 change_canopy_area change_canopy_percentage"
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
 [4] "1          68999.5           12197.44                  21.4736"
 [5] "  canopy_percentage_2014 canopy_percentage_2019 change_canopy_absolute"
 [6] "1                7.14451               8.678693               1.534183"
 [7] "  mean_temp_morning mean_temp_evening mean_temp mean_heat_index_morning"
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
 [8] "1          75.72113          86.04341  91.51711                76.97335"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
 [9] "  mean_heat_index_evening mean_heat_index"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[10] "1                90.91202         96.9585"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[11] "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      geometry"
[12] "1 781922.7, 781441.9, 780424.1, 780361.1, 780386.0, 780385.0, 780385.7, 780386.0, 780386.7, 780388.0, 780389.3, 780390.9, 780392.8, 780394.5, 780394.8, 780394.8, 780394.1, 780393.5, 780391.9, 780391.5, 780393.2, 780393.8, 780394.1, 780395.1, 780394.5, 780414.6, 780415.6, 780416.9, 780429.8, 780430.5, 780430.5, 780519.8, 780522.4, 780522.7, 780523.0, 780523.4, 780523.7, 780524.0, 780524.3, 780524.3, 780524.7, 780525.0, 780525.6, 780526.3, 780527.3, 780528.6, 780529.9, 780531.2, 780532.8, 780534.1, 780535.7, 780537.6, 780539.6, 780541.5, 780543.8, 780546.1, 780548.4, 780550.6, 780552.9, 780555.2, 780557.5, 780559.7, 780562.0, 780564.3, 780566.5, 780568.2, 780570.1, 780571.7, 780573.4, 780575.0, 780576.6, 780577.6, 780578.9, 780579.9, 780580.5, 780581.5, 780582.1, 780582.8, 780584.1, 780585.4, 780586.0, 780587.0, 780587.0, 780586.4, 780586.0, 780585.7, 780585.4, 780584.7, 780584.4, 780584.4, 780583.1, 780582.5, 780582.1, 780582.1, 780582.1, 780582.8, 780583.4, 780584.1, 780584.4, 780585.4, 780586.0, 780586.7, 780587.6, 780588.6, 780589.6, 780590.9, 780591.9, 780594.5, 780596.7, 780599.3, 780601.3, 780602.3, 780664.3, 780665.3, 780665.3, 780669.1, 780676.3, 780683.1, 780705.5, 780713.0, 780713.0, 780712.0, 780751.0, 780750.6, 780750.3, 780764.0, 780803.6, 780809.1, 780809.1, 780809.1, 780810.1, 780811.0, 780812.7, 780814.3, 780816.6, 780817.9, 780819.5, 780821.4, 780822.7, 780824.0, 780825.3, 780827.0, 780828.6, 780829.9, 780831.5, 780842.2, 780846.1, 780858.1, 780856.2, 780855.9, 780856.2, 780856.2, 780856.2, 780856.5, 780856.5, 780856.8, 780857.2, 780857.5, 780857.8, 780858.5, 780858.8, 780859.4, 780859.7, 780860.7, 780861.4, 780862.0, 780863.0, 780864.0, 780864.6, 780865.6, 780866.6, 780867.2, 780868.2, 780869.2, 780870.1, 780871.1, 780871.8, 780872.7, 780873.7, 780874.7, 780875.7, 780876.6, 780877.6, 780878.6, 780879.6, 780880.2, 780881.2, 780881.8, 780882.8, 780883.5, 780884.4, 780885.1, 780885.4, 780886.1, 780886.4, 780887.3, 780888.0, 780890.3, 780892.9, 
780894.5, 780895.5, 780944.5, 780945.1, 780980.5, 781028.3, 781076.6, 781123.7, 781155.6, 781169.2, 781167.6, 781163.7, 781166.3, 781165.6, 781186.7, 781186.7, 781210.8, 781211.1, 781306.2, 781340.6, 781341.9, 781373.4, 781399.7, 781431.2, 781449.7, 781463.4, 781483.8, 781483.8, 781485.8, 781488.7, 781490.6, 781491.9, 781492.9, 781493.9, 781494.2, 781495.8, 781496.2, 781495.8, 781496.2, 781496.2, 781497.1, 781497.8, 781498.8, 781499.7, 781500.4, 781500.7, 781501.4, 781501.7, 781502.0, 781502.7, 781503.0, 781504.0, 781505.3, 781506.2, 781507.2, 781509.8, 781510.8, 781512.1, 781512.7, 781513.4, 781514.0, 781514.3, 781514.7, 781514.7, 781515.3, 781515.6, 781516.3, 781516.6, 781517.3, 781517.9, 781519.2, 781520.5, 781521.5, 781522.1, 781523.1, 781523.8, 781524.7, 781525.7, 781526.0, 781526.4, 781527.0, 781527.3, 781529.3, 781534.2, 781539.0, 781543.2, 781544.2, 781545.2, 781545.8, 781546.5, 781547.1, 781547.8, 781548.8, 781549.7, 781551.0, 781552.3, 781553.6, 781554.9, 781555.9, 781557.2, 781557.5, 781558.2, 781558.8, 781559.2, 781559.5, 781560.1, 781560.1, 781560.8, 781564.0, 781566.0, 781567.9, 781570.5, 781571.8, 781572.5, 781572.8, 781573.4, 781574.1, 781574.7, 781575.4, 781576.0, 781576.4, 781577.0, 781577.7, 781579.9, 781582.9, 781585.5, 781588.1, 781591.0, 781593.2, 781595.5, 781597.8, 781598.8, 781599.7, 781602.3, 781608.2, 781614.0, 781617.3, 781620.5, 781626.4, 781629.3, 781632.2, 781636.1, 781639.4, 781642.3, 781644.6, 781646.5, 781648.4, 781650.4, 781651.4, 781652.3, 781653.3, 781654.3, 781656.6, 781658.8, 781664.7, 781670.5, 781677.7, 781682.5, 781684.8, 781687.1, 781689.4, 781690.7, 781691.6, 781692.6, 781693.6, 781695.2, 781696.8, 781698.5, 781700.4, 781702.3, 781703.3, 781704.0, 781704.9, 781705.6, 781706.6, 781708.2, 781710.5, 781712.7, 781714.7, 781715.3, 781716.3, 781717.6, 781718.3, 781718.9, 781719.2, 781719.2, 781717.9, 781717.3, 781717.0, 781716.6, 781716.6, 781716.3, 781716.6, 781717.0, 781717.3, 781718.9, 781719.2, 781719.2, 781719.6, 781720.2, 
781720.2, 781719.9, 781719.6, 781719.2, 781718.6, 781717.6, 781717.3, 781717.0, 781717.0, 781717.3, 781717.6, 781717.9, 781718.3, 781718.9, 781720.2, 781720.9, 781720.9, 781720.9, 781721.2, 781721.2, 781721.8, 781722.5, 781723.5, 781724.1, 781724.8, 781725.1, 781725.7, 781726.7, 781728.3, 781729.0, 781729.9, 781731.2, 781732.5, 781736.8, 781743.6, 781743.6, 781743.3, 781780.0, 781831.3, 781904.3, 781922.7, 2965534.6, 2964701.8, 2964701.8, 2964810.9, 2964811.0, 2964821.4, 2964824.0, 2964826.3, 2964828.6, 2964835.7, 2964847.4, 2964859.4, 2964869.8, 2964880.2, 2964885.4, 2964890.6, 2964894.5, 2964898.4, 2964906.5, 2964966.6, 2965059.1, 2965076.0, 2965081.2, 2965132.8, 2965190.0, 2965190.0, 2965144.8, 2965108.1, 2965107.2, 2965094.5, 2965079.6, 2965081.2, 2965079.2, 2965078.6, 2965077.3, 2965076.3, 2965075.0, 2965074.4, 2965073.4, 2965072.1, 2965070.5, 2965068.8, 2965066.9, 2965065.3, 2965063.6, 2965061.7, 2965060.4, 2965058.8, 2965056.8, 2965055.9, 2965054.6, 2965053.6, 2965052.3, 2965051.3, 2965050.3, 2965049.4, 2965049.0, 2965048.4, 2965048.1, 2965047.7, 2965047.7, 2965048.1, 2965048.1, 2965048.4, 2965049.0, 2965049.7, 2965050.0, 2965050.7, 2965051.3, 2965052.0, 2965053.3, 2965054.2, 2965055.5, 2965056.8, 2965058.1, 2965059.4, 2965061.0, 2965062.0, 2965066.9, 2965071.1, 2965075.7, 2965092.2, 2965108.8, 2965133.8, 2965146.1, 2965159.1, 2965164.6, 2965208.8, 2965222.7, 2965236.7, 2965246.8, 2965252.0, 2965256.8, 2965261.1, 2965265.0, 2965269.5, 2965273.1, 2965274.7, 2965276.0, 2965277.0, 2965278.3, 2965279.6, 2965280.5, 2965281.5, 2965282.5, 2965283.5, 2965284.4, 2965285.1, 2965285.7, 2965286.1, 2965286.1, 2965286.1, 2965287.4, 2965287.4, 2965279.6, 2965279.9, 2965279.6, 2965279.9, 2965281.2, 2965327.0, 2965385.4, 2965445.5, 2965445.5, 2965391.3, 2965325.4, 2965324.7, 2965321.8, 2965359.8, 2965363.7, 2965367.9, 2965373.4, 2965378.9, 2965384.5, 2965390.3, 2965395.2, 2965399.1, 2965402.3, 2965405.9, 2965406.5, 2965407.2, 2965408.2, 2965408.5, 2965409.1, 2965409.5, 
2965409.8, 2965410.1, 2965409.8, 2965408.8, 2965448.1, 2965449.7, 2965451.0, 2965451.3, 2965452.0, 2965453.0, 2965453.9, 2965454.3, 2965454.9, 2965455.6, 2965456.2, 2965456.9, 2965457.5, 2965458.2, 2965458.5, 2965459.1, 2965459.8, 2965460.1, 2965460.4, 2965460.8, 2965461.1, 2965461.4, 2965461.4, 2965461.4, 2965461.7, 2965461.7, 2965461.4, 2965461.4, 2965461.1, 2965461.1, 2965460.8, 2965460.4, 2965460.1, 2965459.5, 2965459.1, 2965458.5, 2965457.8, 2965456.9, 2965456.2, 2965455.9, 2965454.3, 2965452.3, 2965451.3, 2965450.0, 2965448.4, 2965447.1, 2965445.5, 2965436.4, 2965419.8, 2965392.9, 2965363.7, 2965340.0, 2965318.9, 2965318.5, 2965300.4, 2965301.7, 2965302.3, 2965302.6, 2965303.6, 2965304.6, 2965303.6, 2965376.0, 2965416.6, 2965441.6, 2965741.3, 2965738.1, 2965442.2, 2965442.9, 2965359.1, 2965359.1, 2965488.7, 2965493.2, 2965490.6, 2965487.4, 2965485.1, 2965484.5, 2965483.2, 2965481.9, 2965465.0, 2965463.4, 2965460.8, 2965456.5, 2965452.6, 2965449.1, 2965445.8, 2965443.5, 2965430.9, 2965426.3, 2965424.7, 2965423.1, 2965421.8, 2965419.5, 2965417.2, 2965415.3, 2965413.0, 2965412.7, 2965412.1, 2965411.4, 2965411.1, 2965410.8, 2965410.4, 2965410.4, 2965409.5, 2965409.1, 2965408.8, 2965408.8, 2965408.5, 2965408.5, 2965408.2, 2965407.8, 2965407.8, 2965407.5, 2965407.2, 2965407.2, 2965406.9, 2965406.9, 2965406.2, 2965405.9, 2965405.6, 2965405.2, 2965404.9, 2965404.6, 2965403.9, 2965403.6, 2965403.0, 2965402.0, 2965401.0, 2965399.7, 2965398.4, 2965397.4, 2965396.8, 2965396.1, 2965395.5, 2965393.9, 2965390.6, 2965388.3, 2965386.4, 2965386.1, 2965385.7, 2965385.7, 2965385.7, 2965385.7, 2965385.7, 2965385.4, 2965385.1, 2965384.5, 2965383.8, 2965383.2, 2965382.2, 2965381.2, 2965380.2, 2965379.9, 2965379.3, 2965378.6, 2965377.6, 2965376.3, 2965375.0, 2965373.1, 2965369.8, 2965358.5, 2965352.3, 2965346.1, 2965340.3, 2965338.3, 2965337.7, 2965337.0, 2965336.7, 2965336.1, 2965335.7, 2965335.1, 2965334.8, 2965334.8, 2965334.4, 2965334.1, 2965333.8, 2965334.1, 2965334.1, 
2965334.4, 2965334.4, 2965334.1, 2965333.5, 2965332.8, 2965332.5, 2965332.2, 2965331.8, 2965331.8, 2965331.8, 2965331.8, 2965331.5, 2965331.5, 2965331.5, 2965331.5, 2965330.9, 2965330.2, 2965329.3, 2965328.6, 2965327.6, 2965326.0, 2965325.0, 2965324.1, 2965323.4, 2965322.8, 2965322.4, 2965321.8, 2965321.1, 2965320.8, 2965320.5, 2965320.5, 2965321.1, 2965321.5, 2965322.1, 2965322.8, 2965323.1, 2965323.7, 2965324.1, 2965325.0, 2965326.7, 2965328.3, 2965330.2, 2965333.1, 2965335.4, 2965337.4, 2965339.3, 2965341.9, 2965344.8, 2965348.4, 2965353.6, 2965358.8, 2965364.0, 2965369.8, 2965374.4, 2965379.3, 2965384.8, 2965387.4, 2965390.3, 2965394.8, 2965401.3, 2965430.6, 2965435.1, 2965436.1, 2965437.1, 2965438.4, 2965439.7, 2965440.3, 2965441.3, 2965443.5, 2965459.1, 2965462.7, 2965466.3, 2965470.2, 2965473.7, 2965477.3, 2965481.2, 2965484.1, 2965487.7, 2965491.0, 2965494.2, 2965498.1, 2965501.7, 2965505.9, 2965509.8, 2965511.7, 2965513.0, 2965514.0, 2965515.0, 2965518.6, 2965520.8, 2965521.2, 2965521.8, 2965522.8, 2965523.1, 2965523.7, 2965524.7, 2965525.7, 2965526.0, 2965526.7, 2965527.0, 2965527.6, 2965529.3, 2965532.2, 2965533.8, 2965535.4, 2965536.7, 2965539.0, 2965546.8, 2965554.6, 2965566.9, 2965589.3, 2965578.3, 2965563.0, 2965540.3, 2965534.6"</code></pre>
</div>
</div>
<p>What on Earth is up with that geometry column? That’s unworkable.</p>
<p>Before we dig into this further, let’s write a quick wrapper function that will handle the <code>console_output</code> and <code>readLines()</code> dance we just did for the rest of this post:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1">run_isolated <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(..., <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">libpath =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">.libPaths</span>()) {</span>
<span id="cb6-2">  console_output <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>()</span>
<span id="cb6-3">  callr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">r</span>(</span>
<span id="cb6-4">    ...,</span>
<span id="cb6-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">libpath =</span> libpath,</span>
<span id="cb6-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">stdout =</span> console_output</span>
<span id="cb6-7">  )</span>
<span id="cb6-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readLines</span>(console_output)</span>
<span id="cb6-9">}</span></code></pre></div></div>
</div>
<p>I’ve also added a <code>libpath</code> argument, which will let me control what packages these new R sessions are able to access. By default, R sessions run via <code>run_isolated()</code> will have access to all the libraries installed on my machine:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">run_isolated</span>(</span>
<span id="cb7-2">  \() <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">requireNamespace</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sf"</span>))</span>
<span id="cb7-3">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "[1] TRUE"</code></pre>
</div>
</div>
<p>But because I’ve got all my packages installed into a user library (not the system library), I can use that new <code>libpath</code> argument to make it so these R sessions are <em>entirely</em> isolated, without access to any non-base packages at all:<sup>2</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">run_isolated</span>(</span>
<span id="cb9-2">  \() <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">requireNamespace</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sf"</span>)),</span>
<span id="cb9-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">libpath =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/dev/null"</span></span>
<span id="cb9-4">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "[1] FALSE"</code></pre>
</div>
</div>
<p>Anyway, back to the point. Complicated objects print ugly, and don’t have any of their other methods available, when you unserialize them in a new session.</p>
<p>The reason for this is pretty straightforward: the print (and other) methods live inside the namespace of the package that created the objects.<sup>3</sup> If that namespace isn’t loaded, R can’t find the methods when they’re needed, and unserializing the object doesn’t automatically load the relevant namespace. So the reason that <code>boston_canopy</code> printed nicely in our current session is that spatialsample sneakily loaded sf in the background when we accessed the <code>boston_canopy</code> object:<sup>4</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1">is_sf_loaded <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>() {</span>
<span id="cb11-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sessionInfo</span>()[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"otherPkgs"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"loadedOnly"</span>)] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb11-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(names) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb11-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">grepl</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">pattern =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sf"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb11-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">any</span>()</span>
<span id="cb11-6">}</span>
<span id="cb11-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is_sf_loaded</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
<p>But when we unserialize the object this doesn’t happen:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1">callr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">r</span>(</span>
<span id="cb13-2">  \(bos_rds, is_sf_loaded) {</span>
<span id="cb13-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readRDS</span>(bos_rds)</span>
<span id="cb13-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is_sf_loaded</span>()</span>
<span id="cb13-5">  },</span>
<span id="cb13-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">args =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bos_rds =</span> bos_rds, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">is_sf_loaded =</span> is_sf_loaded)</span>
<span id="cb13-7">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] FALSE</code></pre>
</div>
</div>
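<p>One caveat on that helper: <code>grepl()</code> does substring matching, so it should also return <code>TRUE</code> for any loaded package that merely has “sf” somewhere in its name. That’s fine for this post, but if you want an exact check, here’s a minimal sketch using base R’s <code>isNamespaceLoaded()</code> (the name <code>is_sf_loaded_exact</code> is made up for illustration):</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical stricter variant of is_sf_loaded(): TRUE only when a
# namespace named exactly "sf" is loaded in the current session
is_sf_loaded_exact &lt;- function() {
  isNamespaceLoaded("sf")
}

# Equivalent check against the full vector of loaded namespace names
"sf" %in% loadedNamespaces()</code></pre></div>
</div>
<p>Since it only relies on base R, this version can be handed to <code>callr::r()</code> through <code>args</code> in the same way as the original helper.</p>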
<p>Because <code>print.sf()</code> is defined in sf, and sf isn’t loaded, we fall back to the default, ugly print method.</p>
<p>This has broader-reaching impacts. For instance, a number of dplyr functions fail for seemingly nonsensical reasons if they’re used on an unserialized object:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1">callr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">r</span>(</span>
<span id="cb15-2">  \(bos_rds) {</span>
<span id="cb15-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readRDS</span>(bos_rds) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb15-4">      dplyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb15-5">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>()</span>
<span id="cb15-6">  },</span>
<span id="cb15-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">args =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bos_rds =</span> bos_rds)</span>
<span id="cb15-8">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "Error in vec_size() : \n  `x` must be a vector, not a &lt;sfc_MULTIPOLYGON/sfc&gt; object.\n"
attr(,"class")
[1] "try-error"
attr(,"condition")
&lt;error/vctrs_error_scalar_type&gt;
Error in `vec_size()`:
! `x` must be a vector, not a &lt;sfc_MULTIPOLYGON/sfc&gt; object.
---
Backtrace:
     ▆
  1. ├─base::tryCatch(...)
  2. │ └─base (local) tryCatchList(expr, classes, parentenv, handlers)
  3. │   ├─base (local) tryCatchOne(...)
  4. │   │ └─base (local) doTryCatch(return(expr), name, parentenv, handler)
  5. │   └─base (local) tryCatchList(expr, names[-nh], parentenv, handlers[-nh])
  6. │     └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
  7. │       └─base (local) doTryCatch(return(expr), name, parentenv, handler)
  8. ├─base::withCallingHandlers(...)
  9. ├─base::saveRDS(...)
 10. ├─base::do.call(...)
 11. ├─base (local) `&lt;fn&gt;`(...)
 12. ├─global `&lt;fn&gt;`(bos_rds = base::quote("/tmp/Rtmpb70291/file7571358e23a4.rds"))
 13. │ ├─base::try(dplyr::arrange(readRDS(bos_rds)))
 14. │ │ └─base::tryCatch(...)
 15. │ │   └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 16. │ │     └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 17. │ │       └─base (local) doTryCatch(return(expr), name, parentenv, handler)
 18. │ ├─dplyr::arrange(readRDS(bos_rds))
 19. │ └─dplyr:::arrange.data.frame(readRDS(bos_rds))
 20. │   ├─dplyr::dplyr_row_slice(.data, loc)
 21. │   └─dplyr:::dplyr_row_slice.data.frame(.data, loc)
 22. │     ├─dplyr::dplyr_reconstruct(vec_slice(data, i), data)
 23. │     │ └─dplyr:::dplyr_new_data_frame(data)
 24. │     │   ├─row.names %||% .row_names_info(x, type = 0L)
 25. │     │   └─base::.row_names_info(x, type = 0L)
 26. │     └─vctrs::vec_slice(data, i)
 27. └─vctrs:::stop_scalar_type(`&lt;fn&gt;`(`&lt;s_MULTIP&gt;`), "x", `&lt;fn&gt;`(vec_size()))
 28.   └─vctrs:::stop_vctrs(...)</code></pre>
</div>
</div>
<p>Just like with printing, the root cause here is that R can’t find the <code>arrange.sf()</code> method when sf isn’t loaded, and winds up using the basic data frame method instead (which then errors out).<sup>5</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb17-1">sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:::</span>arrange.sf</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>function (.data, ..., .dots) 
{
    sf_column_name = attr(.data, "sf_column")
    class(.data) = setdiff(class(.data), "sf")
    st_as_sf(NextMethod(), sf_column_name = sf_column_name)
}
&lt;bytecode: 0x558e252883b0&gt;
&lt;environment: namespace:sf&gt;</code></pre>
</div>
</div>
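<p>The fallback described above can be reproduced without sf at all. This is only a toy sketch: <code>describe()</code> and the <code>toy_sf</code> class are made-up stand-ins for <code>dplyr::arrange()</code> and sf, and defining the method by hand plays the role of loading the package that owns it.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># A toy generic with only a data frame method (hypothetical, stands in for arrange()).
describe &lt;- function(x) UseMethod("describe")
describe.data.frame &lt;- function(x) "data frame method"

# A toy class layered on top of data.frame (hypothetical, stands in for sf).
obj &lt;- structure(data.frame(x = 1:2), class = c("toy_sf", "data.frame"))

# No toy_sf method exists yet, so dispatch falls through to the data frame method.
before &lt;- describe(obj)

# Defining the method is the stand-in for loading the package that provides it.
describe.toy_sf &lt;- function(x) "toy_sf method"
after &lt;- describe(obj)

print(c(before = before, after = after))</code></pre></div>
</div>
<p>The object itself never changes between the two calls; the only difference is which methods R is able to find.</p>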
<p>The fix is pretty straightforward: load sf (or whatever package has the methods you’re looking for). That’ll let R find your methods, arrange your data, print all pretty, and do everything else you want:<sup>6</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb19-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">run_isolated</span>(</span>
<span id="cb19-2">  \(bos_rds) {</span>
<span id="cb19-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(sf)</span>
<span id="cb19-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readRDS</span>(bos_rds) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb19-5">      dplyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb19-6">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb19-7">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>()</span>
<span id="cb19-8">  },</span>
<span id="cb19-9">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">args =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bos_rds =</span> bos_rds)</span>
<span id="cb19-10">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code> [1] "Simple feature collection with 1 feature and 18 fields"                       
 [2] "Geometry type: MULTIPOLYGON"                                                  
 [3] "Dimension:     XY"                                                            
 [4] "Bounding box:  xmin: 780361.1 ymin: 2964702 xmax: 781922.7 ymax: 2965741"     
 [5] "Projected CRS: NAD83 / Massachusetts Mainland (ftUS)"                         
 [6] "# A tibble: 1 × 19"                                                           
 [7] "  grid_id land_area canopy_gain canopy_loss canopy_no_change canopy_area_2014"
 [8] "  &lt;chr&gt;       &lt;dbl&gt;       &lt;dbl&gt;       &lt;dbl&gt;            &lt;dbl&gt;            &lt;dbl&gt;"
 [9] "1 AB-4      795045.      15323.       3126.           53676.           56802."
[10] "# ℹ 13 more variables: canopy_area_2019 &lt;dbl&gt;, change_canopy_area &lt;dbl&gt;,"     
[11] "#   change_canopy_percentage &lt;dbl&gt;, canopy_percentage_2014 &lt;dbl&gt;,"            
[12] "#   canopy_percentage_2019 &lt;dbl&gt;, change_canopy_absolute &lt;dbl&gt;,"              
[13] "#   mean_temp_morning &lt;dbl&gt;, mean_temp_evening &lt;dbl&gt;, mean_temp &lt;dbl&gt;,"       
[14] "#   mean_heat_index_morning &lt;dbl&gt;, mean_heat_index_evening &lt;dbl&gt;,"            
[15] "#   mean_heat_index &lt;dbl&gt;, geometry &lt;MULTIPOLYGON [US_survey_foot]&gt;"          </code></pre>
</div>
</div>
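<p>As a small aside on loading versus attaching: the sketch below uses base R’s <code>requireNamespace()</code> in place of <code>library()</code>. It assumes nothing about whether sf is installed; if it isn’t, <code>requireNamespace()</code> simply returns <code>FALSE</code> and nothing gets loaded.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Is sf on the search path before we do anything?
attached_before &lt;- "package:sf" %in% search()

# requireNamespace() loads the namespace (if sf is installed) without attaching it.
sf_available &lt;- requireNamespace("sf", quietly = TRUE)

# Check the search path again after loading.
attached_after &lt;- "package:sf" %in% search()

# Loading registers sf's S3 methods, but leaves the search path alone.
print(c(
  loaded = isNamespaceLoaded("sf"),
  search_path_changed = attached_after != attached_before
))</code></pre></div>
</div>
<p>That is enough for method dispatch to work; you only need <code>library()</code> if you also want to call sf’s functions without the <code>sf::</code> prefix.</p>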
<p>This all might sound pretty familiar; I wrote a post <a href="https://www.mm218.dev/posts/2022-12-01-sf-in-packages/">more or less about this a year ago</a>. So why bring it up again?</p>
<p>Well, because I saw <a href="https://github.com/thomasp85/patchwork/commit/f7fbab5452c3545211724fee5f9303106ed9b257">a cool trick in patchwork</a>, which is now also <a href="https://github.com/r-spatial/sf/pull/2212">in the development version of sf</a>, that I like a lot better than the one I wrote about last year. It’s possible for these unserialized objects to load their relevant packages themselves, making all their methods available as soon as the objects exist in your R session. For this example to work on your machine, you’re going to need the development version (at time of writing) of sf:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb21" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb21-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">packageVersion</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sf"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"1.0-14"</span></span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
<p>Let’s create a new RDS file, this time using an object that was created by the development version of sf:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb23-1">nc_rds <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".rds"</span>)</span>
<span id="cb23-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.file</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"shape/nc.shp"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sf"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb23-3">  sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_read</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb23-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">saveRDS</span>(nc_rds)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Reading layer `nc' from data source 
  `/home/mikemahoney218/R/x86_64-pc-linux-gnu-library/4.3/sf/shape/nc.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 100 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
Geodetic CRS:  NAD27</code></pre>
</div>
</div>
<p>Just like before, let’s unserialize this RDS file and print it to our console:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb25" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb25-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">run_isolated</span>(</span>
<span id="cb25-2">  \(nc_rds) {</span>
<span id="cb25-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readRDS</span>(nc_rds) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb25-4">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb25-5">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>()</span>
<span id="cb25-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">invisible</span>(<span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span>)</span>
<span id="cb25-7">  },</span>
<span id="cb25-8">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">args =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nc_rds =</span> nc_rds)</span>
<span id="cb25-9">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "Simple feature collection with 1 feature and 14 fields"                        
[2] "Geometry type: MULTIPOLYGON"                                                   
[3] "Dimension:     XY"                                                             
[4] "Bounding box:  xmin: -81.74107 ymin: 36.23436 xmax: -81.23989 ymax: 36.58965"  
[5] "Geodetic CRS:  NAD27"                                                          
[6] "   AREA PERIMETER CNTY_ CNTY_ID NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74"
[7] "1 0.114     1.442  1825    1825 Ashe 37009  37009        5  1091     1      10"
[8] "  BIR79 SID79 NWBIR79                       geometry"                          
[9] "1  1364     0      19 MULTIPOLYGON (((-81.47276 3..."                          </code></pre>
</div>
</div>
<p>Somehow, magically, R found the right print method!</p>
<p>R can now also find the right methods for other functions, like <code>arrange()</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb27" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb27-1">callr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">r</span>(</span>
<span id="cb27-2">  \(nc_rds) {</span>
<span id="cb27-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readRDS</span>(nc_rds) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb27-4">      dplyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">arrange</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb27-5">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>()</span>
<span id="cb27-6">  },</span>
<span id="cb27-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">args =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nc_rds =</span> nc_rds)</span>
<span id="cb27-8">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb27-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Simple feature collection with 1 feature and 14 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -81.74107 ymin: 36.23436 xmax: -81.23989 ymax: 36.58965
Geodetic CRS:  NAD27
   AREA PERIMETER CNTY_ CNTY_ID NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74
1 0.114     1.442  1825    1825 Ashe 37009  37009        5  1091     1      10
  BIR79 SID79 NWBIR79                       geometry
1  1364     0      19 MULTIPOLYGON (((-81.47276 3...</code></pre>
</div>
</div>
<p>So what’s the trick? Well, under the hood these new sf objects are quietly loading sf when they get unserialized:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb29" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb29-1">callr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">r</span>(</span>
<span id="cb29-2">  \(nc_rds, is_sf_loaded) {</span>
<span id="cb29-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readRDS</span>(nc_rds)</span>
<span id="cb29-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is_sf_loaded</span>()</span>
<span id="cb29-5">  },</span>
<span id="cb29-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">args =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nc_rds =</span> nc_rds, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">is_sf_loaded =</span> is_sf_loaded)</span>
<span id="cb29-7">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
<p>The magic here is that sf objects now have an attribute, <code>.sf_namespace</code>, that’s a simple stub function defined in the sf namespace:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb31" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb31-1">nc_from_rds <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readRDS</span>(nc_rds)</span>
<span id="cb31-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">attr</span>(nc_from_rds, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".sf_namespace"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>function () 
NULL
&lt;bytecode: 0x558e2d2322d8&gt;
&lt;environment: namespace:sf&gt;</code></pre>
</div>
</div>
<p>That attribute – which takes up nearly no RAM or disk space – is enough to cause R to automatically load the sf namespace when these objects are unserialized. You now magically have access to all the methods for your complex objects right out of the box.</p>
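<p>You can try the same mechanism with base R alone. The sketch below is an illustration under a few assumptions, not sf’s actual code: it uses the <code>tools</code> package as a stand-in for sf (because it ships with R and usually isn’t loaded in a fresh <code>Rscript</code> session), the attribute name <code>.ns_stub</code> is made up, and it assumes <code>Rscript</code> lives in <code>R.home("bin")</code>.</p>
<div class="cell">
<div class="sourceCode cell-code"><pre class="sourceCode r"><code class="sourceCode r"># Attach a function that lives in the tools namespace as an attribute.
# (.ns_stub is a hypothetical name; tools stands in for sf.)
obj &lt;- structure(list(x = 1), .ns_stub = tools::toTitleCase)
rds &lt;- tempfile(fileext = ".rds")
saveRDS(obj, rds)

# A tiny script for a fresh R process: report whether tools is loaded
# before and after reading the RDS file.
script &lt;- tempfile(fileext = ".R")
writeLines(c(
  'args &lt;- commandArgs(trailingOnly = TRUE)',
  'cat(isNamespaceLoaded("tools"), "\\n")',
  'invisible(readRDS(args[[1]]))',
  'cat(isNamespaceLoaded("tools"), "\\n")'
), script)

# Run the script in a separate, clean R session and capture what it prints.
out &lt;- system2(
  file.path(R.home("bin"), "Rscript"),
  c("--vanilla", shQuote(script), shQuote(rds)),
  stdout = TRUE
)
print(out)</code></pre></div>
</div>
<p>If the mechanism works the way it’s described above, the second value should be <code>TRUE</code> even though the script never mentions <code>tools</code> outside of the check itself.</p>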
<p>What if sf isn’t installed? Then it doesn’t get loaded. But that means objects fall back to their default methods, which isn’t <em>great</em> but seems <em>fine</em>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb33" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb33-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">run_isolated</span>(</span>
<span id="cb33-2">  \(nc_rds) {</span>
<span id="cb33-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readRDS</span>(nc_rds) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb33-4">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb33-5">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">print</span>()</span>
<span id="cb33-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">invisible</span>(<span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span>)</span>
<span id="cb33-7">  },</span>
<span id="cb33-8">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">args =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nc_rds =</span> nc_rds),</span>
<span id="cb33-9">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">libpath =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/dev/null"</span></span>
<span id="cb33-10">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "   AREA PERIMETER CNTY_ CNTY_ID NAME  FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
[2] "1 0.114     1.442  1825    1825 Ashe 37009  37009        5  1091     1      10"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
[3] "  BIR79 SID79 NWBIR79"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
[4] "1  1364     0      19"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
[5] "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               geometry"
[6] "1 -81.47276, -81.54084, -81.56198, -81.63306, -81.74107, -81.69828, -81.70280, -81.67000, -81.34530, -81.34754, -81.32478, -81.31332, -81.26624, -81.26284, -81.24069, -81.23989, -81.26424, -81.32899, -81.36137, -81.36569, -81.35413, -81.36745, -81.40639, -81.41233, -81.43104, -81.45289, -81.47276, 36.23436, 36.27251, 36.27359, 36.34069, 36.39178, 36.47178, 36.51934, 36.58965, 36.57286, 36.53791, 36.51368, 36.48070, 36.43721, 36.40504, 36.37942, 36.36536, 36.35241, 36.36350, 36.35316, 36.33905, 36.29972, 36.27870, 36.28505, 36.26729, 36.26072, 36.23959, 36.23436"</code></pre>
</div>
</div>
<p>I think this is really cool! It feels like a user-friendly way to make unserialized complex objects work like you’d expect them to, and prevents confusing error chains like the ones above.</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>Note that we need to use the <code>stdout</code> argument to capture how this object actually prints to R’s console, and to then use <code>readLines()</code> to print the contents of that file into our current session.↩︎</p></li>
<li id="fn2"><p>Thanks to <a href="https://fosstodon.org/@gaborcsardi/111472185756408510">Gabor on Mastodon</a> for teaching me that R will always have access to the system library.↩︎</p></li>
<li id="fn3"><p>I mean, mostly. Some packages like <a href="https://broom.tidymodels.org">broom</a> provide a bunch of methods for objects that don’t come from that package. But those packages aren’t automatically loaded on unserialization either (and can you imagine how chaotic a world that would be!), so the basic point stands.↩︎</p></li>
<li id="fn4"><p>More on the trick <a href="https://www.mm218.dev/posts/2022-12-01-sf-in-packages/">in this old blog post</a>.↩︎</p></li>
<li id="fn5"><p>Well, the <em>root</em> root cause is that <a href="https://github.com/r-lib/pillar/issues/552">vctrs can’t calculate the length of sf geometry columns</a>, because they don’t subclass the list class.↩︎</p></li>
<li id="fn6"><p>I’m trying to be very precise with my wording here, so let me highlight that this call to <code>library()</code> is actually <em>attaching</em> sf, when all that’s necessary is <em>loading</em> it. For <em>this post</em> this is a distinction without a difference, so I’m not spending a lot of time on it. But only loading the namespace is necessary; you don’t need to attach it to the search path.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <category>Spatial</category>
  <category>geospatial data</category>
  <category>R packages</category>
  <guid>https://mm218.dev/posts/2023-11-27-objects-loading-namespaces/</guid>
  <pubDate>Mon, 27 Nov 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-11-27-objects-loading-namespaces/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Downloading STAC data using rsi when you’ve got a geographic CRS or don’t want a composite.</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-11-21-rsi-null/</link>
  <description><![CDATA[ 





<p>A quick post today, inspired by <a href="https://github.com/Permian-Global-Research/rsi/issues/6">a GitHub issue</a>.</p>
<p>I’ve been working recently on <a href="https://github.com/Permian-Global-Research/rsi">the new rsi package</a>, which helps you download, reproject, resample, mask, rescale, and composite data from STAC APIs.<sup>1</sup> The standard function interface does all of these steps: it grabs all the relevant files from your STAC source, reprojects them to match your AOI and desired resolution, masks and rescales the component files, and then merges them into a composite:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rsi)</span>
<span id="cb1-2">future<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plan</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"multisession"</span>)</span>
<span id="cb1-3"></span>
<span id="cb1-4">aoi <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">74.912131</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">44.080410</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-5">  sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_sfc</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-6">  sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_set_crs</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4326</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-7">  sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_transform</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3857</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-8">  sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_buffer</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>)</span>
<span id="cb1-9"></span>
<span id="cb1-10">start_date <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022-06-01"</span></span>
<span id="cb1-11">end_date <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022-07-01"</span></span>
<span id="cb1-12"></span>
<span id="cb1-13"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_landsat_imagery</span>(</span>
<span id="cb1-14">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">aoi =</span> aoi,</span>
<span id="cb1-15">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">start_date =</span> start_date,</span>
<span id="cb1-16">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_date =</span> end_date,</span>
<span id="cb1-17">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb1-18">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-19">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-20">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-11-21-rsi-null/index_files/figure-html/unnamed-chunk-1-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>What if you want to skip some of these steps? For instance, if you try to call <code>get_landsat_imagery()</code> with an AOI in geographic coordinates, you’ll get a warning (likely followed by an error) saying that you’re asking to resample the data to 30-degree pixels, which is probably not what you wanted:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(</span>
<span id="cb2-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_landsat_imagery</span>(</span>
<span id="cb2-3">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">aoi =</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_transform</span>(aoi, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4326</span>),</span>
<span id="cb2-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">start_date =</span> start_date,</span>
<span id="cb2-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_date =</span> end_date,</span>
<span id="cb2-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb2-7">  )</span>
<span id="cb2-8">)</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning: The default pixel size arguments are intended for use with projected AOIs, but `aoi` appears to be in geographic coordinates.
ℹ Pixel X size: 30. Pixel Y size: 30.
ℹ These dimensions will be interpreted in the same units as `aoi` (likely degrees), which may cause errors.</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in CPL_gdalwarp(source, destination, options, oo, doo, config_options,
: GDAL Error 1: Attempt to create 0x0 dataset is illegal,sizes must be larger
than zero.</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning: Failed to download LC08_L2SP_015029_20220617_02_T1 from
2022-06-17T15:45:03.055481Z</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in CPL_gdalwarp(source, destination, options, oo, doo, config_options,
: GDAL Error 1: Attempt to create 0x0 dataset is illegal,sizes must be larger
than zero.</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning: Failed to download LC09_L2SP_015029_20220609_02_T2 from
2022-06-09T15:44:23.649712Z</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in CPL_gdalwarp(source, destination, options, oo, doo, config_options,
: GDAL Error 1: Attempt to create 0x0 dataset is illegal,sizes must be larger
than zero.</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning: Failed to download LC08_L2SP_015029_20220601_02_T1 from
2022-06-01T15:44:51.569374Z</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in new_CppObject_xp(fields$.module, fields$.pointer, ...): GDAL Error
4: /tmp/Rtmp7E1urK/filedddb8dfac1e.tif: No such file or directory</code></pre>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>Error : [rast] file does not exist: /tmp/Rtmp7E1urK/filedddb8dfac1e.tif</code></pre>
</div>
</div>
<p>That’s coming from the resampling step of the function’s workflow. Can we just skip that?</p>
<p>Short answer: yes! If we pass <code>NULL</code> to the <code>pixel_*_size</code> arguments, we’ll skip the resampling stage and instead download our data in its native resolution:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_landsat_imagery</span>(</span>
<span id="cb12-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">aoi =</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_transform</span>(aoi, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4326</span>),</span>
<span id="cb12-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">start_date =</span> start_date,</span>
<span id="cb12-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_date =</span> end_date,</span>
<span id="cb12-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">pixel_x_size =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span>,</span>
<span id="cb12-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">pixel_y_size =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span>,</span>
<span id="cb12-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb12-8">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb12-9">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb12-10">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-11-21-rsi-null/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This is a pattern throughout the rsi API design: if you want to skip something, pass <code>NULL</code> to the relevant argument. For instance (and this is where <a href="https://github.com/Permian-Global-Research/rsi/issues/6">the GitHub issue comes in</a>), if you don’t want to composite and would rather download every image within your spatiotemporal area of interest, you can pass <code>NULL</code> to the <code>composite_function</code> argument. I’ll also skip masking by passing <code>NULL</code> to the <code>mask_function</code> argument, because otherwise a handful of these images are entirely masked out:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_landsat_imagery</span>(</span>
<span id="cb13-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">aoi =</span> aoi,</span>
<span id="cb13-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">start_date =</span> start_date,</span>
<span id="cb13-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">end_date =</span> end_date,</span>
<span id="cb13-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>),</span>
<span id="cb13-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">composite_function =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span>,</span>
<span id="cb13-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">mask_function =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NULL</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># otherwise half of these images are blank</span></span>
<span id="cb13-8">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb13-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>rast) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb13-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>plot)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-11-21-rsi-null/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-11-21-rsi-null/index_files/figure-html/unnamed-chunk-4-2.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-11-21-rsi-null/index_files/figure-html/unnamed-chunk-4-3.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>[[1]]
NULL

[[2]]
NULL

[[3]]
NULL</code></pre>
</div>
</div>
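<p>If you’re curious how a function can support this kind of skipping, here’s a minimal sketch in base R. To be clear, <code>wrangle()</code> and its toy masking and compositing steps are hypothetical stand-ins I made up for illustration, not how rsi is actually implemented; the only point is that each step runs only when its argument isn’t <code>NULL</code>:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper (not part of rsi): every step is optional,
# and passing NULL for a step's function skips that step entirely
wrangle &lt;- function(x,
                    mask_function = function(v) v[v &gt;= 0],
                    composite_function = mean) {
  if (!is.null(mask_function)) {
    x &lt;- mask_function(x) # "mask": drop negative values
  }
  if (!is.null(composite_function)) {
    x &lt;- composite_function(x) # "composite": collapse to one value
  }
  x
}

values &lt;- c(-1, 2, 4)

# Default: mask, then composite
both_steps &lt;- wrangle(values)

# Skip compositing: we get every masked value back
mask_only &lt;- wrangle(values, composite_function = NULL)

# Skip both steps: the input comes back untouched
no_steps &lt;- wrangle(values, mask_function = NULL, composite_function = NULL)</code></pre></div>
</div>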
<p>Hopefully this helps people use rsi to only perform the data wrangling steps they want!</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>And calculate spectral indices from these data, and wrangle multiple rasters into a multi-band VRT – it’s a pretty neat package if I do say so myself.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <category>Spatial</category>
  <category>geospatial data</category>
  <category>R packages</category>
  <guid>https://mm218.dev/posts/2023-11-21-rsi-null/</guid>
  <pubDate>Tue, 21 Nov 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-11-21-rsi-null/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Classed conditions from rlang functions</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-11-07-classed-errors/</link>
  <description><![CDATA[ 





<p>I’m a huge fan of the condition functions from rlang – <code>rlang::inform()</code> for sending messages, <code>rlang::warn()</code> for warnings, and <code>rlang::abort()</code> for errors. Compared to their base equivalents (<code>message()</code>, <code>warning()</code>, and <code>stop()</code>, respectively) these functions <a href="https://rlang.r-lib.org/reference/topic-condition-customisation.html">are extremely flexible</a> and make it easy to specify <a href="https://rlang.r-lib.org/reference/topic-error-call.html">which user-facing function actually caused the condition</a>. And recently I’ve become a huge fan of how these functions let you easily set the class of your conditions, which makes it a lot easier to implement logic to handle these conditions.</p>
<p>For instance, let’s say we’ve got some function that sends up a warning if you give it an unexpected input:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">f1 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) {</span>
<span id="cb1-2">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is.numeric</span>(x)) {</span>
<span id="cb1-3">    rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">warn</span>(</span>
<span id="cb1-4">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"`x` wasn't numeric. Was this expected?"</span></span>
<span id="cb1-5">    )</span>
<span id="cb1-6">  }</span>
<span id="cb1-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(x)</span>
<span id="cb1-8">}</span>
<span id="cb1-9"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f1</span>(<span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning: `x` wasn't numeric. Was this expected?</code></pre>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 1</code></pre>
</div>
</div>
<p>If we know that we’re going to be passing unexpected inputs to this function, we might consider using <code>suppressWarnings()</code> to hide this warning. I do this every so often in package code, where I know my inputs to another function are going to trigger a condition that I don’t need the user to see:<sup>1</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">suppressWarnings</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f1</span>(<span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 1</code></pre>
</div>
</div>
<p>The challenge with this is that <code>suppressWarnings()</code>, used this way, is a blunt tool that hides <em>all</em> warnings sent up by this function. For instance, if we passed a character vector as input to this function, we’d also trigger a warning from <code>mean()</code> that it’s going to return NA:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f1</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"a"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning: `x` wasn't numeric. Was this expected?</code></pre>
</div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in mean.default(x): argument is not numeric or logical: returning NA</code></pre>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] NA</code></pre>
</div>
</div>
<p>And that useful warning <em>also</em> gets hidden by the <code>suppressWarnings()</code> call:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">suppressWarnings</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f1</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"a"</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] NA</code></pre>
</div>
</div>
<p>Adding a subclass to our warning helps solve this. By specifying the <code>class</code> argument in any of the rlang condition functions, we’re able to easily subclass our warning. This doesn’t change how the warning displays during standard usage:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">f2 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) {</span>
<span id="cb12-2">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">is.numeric</span>(x)) {</span>
<span id="cb12-3">    rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">warn</span>(</span>
<span id="cb12-4">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"`x` wasn't numeric. Was this expected?"</span>,</span>
<span id="cb12-5">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"non_numeric_x"</span></span>
<span id="cb12-6">    )</span>
<span id="cb12-7">  }</span>
<span id="cb12-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(x)</span>
<span id="cb12-9">}</span>
<span id="cb12-10"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f2</span>(<span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning: `x` wasn't numeric. Was this expected?</code></pre>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 1</code></pre>
</div>
</div>
<p>But it <em>does</em> mean that we can now use the <code>classes</code> argument to <code>suppressWarnings()</code> to suppress only the warnings we care about, without accidentally hiding other unexpected warnings we might trigger:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">suppressWarnings</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f2</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"a"</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">classes =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"non_numeric_x"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning in mean.default(x): argument is not numeric or logical: returning NA</code></pre>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] NA</code></pre>
</div>
</div>
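<p>This works because the condition object now inherits from our new class in addition to the usual warning classes. As a quick sketch of how you might check that for yourself, we can capture the warning with <code>tryCatch()</code> and inspect it with <code>inherits()</code> (the <code>cnd</code> name here is just for illustration):</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">f2 &lt;- function(x) {
  if (!is.numeric(x)) {
    rlang::warn(
      "`x` wasn't numeric. Was this expected?",
      class = "non_numeric_x"
    )
  }
  mean(x)
}

# Return the condition object itself instead of printing the warning
cnd &lt;- tryCatch(f2(TRUE), warning = function(w) w)

# Both of these are TRUE: the subclass sits alongside "warning"
inherits(cnd, "non_numeric_x")
inherits(cnd, "warning")</code></pre></div>
</div>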
<p>This is great, and makes it a lot easier to incorporate conditions into your program’s control flow. For instance, we can use these classed warnings with <code>tryCatch()</code> or <code>rlang::try_fetch()</code> to “catch” conditions, perhaps running a cleanup script or fallback method in the event that a specific classed warning is signaled:<sup>2</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb18" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb18-1">rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try_fetch</span>(</span>
<span id="cb18-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f2</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"a"</span>),</span>
<span id="cb18-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">non_numeric_x =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(...) <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"We're running a completely different function now!"</span></span>
<span id="cb18-4">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "We're running a completely different function now!"</code></pre>
</div>
</div>
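<p>The base <code>tryCatch()</code> version looks almost identical, because handlers are matched by condition class name. This is a rough sketch rather than a recommendation; the <code>NA_real_</code> fallback is a made-up placeholder for whatever your real fallback method would be:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">f2 &lt;- function(x) {
  if (!is.numeric(x)) {
    rlang::warn(
      "`x` wasn't numeric. Was this expected?",
      class = "non_numeric_x"
    )
  }
  mean(x)
}

result &lt;- tryCatch(
  f2("a"),
  # Only runs for our classed warning; any other warning or error
  # is left alone and surfaces as usual
  non_numeric_x = function(cnd) {
    NA_real_ # placeholder fallback value, purely for illustration
  }
)</code></pre></div>
</div>
<p>Because <code>tryCatch()</code> unwinds as soon as the classed warning is signaled, the call to <code>mean()</code> never runs here, and <code>result</code> is whatever the handler returned.</p>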
<p>Last but not least, classed errors help in package testing. A huge number of my tests are designed to make sure that conditions fire when they’re supposed to – bad inputs trigger errors, concerning outputs trigger warnings and so on. Using classed errors can help me make sure I’m triggering the error or warning that I want to, not just any random error or warning that might be lurking in my code.</p>
<p>If you’re <a href="https://testthat.r-lib.org/articles/third-edition.html">using testthat’s 3rd edition</a>, the <code>expect_condition()</code> set of functions (including <code>expect_message()</code>, <code>expect_warning()</code>, <code>expect_error()</code>) all share a <code>class</code> argument which will make sure the warning or error you’re triggering is actually the one you expect:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb20" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb20-1">testthat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">local_edition</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb20-2">testthat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expect_warning</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f2</span>(<span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"non_numeric_x"</span>)</span></code></pre></div></div>
</div>
<p>If our condition class doesn’t match the expected class, these tests will fail:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb21" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb21-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(testthat<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expect_warning</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f2</span>(<span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"wrong_class"</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>Warning: `x` wasn't numeric. Was this expected?</code></pre>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>Error : `f2(TRUE)` did not throw the expected warning.</code></pre>
</div>
</div>
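<p>The same approach works for errors. Here’s a small sketch, assuming a hypothetical function <code>f3()</code> that aborts with a made-up <code>non_numeric_x_error</code> class instead of warning; neither name comes from a real package:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical function: errors (rather than warns) on non-numeric input
f3 &lt;- function(x) {
  if (!is.numeric(x)) {
    rlang::abort(
      "`x` must be numeric.",
      class = "non_numeric_x_error"
    )
  }
  mean(x)
}

testthat::local_edition(3)

# Passes only if the error carries the class we expect,
# not just because *some* error was thrown
testthat::expect_error(f3("a"), class = "non_numeric_x_error")</code></pre></div>
</div>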
<p>I’m a late adopter of classed conditions, and only started using them systematically for <a href="https://github.com/Permian-Global-Research/rsi">the new rsi package</a>, but I’ve found them super useful so far and plan to gradually adopt them across the rest of my packages!</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>For instance, the way that <code>autoplot()</code> in spatialsample <a href="https://github.com/tidymodels/spatialsample/blob/a8834bbe646967bc224cf5e558789e1d704b0778/R/autoplot.R#L141-L143">adds grids to spatial_block_cv() plots</a> always triggers the same message, which is expected and not worth worrying about. I hide that message so my users don’t need to be concerned.↩︎</p></li>
<li id="fn2"><p>I don’t currently, but I <em>should</em> do this in terrainr, where I <a href="https://github.com/ropensci/terrainr/blob/36fc069cb05dbcb44ff358858f5544863d506aee/R/merge_rasters.R#L58-L81">currently assume that any error during <code>merge_rasters()</code> can be fixed by the fallback method.</a>↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <category>Package development</category>
  <guid>https://mm218.dev/posts/2023-11-07-classed-errors/</guid>
  <pubDate>Tue, 07 Nov 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-11-07-classed-errors/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>spatialsample 0.5.0 is now on CRAN</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-11-03-spatialsample/</link>
  <description><![CDATA[ 





<p>The <a href="https://github.com/tidymodels/spatialsample">newest version of spatialsample</a>, the tidymodels package I maintain for spatial cross-validation, just landed on CRAN, with binaries for Windows and Mac coming in the next few days.</p>
<p>This release mostly fixes a few bugs in <code>spatial_block_cv()</code> and <code>spatial_nndm_cv()</code>. The only new feature is that <code>get_rsplit()</code> is now reexported from rsample, providing a nicer interface for extracting individual <code>rsplit</code> objects from an <code>rset</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(spatialsample)</span>
<span id="cb1-2"></span>
<span id="cb1-3">folds <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spatial_clustering_cv</span>(boston_canopy, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb1-4"></span>
<span id="cb1-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all.equal</span>(</span>
<span id="cb1-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_rsplit</span>(folds, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),</span>
<span id="cb1-7">  folds<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>splits[[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]]</span>
<span id="cb1-8">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
<p>More pressing are two sets of breaking changes. The first of these is that <code>spatial_block_cv()</code> now creates slightly different grids, covering a very slightly larger area, which may change which fold any given observation is assigned to. This is to address a problem <a href="https://stackoverflow.com/q/77374348/9625040">reported on StackOverflow</a> where, if data fell exactly on grid lines (which was somewhat common with regularly-spaced grids of data), it would be assigned to both of the polygons on either side of the line.</p>
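<p>To make that geometry concrete, here’s a toy sketch using sf directly. It’s an illustration of the underlying problem only, not spatialsample’s internal code, and the two unit squares and points are made up: a point sitting exactly on the edge shared by two blocks intersects both of them.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(sf)

# Two adjacent unit squares sharing the edge at x = 1
left_block &lt;- st_polygon(list(rbind(
  c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)
)))
right_block &lt;- st_polygon(list(rbind(
  c(1, 0), c(2, 0), c(2, 1), c(1, 1), c(1, 0)
)))
blocks &lt;- st_sfc(left_block, right_block)

# One point exactly on the shared edge, one safely inside the left block
pts &lt;- st_sfc(
  st_point(c(1, 0.5)),
  st_point(c(0.5, 0.5))
)

# For each point, which blocks does it intersect?
hits &lt;- st_intersects(pts, blocks)

# Number of blocks each point falls in: the on-the-line point matches
# both blocks, while the interior point matches only one
lengths(hits)</code></pre></div>
</div>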
<p>The amount of grid expansion performed can be controlled using the new <code>expand_bbox</code> argument to <code>spatial_block_cv()</code>. If observations are still assigned to multiple folds, the function will now throw an error:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">drought_sf <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_as_sf</span>(</span>
<span id="cb3-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expand.grid</span>(</span>
<span id="cb3-3">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">995494</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1018714</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">430</span>),</span>
<span id="cb3-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seq</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1019422</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">by =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">430</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">length.out =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">55</span>)</span>
<span id="cb3-5">  ),</span>
<span id="cb3-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">coords =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"x"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>),</span>
<span id="cb3-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">crs =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7760</span></span>
<span id="cb3-8">)</span>
<span id="cb3-9"></span>
<span id="cb3-10"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spatial_block_cv</span>(drought_sf, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand_bbox =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Error in generate_folds_from_blocks(data, centroids, grid_blocks, v, n,  : 
  Some observations fell exactly on block boundaries, meaning they were assigned to multiple assessment sets unexpectedly.
ℹ Try setting a different `expand_bbox` value, an `offset`, or use a different number of folds.</code></pre>
</div>
</div>
<p>But hopefully the expansion will make this error relatively uncommon:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">folds <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spatial_block_cv</span>(drought_sf)</span>
<span id="cb5-2"></span>
<span id="cb5-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autoplot</span>(folds)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-11-03-spatialsample/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This is a breaking change for data in projected coordinate reference systems. Data in geographic coordinates was actually already using these slightly larger grids, <a href="https://github.com/ropensci/stplanr/pull/467">due to issues with straight-line grids in non-planar CRS</a>, so this change just makes the amount of grid expansion controllable there.</p>
<p>The other breaking change (really a bug fix) is in <code>spatial_nndm_cv()</code>. The <code>prediction_sites</code> argument to <code>spatial_nndm_cv()</code> lets you specify the sites where you actually plan to generate predictions. In older versions of spatialsample, if any of the geometries in <code>prediction_sites</code> weren’t points, this function would instead randomly sample points from inside the bounding box of the entire <code>prediction_sites</code> object.</p>
<p>Starting with spatialsample 0.5.0, passing a single polygon to <code>prediction_sites</code> will cause <code>spatial_nndm_cv()</code> to instead sample points from inside that polygon, giving you fine-grained control over the boundaries for this sampling stage. This feels like a more intuitive interface, and you can always revert to the previous behavior by passing <code>sf::st_as_sf(sf::st_as_sfc(sf::st_bbox(prediction_sites)))</code> if you’d rather sample from the bounding box.</p>
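<p>As a minimal sketch of that bounding-box conversion, here is the same chain of sf calls applied to a stand-in object. The <code>nc</code> data that ships with sf and the choice of Johnston County are assumptions made purely for illustration, not data from this release:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(sf)

# Stand-in for `prediction_sites`: one county polygon from sf's bundled data
nc &lt;- read_sf(system.file("shape/nc.shp", package = "sf"))
prediction_sites &lt;- nc[nc$NAME == "Johnston", ]

# st_bbox() computes the extent, st_as_sfc() turns it into a polygon geometry,
# and st_as_sf() wraps that geometry in a one-row sf object
bbox_polygon &lt;- st_as_sf(st_as_sfc(st_bbox(prediction_sites)))
bbox_polygon</code></pre></div>
</div>
<p>The resulting one-row object is what you would pass as <code>prediction_sites</code> to sample from the bounding box rather than from the polygon itself.</p>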



 ]]></description>
  <category>R</category>
  <category>Spatial</category>
  <category>geospatial data</category>
  <category>spatialsample</category>
  <category>R packages</category>
  <guid>https://mm218.dev/posts/2023-11-03-spatialsample/</guid>
  <pubDate>Fri, 03 Nov 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-11-03-spatialsample/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Adding context to maps made with ggplot2</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-10-31-map-context/</link>
  <description><![CDATA[ 





<p>A colleague asked me today how they could best add a larger data set for context to a map of a (spatially) smaller data set, without the map expanding to incorporate the whole of the larger data set. I didn’t have a great answer off the top of my head, so this blog post is here to record what we tried, and what wound up working for us!</p>
<p>Say we’ve got some spatial data set that covers a broad area; for instance, the <code>nc</code> data set from sf that contains the counties of North Carolina:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">suppressPackageStartupMessages</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(sf))</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb1-3">nc <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.file</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"shape/nc.shp"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sf"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">read_sf</span>()</span>
<span id="cb1-5"></span>
<span id="cb1-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(nc) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb1-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-31-map-context/index_files/figure-html/unnamed-chunk-1-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>We also need a comparatively smaller data set that we’re interested in visualizing. For this post, we’ll simulate 500 observations inside Johnston County, one of the more central counties:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">johnston <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> nc[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">which</span>(nc<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>NAME <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Johnston"</span>), ]</span>
<span id="cb2-2">johnston_obs <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_sample</span>(johnston, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>)</span>
<span id="cb2-3"></span>
<span id="cb2-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb2-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> johnston) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb2-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> johnston_obs, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-31-map-context/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Our goal is to add a bit more context to this map by drawing the borders of surrounding counties. The challenge is that ggplot will, by default, expand our visualization to contain the largest layer that we add:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb3-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> nc) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb3-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> johnston_obs, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-31-map-context/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>We could control this using <code>expansion()</code> inside of <code>scale_*_continuous()</code> functions, in order to restrict the range of our visualization:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> nc) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> johnston_obs, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expansion</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.63</span>, <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.29</span>))) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expansion</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.28</span>)))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-31-map-context/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>But these <code>expansion()</code> calls are relative to the scale of our larger data set, which makes them a bit difficult to reason about. Instead of specifying our extents in terms of the data that we actually care about visualizing, we’re forced to specify them relative to the larger context that we don’t care as much about.</p>
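<p>To make that concrete, here is a rough base R sketch of the arithmetic behind a multiplicative expansion as I understand it: each limit moves outward by its multiplier times the width of the full range, so negative multipliers pull the limits inward. The helper <code>expand_by_mult()</code> is a made-up name for illustration (it is not a ggplot2 function), and the longitude range is only approximately that of the <code>nc</code> data:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper (not part of ggplot2): apply lower and upper
# multipliers to a range, scaling each by the width of that range
expand_by_mult &lt;- function(limits, mult) {
  width &lt;- diff(limits)
  c(limits[1] - mult[1] * width, limits[2] + mult[2] * width)
}

# Approximate longitude range of the statewide layer (an assumption)
state_range &lt;- c(-84.32, -75.46)

# The same multipliers used for the x scale in the plot above
expand_by_mult(state_range, c(-0.63, -0.29))</code></pre></div>
</div>
<p>Both multipliers are fractions of the statewide width, so changing the context layer changes what every one of these numbers means, even though the data we care about hasn’t moved.</p>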
<p>We could make this a bit easier by filtering our larger data set to only observations that are near (or in this case, touching) the area we’re actually trying to visualize:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">neighbors <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> nc[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_touches</span>(johnston, nc)[[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]], ]</span>
<span id="cb5-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> neighbors) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> johnston_obs, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expansion</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>, <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.25</span>))) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expansion</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.4</span>, <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.25</span>)))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-31-map-context/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>But while this <em>reduces</em> the problem, moving the center of the larger layer closer to the center of the layer we care about, it still has the same issue as when we used the entire <code>nc</code> object: our extents are defined relative to the context layer, not the data we care about.</p>
<p>So what we wound up doing was embracing a little bit of jank and reaching into the ggplot2 internals. We started off by making a “base plot” object that was zoomed out to the level of detail that we wanted:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1">base_plot <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(johnston_obs) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expansion</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">expansion</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>))</span>
<span id="cb6-5"></span>
<span id="cb6-6">base_plot</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-31-map-context/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Defining our boundaries here is a bit easier to reason about – the <code>expansion()</code> calls are centered on our data of interest and are expanding the scales relative to this focal data set. Once we’ve got our level of zoom where we want it, we can build our plot using <code>ggplot_build()</code> and then extract the ranges of our x and y scales from the constructed plot object:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">base_plot <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot_build</span>(base_plot)</span>
<span id="cb7-2">xlim <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> base_plot<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>layout<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>panel_params[[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>x_range</span>
<span id="cb7-3">ylim <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> base_plot<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>layout<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>panel_params[[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>y_range</span></code></pre></div></div>
</div>
<p>We can then use those ranges as limits, to force our final plot to have the same level of zoom as our simple map. That means we can add whatever layers we want to add context to our map, and not need to worry about fiddling with our scales in terms of the largest layer we’ve added:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb8-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> nc) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb8-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> johnston_obs, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb8-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> xlim) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb8-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> ylim)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-31-map-context/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>I have no idea how stable this approach will be – <a href="https://www.tidyverse.org/blog/2022/09/playing-on-the-same-team-as-your-dependecy/">we’re decidedly not playing on the same team as ggplot2</a> with this approach – but it works as of ggplot2 version 3.4.3, and it’s made making maps a bit easier for us at the moment!</p>
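<p>For comparison, an alternative that stays within ggplot2’s public interface is to pass limits to <code>coord_sf()</code>, taken from the bounding box of the focal layer. This is a different technique from the one above, offered only as a sketch: the zoom will differ slightly, since it relies on the default padding of <code>coord_sf()</code> rather than the 10% expansion, and the name <code>focal_bbox</code> is just for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(sf)
library(ggplot2)

# Recreate the data used throughout this post
nc &lt;- read_sf(system.file("shape/nc.shp", package = "sf"))
johnston &lt;- nc[nc$NAME == "Johnston", ]
johnston_obs &lt;- st_sample(johnston, 500)

# Bounding box of the focal layer, in that layer's own CRS
focal_bbox &lt;- st_bbox(johnston_obs)

p &lt;- ggplot() +
  geom_sf(data = nc) +
  geom_sf(data = johnston_obs, color = "red") +
  # Zoom the coordinate system to the focal layer; the context layer
  # is still drawn, but no longer determines the extent of the map
  coord_sf(
    xlim = as.numeric(focal_bbox[c("xmin", "xmax")]),
    ylim = as.numeric(focal_bbox[c("ymin", "ymax")])
  )
p</code></pre></div>
</div>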



 ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <category>Spatial</category>
  <category>geospatial data</category>
  <guid>https://mm218.dev/posts/2023-10-31-map-context/</guid>
  <pubDate>Tue, 31 Oct 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-10-31-map-context/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Executing R code from untrusted sources in minimal environments</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-10-27-minimal-environments/</link>
  <description><![CDATA[ 





<p>Yesterday we released <a href="https://github.com/Permian-Global-Research/rsi">rsi</a>, an R package that (among other things) makes it easy to retrieve spectral indices from the <a href="https://github.com/davemlz/awesome-spectral-indices">Awesome Spectral Indices</a> project and calculate them against any images you have on hand.</p>
<p>The actual code required to do these calculations is mostly just a bit of glue. The ASI project provides the formulas you’d use to calculate any spectral indices you might be interested in, which, being relatively simple arithmetic, are easily <a href="https://www.mm218.dev/posts/2023-10-24-fun-r-funcs/#str2lang">transformed into R code via <code>str2lang()</code></a>.</p>
<p>Once we’ve done that transformation, we can evaluate that R code against our images through a slightly off-label use of <code>terra::predict()</code>. This function loads chunks of our raster into R as a data frame with column names corresponding to our band labels, meaning that if we’re careful to ensure that our band names align with the standardized band names used by ASI,<sup>1</sup> we can compute indices by evaluating their formulas “with” the data frame.</p>
<p>This sounds complex, but doesn’t actually require that much code to put together:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">calculate_indices <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(raster,</span>
<span id="cb1-2">                              indices,</span>
<span id="cb1-3">                              output_filename) {</span>
<span id="cb1-4">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">inherits</span>(raster, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SpatRaster"</span>)) raster <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(raster)</span>
<span id="cb1-5">  formulas <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(indices[[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"formula"</span>]], str2lang)</span>
<span id="cb1-6">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">predict</span>(</span>
<span id="cb1-7">    raster,</span>
<span id="cb1-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(),</span>
<span id="cb1-9">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fun =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(model, newdata) {</span>
<span id="cb1-10">      out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(formulas, \(calc) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">with</span>(newdata, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">eval</span>(calc)))</span>
<span id="cb1-11">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(out) <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> indices[[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"short_name"</span>]]</span>
<span id="cb1-12">      out</span>
<span id="cb1-13">    },</span>
<span id="cb1-14">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">filename =</span> output_filename</span>
<span id="cb1-15">  )</span>
<span id="cb1-16">  output_filename</span>
<span id="cb1-17">}</span></code></pre></div></div>
</div>
<p>Turn our formulas into calls, load our raster into R chunk by chunk, evaluate those calls in the context of our raster, lather, rinse, repeat. The conceptual complexity here is a lot higher, in my view, than the code complexity.</p>
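<p>To see the core trick without any rasters involved, here is a small base R sketch of the same evaluation step run against a plain data frame. The two formulas and the pixel values are made up for illustration, written in the style of ASI entries with <code>N</code> and <code>R</code> standing in for near-infrared and red bands:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Illustrative index definitions, mimicking the columns used above
indices &lt;- data.frame(
  short_name = c("NDVI", "DVI"),
  formula = c("(N - R) / (N + R)", "N - R")
)

# A toy "chunk" of raster values: one row per pixel, one column per band
chunk &lt;- data.frame(N = c(0.5, 0.8), R = c(0.1, 0.2))

# Parse each formula string into an unevaluated call
formulas &lt;- lapply(indices[["formula"]], str2lang)

# Evaluate each call using the columns of `chunk` as variables
out &lt;- lapply(formulas, \(calc) with(chunk, eval(calc)))
names(out) &lt;- indices[["short_name"]]
out</code></pre></div>
</div>
<p>Each element of <code>out</code> holds one value per pixel, which is the shape of result that the function passed to <code>terra::predict()</code> above returns for every chunk.</p>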
<p>Because this code is pretty straightforward, and relies on the fantastic, highly-optimized <a href="https://github.com/rspatial/terra">terra package</a> to actually do these computations, we’re able to calculate these indices <em>fast</em>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">example_index <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rsi<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spectral_indices</span>()</span>
<span id="cb2-2">example_index <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> example_index[example_index<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>short_name <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"DPDD"</span>, ]</span>
<span id="cb2-3"></span>
<span id="cb2-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.time</span>({</span>
<span id="cb2-5">  out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb2-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.file</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rasters/example_sentinel1.tif"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsi"</span>),</span>
<span id="cb2-7">    example_index,</span>
<span id="cb2-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb2-9">  )</span>
<span id="cb2-10">})</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>   user  system elapsed 
  1.754   0.040   1.794 </code></pre>
</div>
</div>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(out))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-27-minimal-environments/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This is pretty nifty!</p>
<p>That said, we should be careful about what text we’re willing to turn into R code and execute. In particular, rsi is designed to integrate nicely with the Awesome Spectral Indices project, and to retrieve and compute the ASI set of indices – which, phrased differently, means we’re downloading code from the internet and running it on our computers. If someone were to mess with our indices – either by corrupting the GitHub repository or by editing the cached file on our machine – this could wind up giving them access to <code>system()</code> or other scary commands:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">evil_index <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> example_index</span>
<span id="cb5-2">evil_index<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>formula <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"system('echo oh no &gt; /tmp/example.txt')"</span></span>
<span id="cb5-3"></span>
<span id="cb5-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(</span>
<span id="cb5-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb5-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.file</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rasters/example_sentinel1.tif"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsi"</span>),</span>
<span id="cb5-7">    evil_index,</span>
<span id="cb5-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb5-9">  ),</span>
<span id="cb5-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">silent =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span></span>
<span id="cb5-11">)</span>
<span id="cb5-12"></span>
<span id="cb5-13"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readLines</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/tmp/example.txt"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "oh no"</code></pre>
</div>
</div>
<p>So, how can we make this safer?</p>
<p>One way is by reducing the number of toys any malicious code has available to play with. We can do this by running the code in a locked-down environment, where it won’t have access to functions that might let code mess with our machine.</p>
<p>One way of creating a locked-down environment is <code>rlang::new_environment()</code>. By default, this function creates a new environment with nothing in it – no built-in functions or objects:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ls</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">envir =</span> rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">new_environment</span>())</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>character(0)</code></pre>
</div>
</div>
<p>This environment is also going to have the empty environment as its parent, meaning that code executed in this scope won’t be able to use functions or objects from the global environment<sup>2</sup>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># inheriting from the global environment</span></span>
<span id="cb9-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">local</span>(</span>
<span id="cb9-3">  <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,</span>
<span id="cb9-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">new.env</span>()</span>
<span id="cb9-5">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 4</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># inheriting from the empty environment</span></span>
<span id="cb11-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(</span>
<span id="cb11-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">local</span>(</span>
<span id="cb11-4">    <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,</span>
<span id="cb11-5">    rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">new_environment</span>()</span>
<span id="cb11-6">  )</span>
<span id="cb11-7">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Error in 2 + 2 : could not find function "+"</code></pre>
</div>
</div>
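<p>If it helps, the parent can also be checked directly instead of inferred from the error message. This sketch uses base R’s <code>new.env(parent = emptyenv())</code> as a stand-in for <code>rlang::new_environment()</code>, and the names <code>bare_env</code> and <code>base_child</code> are just for illustration:</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r"># An environment whose parent is the empty environment
bare_env &lt;- new.env(parent = emptyenv())
identical(parent.env(bare_env), emptyenv())

# Lookup normally walks up through parents, but here there is nowhere to go
exists("+", envir = bare_env)

# With the base environment as parent, the same lookup succeeds
base_child &lt;- new.env(parent = baseenv())
exists("+", envir = base_child)</code></pre>
</div>
<p>The only difference between the two environments is the parent, which is why the parent is the thing to lock down.</p>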
<p>That means that any code we run inside of this new environment will only have access to whatever functions and variables we purposefully include in the environment. The <code>data</code> argument to <code>rlang::new_environment()</code> makes it relatively easy to define whatever objects we’re looking to make available in this new environment:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">local</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">new_environment</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">+</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">+</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>)))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 4</code></pre>
</div>
</div>
<p>That means that, if we create a minimal environment containing only the functions and variables essential for calculating our indices, we should hopefully be able to reduce the potential blast radius of malicious code – or at least make it a lot harder for malicious code to impact anything we care about. In rsi, that means we wind up calculating indices inside a minimal environment that looks like this:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1">calculate_indices <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(raster,</span>
<span id="cb15-2">                              indices,</span>
<span id="cb15-3">                              output_filename) {</span>
<span id="cb15-4">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">inherits</span>(raster, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SpatRaster"</span>)) raster <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(raster)</span>
<span id="cb15-5">  formulas <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(indices[[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"formula"</span>]], str2lang)</span>
<span id="cb15-6">  </span>
<span id="cb15-7">  exec_env <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">new_environment</span>(</span>
<span id="cb15-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(</span>
<span id="cb15-9">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">::</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">::</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-10">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-11">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">(</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">(</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-12">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">*</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">*</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-13">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">/</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">/</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-14">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">^</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">^</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-15">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">+</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">+</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-16">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">&lt;-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">&lt;-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-17">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">{</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">{</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-18">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">names&lt;-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">names&lt;-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-19">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">function</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">function</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb15-20">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">list =</span> list,</span>
<span id="cb15-21">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lapply =</span> lapply,</span>
<span id="cb15-22">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">with =</span> with,</span>
<span id="cb15-23">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">eval =</span> eval,</span>
<span id="cb15-24">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">formulas =</span> formulas,</span>
<span id="cb15-25">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">short_names =</span> indices[[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"short_name"</span>]],</span>
<span id="cb15-26">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">raster =</span> raster,</span>
<span id="cb15-27">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> output_filename</span>
<span id="cb15-28">    )</span>
<span id="cb15-29">  )</span>
<span id="cb15-30">  </span>
<span id="cb15-31">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">local</span>(</span>
<span id="cb15-32">    {</span>
<span id="cb15-33">      terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">predict</span>(</span>
<span id="cb15-34">        raster,</span>
<span id="cb15-35">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(),</span>
<span id="cb15-36">        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fun =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(model, newdata) {</span>
<span id="cb15-37">          out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(</span>
<span id="cb15-38">            formulas,</span>
<span id="cb15-39">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(calc) {</span>
<span id="cb15-40">              <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">with</span>(newdata, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">eval</span>(calc))</span>
<span id="cb15-41">            }</span>
<span id="cb15-42">          )</span>
<span id="cb15-43">          <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(out) <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> short_names</span>
<span id="cb15-44">          out</span>
<span id="cb15-45">        },</span>
<span id="cb15-46">        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">filename =</span> output_filename</span>
<span id="cb15-47">      )</span>
<span id="cb15-48">    },</span>
<span id="cb15-49">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">envir =</span> exec_env</span>
<span id="cb15-50">  )</span>
<span id="cb15-51">  </span>
<span id="cb15-52">  output_filename</span>
<span id="cb15-53">}</span></code></pre></div></div>
</div>
<p>This shouldn’t impact anything from the user’s perspective when calculating well-behaved formulas:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb16-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.time</span>({</span>
<span id="cb16-2">  out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb16-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.file</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rasters/example_sentinel1.tif"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsi"</span>),</span>
<span id="cb16-4">    example_index,</span>
<span id="cb16-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb16-6">  )</span>
<span id="cb16-7">})</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>   user  system elapsed 
  0.016   0.000   0.015 </code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb18" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb18-1">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(out))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-27-minimal-environments/index_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>But it makes the most obvious malicious code fail:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb19-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(</span>
<span id="cb19-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb19-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.file</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rasters/example_sentinel1.tif"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsi"</span>),</span>
<span id="cb19-4">    evil_index,</span>
<span id="cb19-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb19-6">  )</span>
<span id="cb19-7">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Error in system("echo oh no &gt; /tmp/example.txt") : 
  could not find function "system"</code></pre>
</div>
</div>
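<p>The same behavior can be reproduced without rsi or terra, which makes it easier to poke at. This is a rough sketch in base R, not the rsi implementation: <code>safe_env</code>, the toy values of <code>N</code> and <code>R</code>, and the formula strings are all made up for illustration:</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r"># Allowlist only the operators a simple formula needs, plus two toy variables
safe_env &lt;- list2env(
  list(`+` = `+`, `-` = `-`, `/` = `/`, `(` = `(`, N = 0.8, R = 0.2),
  parent = emptyenv()
)

# A well-behaved formula evaluates as usual
good &lt;- eval(str2lang("(N - R) / (N + R)"), safe_env)
good

# A formula reaching for system() fails, because system() isn't in scope
bad &lt;- tryCatch(
  eval(str2lang("system('echo oh no')"), safe_env),
  error = function(e) conditionMessage(e)
)
bad</code></pre>
</div>
<p>The shell command is never run here: the lookup of <code>system</code> fails before its argument is touched.</p>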
<p><strong>Update 2023-10-27</strong>: However, we actually need to go one step further. Because we’ve included <code>::</code> in our minimal environment, we’ve left in an “escape hatch” that malicious code can use to access any functions it wants:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb21" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb21-1">evil_index<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>formula <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"base::system('echo oh no &gt; /tmp/example2.txt')"</span></span>
<span id="cb21-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(</span>
<span id="cb21-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb21-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.file</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rasters/example_sentinel1.tif"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsi"</span>),</span>
<span id="cb21-5">    evil_index,</span>
<span id="cb21-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb21-7">  )</span>
<span id="cb21-8">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Error : [predict] the number of values returned by 'fun' (model predict function) does not match the input. Try na.rm=TRUE?</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb23-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">readLines</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/tmp/example2.txt"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "oh no"</code></pre>
</div>
</div>
<p>Instead of calling <code>terra::predict()</code> via <code>::</code> inside our environment, we’ll need to include that function in the environment directly, in order to remove this escape hatch:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb25" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb25-1">calculate_indices <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(raster,</span>
<span id="cb25-2">                              indices,</span>
<span id="cb25-3">                              output_filename) {</span>
<span id="cb25-4">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> (<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">inherits</span>(raster, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SpatRaster"</span>)) raster <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(raster)</span>
<span id="cb25-5">  formulas <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(indices[[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"formula"</span>]], str2lang)</span>
<span id="cb25-6">  </span>
<span id="cb25-7">  exec_env <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> rlang<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">new_environment</span>(</span>
<span id="cb25-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(</span>
<span id="cb25-9">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-10">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">(</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">(</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-11">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">*</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">*</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-12">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">/</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">/</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-13">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">^</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">^</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-14">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">+</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">+</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-15">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">&lt;-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">&lt;-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-16">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">{</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">{</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-17">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">names&lt;-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">names&lt;-</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-18">      <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">function</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span> <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span><span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">function</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">`</span>,</span>
<span id="cb25-19">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">list =</span> list,</span>
<span id="cb25-20">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">lapply =</span> lapply,</span>
<span id="cb25-21">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">with =</span> with,</span>
<span id="cb25-22">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">eval =</span> eval,</span>
<span id="cb25-23">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">formulas =</span> formulas,</span>
<span id="cb25-24">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">short_names =</span> indices[[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"short_name"</span>]],</span>
<span id="cb25-25">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">raster =</span> raster,</span>
<span id="cb25-26">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">output_filename =</span> output_filename,</span>
<span id="cb25-27">      <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">predict =</span> terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>predict</span>
<span id="cb25-28">    )</span>
<span id="cb25-29">  )</span>
<span id="cb25-30">  </span>
<span id="cb25-31">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">local</span>(</span>
<span id="cb25-32">    {</span>
<span id="cb25-33">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">predict</span>(</span>
<span id="cb25-34">        raster,</span>
<span id="cb25-35">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">list</span>(),</span>
<span id="cb25-36">        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fun =</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(model, newdata) {</span>
<span id="cb25-37">          out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">lapply</span>(</span>
<span id="cb25-38">            formulas,</span>
<span id="cb25-39">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(calc) {</span>
<span id="cb25-40">              <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">with</span>(newdata, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">eval</span>(calc))</span>
<span id="cb25-41">            }</span>
<span id="cb25-42">          )</span>
<span id="cb25-43">          <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(out) <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> short_names</span>
<span id="cb25-44">          out</span>
<span id="cb25-45">        },</span>
<span id="cb25-46">        <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">filename =</span> output_filename</span>
<span id="cb25-47">      )</span>
<span id="cb25-48">    },</span>
<span id="cb25-49">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">envir =</span> exec_env</span>
<span id="cb25-50">  )</span>
<span id="cb25-51">  </span>
<span id="cb25-52">  output_filename</span>
<span id="cb25-53">}</span></code></pre></div></div>
</div>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb26" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb26-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(</span>
<span id="cb26-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb26-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">system.file</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rasters/example_sentinel1.tif"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">package =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"rsi"</span>),</span>
<span id="cb26-4">    evil_index,</span>
<span id="cb26-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tempfile</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fileext =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">".tif"</span>)</span>
<span id="cb26-6">  )</span>
<span id="cb26-7">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Error in base::system : could not find function "::"</code></pre>
</div>
</div>
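<p>The same behavior can be reproduced with nothing but base R. The sketch below is a minimal illustration rather than the code rsi uses; <code>safe_env</code>, <code>ok</code>, and <code>blocked</code> are names made up for this example. Because the environment’s parent is the empty environment, any function that hasn’t been explicitly copied in, including <code>::</code>, simply can’t be found:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># An environment with no parent scope: only what we add is visible.
safe_env &lt;- new.env(parent = emptyenv())
assign("+", `+`, envir = safe_env)
assign("(", `(`, envir = safe_env)

# Arithmetic works, because `+` and `(` were copied in.
ok &lt;- eval(str2lang("(1 + 2)"), envir = safe_env)

# `::` was never copied in, so the lookup fails before
# base::system() can be reached.
blocked &lt;- tryCatch(
  eval(str2lang("base::system('echo oh no')"), envir = safe_env),
  error = function(e) conditionMessage(e)
)</code></pre></div>
</div>
<p>Here <code>ok</code> holds the result of the arithmetic, while <code>blocked</code> holds the error message complaining that <code>::</code> could not be found.</p>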
<p>Thanks Gábor Csárdi for the catch!</p>
<p>This still isn’t a perfect fix – and rsi also checks to make sure that all of your formulas are only using symbols that match the band names of your rasters. Even with these checks, you should investigate the formulas you’re going to run before you actually run them – or save a copy of the trusted indices you’re going to calculate and provide <em>those</em> to <code>calculate_indices()</code>, rather than using <code>spectral_indices()</code> directly. But this hopefully makes this function a pinch safer.</p>
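<p>As a rough illustration of what a band-name check can look like, here is a base R sketch. It is an assumption about the general approach, not rsi’s actual implementation, and <code>formula_uses_only_bands()</code> is a hypothetical helper invented for this example:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper, not rsi's actual implementation.
formula_uses_only_bands &lt;- function(formula, band_names) {
  # all.vars() lists the symbols used as variables in an expression.
  symbols &lt;- all.vars(str2lang(formula))
  all(symbols %in% band_names)
}

band_names &lt;- c("N", "R")

# Only uses N and R, so this passes the check.
ndvi_ok &lt;- formula_uses_only_bands("(N - R) / (N + R)", band_names)

# `base` and `system` are not band names, so this fails the check.
evil_ok &lt;- formula_uses_only_bands("base::system('echo oh no')", band_names)

# A call with no variable symbols at all slips through this check.
bare_ok &lt;- formula_uses_only_bands("system('echo oh no')", band_names)</code></pre></div>
</div>
<p>The last case is why a symbol check like this works best alongside the minimal environment rather than replacing it: the formula contains no variables to reject, but <code>system()</code> still can’t be found when the formula is evaluated.</p>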
<p>I wrote this piece of rsi back in August and then more or less didn’t think of it again until yesterday, when we released rsi on GitHub – and at the same time, started using GitHub Actions for CI for the package. All of a sudden, I started seeing a lot of CI runs that looked like this:</p>
<p><img src="https://mm218.dev/posts/2023-10-27-minimal-environments/gh.png" class="img-fluid" alt="A screenshot of GitHub Actions, where all workflows -- including R CMD check -- are successful, but test coverage is failing."></p>
<p>R CMD check succeeding and test coverage failing made no sense to me, as theoretically they’re both running a full check and reporting the results. In classic developer tradition, I spent a few hours flailing around before finally giving up and resorting to the final option I had available: reading the error messages.</p>
<p>And it turned out that each of those failed test coverage runs had the same error message:</p>
<p><img src="https://mm218.dev/posts/2023-10-27-minimal-environments/err.png" class="img-fluid" alt="Expected `... <- NULL` to run without any errors. Actually got a <simpleError> with text: could not find function :::"></p>
<p><code>could not find function ":::"</code>.</p>
<p>My test coverage workflow is using the (fantastic) <a href="https://covr.r-lib.org/">covr</a> package to measure line coverage. Behind the scenes, covr is doing a lot more than just running R CMD check, like my other workflows – covr is actually <a href="https://covr.r-lib.org/articles/how_it_works.html">changing what R runs when it runs your code</a>, in order to measure how many times any given line of code gets called. This is a really neat workflow, but it doesn’t play nicely with our minimal environment here; the new code that covr adds inside of our <code>local()</code> statement depends upon functions that we didn’t (and aren’t going to) provide to the minimal environment, such as <code>:::</code>. Adding a new environment variable to the test coverage workflow, and skipping the tests that run the <code>local()</code> call whenever that environment variable is defined, wound up solving the issue.</p>
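<p>A skip like that might look something like the sketch below, which uses testthat’s <code>skip_if()</code>. The variable name <code>RSI_COVERAGE_RUN</code> and the helper <code>running_under_covr()</code> are hypothetical, chosen for this example, and the test body is a stand-in rather than one of rsi’s real tests:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(testthat)

# Hypothetical variable name: set it only in the coverage workflow.
running_under_covr &lt;- function() {
  nzchar(Sys.getenv("RSI_COVERAGE_RUN"))
}

test_that("expressions evaluate inside a minimal environment", {
  # Skip when covr has rewritten the code, since the rewritten code
  # needs functions the minimal environment does not provide.
  skip_if(running_under_covr(), "minimal environment is incompatible with covr")

  exec_env &lt;- new.env(parent = emptyenv())
  assign("+", `+`, envir = exec_env)
  expect_equal(eval(quote(1 + 2), envir = exec_env), 3)
})</code></pre></div>
</div>
<p>The test still runs under R CMD check, where the variable is unset, so the code path stays covered by the other workflows.</p>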




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>Which rsi will automatically enforce when using <code>get_*_imagery()</code> functions.↩︎</p></li>
<li id="fn2"><p>Or any other environments this one inherits from.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <category>Package development</category>
  <guid>https://mm218.dev/posts/2023-10-27-minimal-environments/</guid>
  <pubDate>Fri, 27 Oct 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-10-27-minimal-environments/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Introducing rsi</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-10-26-rsi/</link>
  <description><![CDATA[ 





<p>I am so, so excited to share that rsi,<sup>1</sup> a new R package for handling common spatial data wrangling tasks, is <a href="https://github.com/Permian-Global-Research/rsi">now available on GitHub</a>! Specifically, rsi handles:</p>
<ul>
<li>Downloading data from STAC APIs (using some of the tricks I wrote about in <a href="https://stacspec.org/en/tutorials/1-download-data-using-r/">the STAC R tutorials</a>),</li>
<li>Computing indices from the <a href="https://github.com/awesome-spectral-indices/awesome-spectral-indices">Awesome Spectral Indices</a> project using that imagery,</li>
<li>And a handful of other spatial data wrangling problems, including <a href="https://permian-global-research.github.io/rsi/reference/stack_rasters.html">merging multiple bands into a single VRT file</a>.</li>
</ul>
<p>Most of my work on the package happened while I was at <a href="https://permianglobal.com/">Permian Global</a>, helping them automate their MMRV pipelines used to make sure their carbon credit projects are actually preserving carbon sinks and have additive benefits over time.<sup>2</sup> I’m really excited and grateful that Permian has agreed to open-source this work.</p>
<p>Let’s take a whirlwind tour through the features in this package! For the purposes of this blog post, let’s download and process imagery for Middlesex County in Massachusetts, USA.<sup>3</sup> We’ll use the tigris package to get the borders for this county, and then reproject them into the Massachusetts State Plane:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">ma_counties <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> tigris<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">counties</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MA"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">progress_bar =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>)</span>
<span id="cb1-2">middlesex <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> ma_counties[ma_counties<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>NAME <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Middlesex"</span>, ]</span>
<span id="cb1-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Reprojecting into the MA state plane, a planar CRS:</span></span>
<span id="cb1-4">middlesex <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_transform</span>(middlesex, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">26986</span>)</span>
<span id="cb1-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_geometry</span>(middlesex))</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-26-rsi/index_files/figure-html/unnamed-chunk-1-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This is the area we’re going to download and process data for.</p>
<p>Specifically, let’s start off by downloading Landsat imagery from Microsoft’s Planetary Computer STAC API! This is pretty straightforward using rsi: use the <code>get_landsat_imagery()</code> function with an area of interest and a timeframe, and you’ll automatically get a cloud-masked composite image of all acquisitions for that spatiotemporal window. Let’s grab all the imagery from September 2023:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(rsi)</span>
<span id="cb2-2">future<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plan</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"multisession"</span>)</span>
<span id="cb2-3"></span>
<span id="cb2-4">middlesex_imagery <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_landsat_imagery</span>(</span>
<span id="cb2-5">  middlesex,</span>
<span id="cb2-6">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2023-09-01"</span>,</span>
<span id="cb2-7">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2023-09-30"</span></span>
<span id="cb2-8">)</span>
<span id="cb2-9">middlesex_imagery</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "shell_liquid_doctor.tif"</code></pre>
</div>
</div>
<p>Note that I’ve used <code>future::plan()</code> here to specify a parallelization strategy, as the data retrieval functions in rsi are all compatible with future<sup>4</sup> and can speed up downloads by running them across multiple parallel workers. These functions also use progressr, so users who want progress reporting can choose how it’s displayed by calling <code>progressr::handlers()</code>.</p>
<p>By default, <code>get_landsat_imagery()</code> will download a composite of all bands available in Landsat 8 and 9 imagery for our timeframe:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(middlesex_imagery) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb4-2">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-26-rsi/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Notice that this composite has been cloud-masked (using the QA pixel band) and rescaled<sup>5</sup> (using the scale and offset specified in metadata provided by the STAC endpoint) automatically. You can control these behaviors via function arguments.</p>
<p>We’re able to download more than just imagery via rsi functions – for instance, we could also grab a DEM for this area from Planetary Computer, using the <code>get_dem()</code> function:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">middlesex_dem <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_dem</span>(middlesex)</span>
<span id="cb5-2">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(middlesex_dem) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb5-3">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-26-rsi/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Under the hood, both of these functions (and their friends, <code>get_sentinel2_imagery()</code> and <code>get_sentinel1_imagery()</code>) are powered by a lower-level <code>get_stac_data()</code> function, which should theoretically work with any imagery provided by any STAC API, anywhere. The higher-level functions simply provide user-friendly defaults to make it faster to get the data you care about.</p>
<p>In addition to these STAC-focused data-downloading functions, rsi also has an interface to the <a href="https://github.com/awesome-spectral-indices/awesome-spectral-indices">Awesome Spectral Indices</a> project, via the <code>spectral_indices()</code> function:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spectral_indices</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb6-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 6 × 9
  application_domain bands     contributor    date_of_addition formula long_name
  &lt;chr&gt;              &lt;list&gt;    &lt;chr&gt;          &lt;chr&gt;            &lt;chr&gt;   &lt;chr&gt;    
1 vegetation         &lt;chr [2]&gt; https://githu… 2021-11-17       (N - 0… Aerosol …
2 vegetation         &lt;chr [2]&gt; https://githu… 2021-11-17       (N - 0… Aerosol …
3 water              &lt;chr [6]&gt; https://githu… 2022-09-22       (B + G… Augmente…
4 vegetation         &lt;chr [2]&gt; https://githu… 2021-09-20       (1 / G… Anthocya…
5 vegetation         &lt;chr [3]&gt; https://githu… 2022-04-08       N * ((… Anthocya…
6 vegetation         &lt;chr [4]&gt; https://githu… 2021-05-11       (N - (… Atmosphe…
# ℹ 3 more variables: platforms &lt;list&gt;, reference &lt;chr&gt;, short_name &lt;chr&gt;</code></pre>
</div>
</div>
<p>This function attempts to grab the newest version of the spectral indices JSON file from the ASI repo, and then stores that data in a cache folder on your computer. If the downloading fails, the package will fall back (with a warning) to use your possibly outdated cache instead; if you don’t have a cache and can’t download the files, the package will instead (with a different warning) resort to using a packaged version of the indices file. This ensures that you’re always getting the latest and greatest version of the ASI list possible, but that the package can still be used without an internet connection.</p>
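<p>To make that fallback order concrete, here is a small base R sketch of the same idea. It is only an illustration under stated assumptions, not the code inside rsi: <code>load_indices()</code> and its arguments are hypothetical, and the files are stored as RDS purely to keep the example short.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper illustrating the fallback order; not rsi's code.
load_indices &lt;- function(download, cache_path, packaged_path) {
  # 1. Try for the newest version, treating any error as "no download".
  fresh &lt;- tryCatch(download(), error = function(e) NULL)
  if (!is.null(fresh)) {
    saveRDS(fresh, cache_path) # refresh the cache for next time
    return(fresh)
  }
  # 2. Fall back to the cache, which may be out of date.
  if (file.exists(cache_path)) {
    warning("Download failed; using cached indices, which may be outdated.")
    return(readRDS(cache_path))
  }
  # 3. Last resort: the copy shipped with the package.
  warning("Download failed and no cache exists; using packaged indices.")
  readRDS(packaged_path)
}

# Simulate being offline with no cache: only the packaged copy is left.
packaged &lt;- tempfile(fileext = ".rds")
saveRDS(data.frame(short_name = "NDVI", stringsAsFactors = FALSE), packaged)
offline &lt;- function() stop("no internet connection")
indices &lt;- suppressWarnings(
  load_indices(offline, tempfile(fileext = ".rds"), packaged)
)</code></pre></div>
</div>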
<p>There are also functions in rsi to sort through the ASI list of indices. The <code>filter_platforms()</code> function can be used to, well, filter the list to only indices that can be calculated from a given platform. For instance, to filter to only indices that can be calculated using data from Landsat’s Operational Land Imager:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter_platforms</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">platforms =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Landsat-OLI"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb8-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 6 × 9
  application_domain bands     contributor    date_of_addition formula long_name
  &lt;chr&gt;              &lt;list&gt;    &lt;chr&gt;          &lt;chr&gt;            &lt;chr&gt;   &lt;chr&gt;    
1 vegetation         &lt;chr [2]&gt; https://githu… 2021-11-17       (N - 0… Aerosol …
2 vegetation         &lt;chr [2]&gt; https://githu… 2021-11-17       (N - 0… Aerosol …
3 water              &lt;chr [6]&gt; https://githu… 2022-09-22       (B + G… Augmente…
4 vegetation         &lt;chr [4]&gt; https://githu… 2021-05-11       (N - (… Atmosphe…
5 vegetation         &lt;chr [4]&gt; https://githu… 2021-05-14       sla * … Adjusted…
6 vegetation         &lt;chr [2]&gt; https://githu… 2022-04-08       (N * (… Advanced…
# ℹ 3 more variables: platforms &lt;list&gt;, reference &lt;chr&gt;, short_name &lt;chr&gt;</code></pre>
</div>
</div>
<p>There’s an equivalent function to filter indices based upon the bands that they require. For instance, we can filter the list to only indices that use the red and blue bands of images:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter_bands</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bands =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"R"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"B"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb10-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 2 × 9
  application_domain bands     contributor    date_of_addition formula long_name
  &lt;chr&gt;              &lt;list&gt;    &lt;chr&gt;          &lt;chr&gt;            &lt;chr&gt;   &lt;chr&gt;    
1 vegetation         &lt;chr [2]&gt; https://githu… 2022-04-08       (R - B… Kawashim…
2 vegetation         &lt;chr [2]&gt; https://githu… 2022-04-08       (R^2.0… Modified…
# ℹ 3 more variables: platforms &lt;list&gt;, reference &lt;chr&gt;, short_name &lt;chr&gt;</code></pre>
</div>
</div>
<p>Arguments to these functions let you control whether you’re looking for indices that match <em>all</em> platforms and bands you’ve specified, or <em>any</em> of them.</p>
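<p>As a rough sketch of what that distinction means, here’s the same idea in plain base R on a made-up table. This is not rsi’s internals, and it assumes that “all” means every band an index needs is among the bands you supplied, while “any” means at least one of them is; <code>toy_indices</code> and its index names are invented for the example.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># A made-up stand-in for the indices table, with a list column of bands:
toy_indices &lt;- data.frame(short_name = c("index_a", "index_b", "index_c"))
toy_indices$bands &lt;- list(c("N", "R"), c("R", "B"), c("G", "S1"))

available &lt;- c("R", "B")

# "all" (assumed meaning): every band the index needs is available
needs_only_available &lt;- vapply(
  toy_indices$bands,
  function(bands) all(bands %in% available),
  logical(1)
)
# "any" (assumed meaning): at least one band the index needs is available
needs_some_available &lt;- vapply(
  toy_indices$bands,
  function(bands) any(bands %in% available),
  logical(1)
)

toy_indices$short_name[needs_only_available]
toy_indices$short_name[needs_some_available]</code></pre></div>
</div>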
<p>But rsi doesn’t simply make these formulas available in R; it also helps you compute these indices from imagery, via the <code>calculate_indices()</code> function. This function takes your imagery and a subset of <code>spectral_indices()</code> as arguments, and creates a raster containing all of those indices as an output. We can use <code>filter_bands()</code> to quickly get a list of the indices we can compute from our Landsat imagery, and then <code>calculate_indices()</code> to compute all 128 of those indices from our images:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">middlesex_indices <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb12-2">  middlesex_imagery,</span>
<span id="cb12-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter_bands</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">bands =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(middlesex_imagery))),</span>
<span id="cb12-4">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"middlesex_indices.tif"</span></span>
<span id="cb12-5">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>
|---------|---------|---------|---------|
=========================================
                                          </code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(middlesex_indices) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb14-2">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-26-rsi/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Note that <code>calculate_indices()</code> is evaluating the formulas in the <code>formula</code> column of the spectral indices data frame as if they were code.<sup>6</sup> These formulas are evaluated inside of a very limited environment that doesn’t have access to the global environment or most R fixtures, which does <em>reduce</em> the amount of harm malicious code could do:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1">evil_index <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spectral_indices</span>()[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, ]</span>
<span id="cb15-2">evil_index<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>formula <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"system('echo OHNO')"</span></span>
<span id="cb15-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">try</span>(</span>
<span id="cb15-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">calculate_indices</span>(</span>
<span id="cb15-5">    middlesex_imagery,</span>
<span id="cb15-6">    evil_index,</span>
<span id="cb15-7">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"test.tif"</span></span>
<span id="cb15-8">  )</span>
<span id="cb15-9">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Error in system("echo OHNO") : could not find function "system"</code></pre>
</div>
</div>
<p>But it’s worth scanning your formulas before running <code>calculate_indices()</code>, just to make sure you aren’t going to be accidentally running something surprising!</p>
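<p>For a sense of how this kind of restricted evaluation can work, here’s a minimal sketch in base R. It is an illustration of the general technique rather than rsi’s actual code: the band values are made up, and <code>safe_env</code> is a name chosen for the example.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">index_formula &lt;- "(N - R) / (N + R)"

# Only expose the operators the formula needs. The parent is emptyenv(),
# which contains no functions at all, so nothing else can be found:
safe_env &lt;- list2env(
  list(`+` = `+`, `-` = `-`, `/` = `/`, `(` = `(`),
  parent = emptyenv()
)
# Add some made-up band values for the formula to work on:
list2env(list(N = c(0.5, 0.6), R = c(0.1, 0.2)), envir = safe_env)

# The formula string becomes a call, which is evaluated inside safe_env:
index_values &lt;- eval(str2lang(index_formula), safe_env)

# Anything that wasn't explicitly exposed can't be found:
evil &lt;- try(eval(str2lang("system('echo OHNO')"), safe_env), silent = TRUE)
inherits(evil, "try-error")</code></pre></div>
</div>
<p>An allow-list like this only limits which functions a formula can call; it’s not a substitute for actually reading the formulas you’re about to run.</p>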
<p>Last but not least, rsi also provides a way to combine disparate data sets covering the same geographic region into a single VRT, quickly creating a file that you can treat as a single raster without taking up much additional storage space. This is a great way to create predictor bricks from your indices and downloaded data<sup>7</sup> which you can then use for model fitting and prediction:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb17-1">combined_layers <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">stack_rasters</span>(</span>
<span id="cb17-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(middlesex_imagery, middlesex_dem, middlesex_indices),</span>
<span id="cb17-3">  <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"middlesex.vrt"</span></span>
<span id="cb17-4">)</span>
<span id="cb17-5"></span>
<span id="cb17-6">terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rast</span>(combined_layers) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb17-7">  terra<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-10-26-rsi/index_files/figure-html/unnamed-chunk-10-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>I’m <em>extremely</em> excited for this package to be out in the open, and for people to start using it. If you find the package useful or interesting, <a href="https://github.com/Permian-Global-Research/rsi/">drop us a star on GitHub</a> – and if you have any questions/comments/concerns about how it does things, please open an issue or a PR! We’re planning a CRAN release in the not-too-distant future, and would love to incorporate any feedback we get into that first released version.</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>The name is a nonsense acronym. Initially the package was going to be “rsi: An R interface to the Rsome Spectral Indices project”, but the scope very quickly outgrew that; now it’s a convenient short package name that’s not taken on CRAN, and I’m backronyming as many additional meanings into those three letters as I can make fit.↩︎</p></li>
<li id="fn2"><p>I am not currently employed by Permian, and nothing I say here or elsewhere reflects their opinions! That said, Permian is a copyright holder and funder of rsi, just as Posit is of waywiser, given their support of the initial development of the package.↩︎</p></li>
<li id="fn3"><p>I was originally going to use Boston’s Suffolk county, but Suffolk county’s borders are <em>ludicrous</em>, because they include not just Boston’s mainland but also Boston’s islands, meaning the county is a normal-enough shape with a huge rectangle extending into the Atlantic off to the East.↩︎</p></li>
<li id="fn4"><p>Via future.apply, to be specific.↩︎</p></li>
<li id="fn5"><p>Explaining <a href="https://www.mm218.dev/posts/2023-08-24-landsat-scaling/">this blog post from a few months ago</a>.↩︎</p></li>
<li id="fn6"><p>Explaining <a href="https://www.mm218.dev/posts/2023-10-24-fun-r-funcs/#str2lang">this blog post from two days ago</a>.↩︎</p></li>
<li id="fn7"><p>Not that you should be regressing using raw imagery bands, but this lets you combine a DEM and other computed metrics with calculated indices.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <category>Spatial</category>
  <category>geospatial data</category>
  <category>R packages</category>
  <guid>https://mm218.dev/posts/2023-10-26-rsi/</guid>
  <pubDate>Thu, 26 Oct 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-10-26-rsi/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Three fun R functions</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-10-24-fun-r-funcs/</link>
  <description><![CDATA[ 





<p>Inspired by <a href="https://masalmon.eu/2023/10/20/three-neat-functions/">Maëlle</a> who was inspired by <a href="https://yihui.org/en/2023/10/three-functions/">Yihui</a> who was inspired by <a href="https://masalmon.eu/2023/09/29/three-functions/">Maëlle</a> (who has <a href="https://masalmon.eu/2023/08/31/three-shorten/">a whole</a> <a href="https://masalmon.eu/2023/06/06/basic-patterns/">series</a> <a href="https://masalmon.eu/2023/08/30/three-r-functions/">about</a> <a href="https://masalmon.eu/2023/07/24/basic-notions/">this</a>), I wanted to share three useful base R functions that I think maybe don’t get enough love. And inspired by <a href="https://masalmon.eu/2023/10/20/three-neat-functions/">Maëlle again</a>, my list here is actually four functions.</p>
<section id="sweep" class="level2">
<h2 class="anchored" data-anchor-id="sweep"><code>sweep()</code></h2>
<p>If you ever need to do math with matrices, then <code>sweep()</code> is going to be your best friend. Say for instance we want to center and scale each column in a matrix. This is a pretty straightforward operation – we need to calculate the mean and standard deviation for each column, subtract the column mean from each observation, and then divide those by the corresponding standard deviation.</p>
<p>We can use <code>apply</code> to get our means and standard deviations:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Generate some fake data in a 10x10 matrix:</span></span>
<span id="cb1-2">x <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rnorm</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nrow =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb1-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Calculate one mean and sd for each column of our matrix:</span></span>
<span id="cb1-4">col_means <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">apply</span>(x, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, mean)</span>
<span id="cb1-5">col_sds <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">apply</span>(x, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, sd)</span></code></pre></div></div>
</div>
<p>The subtraction and division are a bit less straightforward. R’s base math operators will attempt to do element-wise operations, recycling our vector down each column of the matrix, so that each element of the vector lines up with a row rather than a column. That’s not what we want:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all.equal</span>(</span>
<span id="cb2-2">  (x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> col_means) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> col_sds,</span>
<span id="cb2-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale</span>(x)</span>
<span id="cb2-4">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "Attributes: &lt; Length mismatch: comparison on first 1 components &gt;"
[2] "Mean relative difference: 0.360556"                               </code></pre>
</div>
</div>
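<p>A tiny made-up example makes it easier to see what’s going wrong here. With a 2x3 matrix and a length-2 vector, each element of the vector ends up paired with a row, not a column:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">m &lt;- matrix(1:6, nrow = 2)
m

# The vector is recycled down each column, so its first element is
# subtracted from every value in row 1, and its second from row 2:
recycled &lt;- m - c(1, 2)
recycled</code></pre></div>
</div>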
<p>We could replicate our vector ourselves, in order to take advantage of these element-wise operations:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all.equal</span>(</span>
<span id="cb4-2">  ((x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rep</span>(col_means, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">byrow =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> </span>
<span id="cb4-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rep</span>(col_sds, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">byrow =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.vector</span>(),</span>
<span id="cb4-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale</span>(x) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.vector</span>()</span>
<span id="cb4-5">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
<p>But that’s silly, especially if we were working with more observations.</p>
<p>Better instead is to use <code>sweep()</code> to perform some operation between each element of our vector and the corresponding column of the matrix:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Take every value in our matrix, and subtract its corresponding column mean:</span></span>
<span id="cb6-2">centered <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sweep</span>(</span>
<span id="cb6-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> x, </span>
<span id="cb6-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">MARGIN =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># just like in apply()</span></span>
<span id="cb6-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">STATS =</span> col_means, </span>
<span id="cb6-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">FUN =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"-"</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># "-" is the default argument -- we don't NEED to provide it here</span></span>
<span id="cb6-7">)</span></code></pre></div></div>
</div>
<p>And we can similarly use <code>sweep()</code> to divide each column by its corresponding standard deviation, finishing up our centering and scaling:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Divide each value by its corresponding column sd:</span></span>
<span id="cb7-2">centered_and_scaled <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sweep</span>(centered, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, col_sds, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"/"</span>)</span>
<span id="cb7-3"></span>
<span id="cb7-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Works out identically to the built-in scale function:</span></span>
<span id="cb7-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all.equal</span>(</span>
<span id="cb7-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.vector</span>(centered_and_scaled),</span>
<span id="cb7-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.vector</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale</span>(x))</span>
<span id="cb7-8">)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
<p>This is the main way I use <code>sweep()</code>, but there’s no requirement you use it for math – it works just as well with non-mathematical functions or non-numeric matrices:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">letter_mat <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rep</span>(letters[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>)</span>
<span id="cb9-2">letter_mat</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>     [,1] [,2] [,3] [,4] [,5]
[1,] "a"  "a"  "a"  "a"  "a" 
[2,] "b"  "b"  "b"  "b"  "b" 
[3,] "c"  "c"  "c"  "c"  "c" 
[4,] "d"  "d"  "d"  "d"  "d" 
[5,] "e"  "e"  "e"  "e"  "e" </code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sweep</span>(letter_mat, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, LETTERS[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>], paste0)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code> [1] "aA" "bA" "cA" "dA" "eA" "aB" "bB" "cB" "dB" "eB" "aC" "bC" "cC" "dC" "eC"
[16] "aD" "bD" "cD" "dD" "eD" "aE" "bE" "cE" "dE" "eE"</code></pre>
</div>
</div>
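<p>The same approach works across rows if you set <code>MARGIN = 1</code>. As a small sketch with made-up data, here’s one way to turn each row of a matrix of counts into proportions:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">counts &lt;- matrix(c(1, 3, 2, 2, 5, 5), nrow = 3, byrow = TRUE)

# MARGIN = 1 lines each element of STATS up with a row instead of a column:
proportions &lt;- sweep(counts, 1, rowSums(counts), "/")

# Each row should now sum to 1:
rowSums(proportions)</code></pre></div>
</div>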
</section>
<section id="reformulate-and-df2formula" class="level2">
<h2 class="anchored" data-anchor-id="reformulate-and-df2formula"><code>reformulate()</code> and <code>DF2formula()</code></h2>
<p>The <code>reformulate()</code> function is a lifesaver if you’re trying to write long or complicated formulas, or multiple formulas generated by some other logic in your code.</p>
<p>The function is pretty straightforward. If you’re trying to make a formula <code>y ~ x + z</code>, provide your predictors as the first argument and your outcome as the second:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">reformulate</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"x"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"z"</span>), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>y ~ x + z</code></pre>
</div>
</div>
<p>The nice thing is that <code>reformulate</code> accepts vectors as inputs, making it easy to construct a vector of predictors and automatically turn them into a formula:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">reformulate</span>(letters, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"outcome"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>outcome ~ a + b + c + d + e + f + g + h + i + j + k + l + m + 
    n + o + p + q + r + s + t + u + v + w + x + y + z</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb17-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">reformulate</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(Orange), <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"age"</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>age ~ Tree + age + circumference</code></pre>
</div>
</div>
<p>And in particular, this is an excellent alternative to dropping a few columns in order to use <code>outcome ~ .</code> – instead, you can use <code>setdiff()</code> to exclude those columns from your formula:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb19-1">outcome_variable <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"age"</span></span>
<span id="cb19-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">reformulate</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">setdiff</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">names</span>(Orange), outcome_variable), outcome_variable)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>age ~ Tree + circumference</code></pre>
</div>
</div>
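<p>Because the predictors are just a character vector, it’s also easy to build several formulas in a loop. Here’s a minimal sketch that fits one model per set of predictors; the particular predictor sets are arbitrary and only there for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r">predictor_sets &lt;- list("age", c("age", "Tree"))

# Build a formula from each set of predictors, then fit a model with it:
models &lt;- lapply(predictor_sets, function(predictors) {
  lm(reformulate(predictors, response = "circumference"), data = Orange)
})

lapply(models, formula)</code></pre></div>
</div>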
<p>Relatedly, the function <code>DF2formula()</code> will automatically turn the column names from a data frame into a formula. The first column will become the outcome variable, and the rest will be used as predictors:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb21" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb21-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">DF2formula</span>(Orange)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Tree ~ age + circumference</code></pre>
</div>
</div>
<p>To change which column is used as the outcome variable, reorder the columns in your data frame:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb23-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">DF2formula</span>(Orange[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>circumference ~ age + Tree</code></pre>
</div>
</div>
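<p>If you’d rather pick the outcome by name than by position, one option is to move that column to the front before calling <code>DF2formula()</code>. This is only a minimal sketch: <code>outcome_first()</code> is a hypothetical helper written for this example, not a base R function.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper: move the named outcome column to the front,
# keeping the remaining columns in their original order
outcome_first &lt;- function(data, outcome) {
  data[c(outcome, setdiff(names(data), outcome))]
}

# The first column becomes the left-hand side of the formula
circumference_formula &lt;- DF2formula(outcome_first(Orange, "circumference"))
circumference_formula</code></pre></div>
</div>
<p>The remaining columns keep their original order on the right-hand side, so this avoids hard-coding column positions.</p>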
</section>
<section id="str2lang" class="level2">
<h2 class="anchored" data-anchor-id="str2lang"><code>str2lang()</code></h2>
<p>Shockingly enough, the <code>str2lang()</code> function turns a string into a language object:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb25" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb25-1">growth_rate <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"circumference / age"</span></span>
<span id="cb25-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str2lang</span>(growth_rate)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>circumference/age</code></pre>
</div>
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb27" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb27-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">class</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str2lang</span>(growth_rate))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] "call"</code></pre>
</div>
</div>
<p>Wooooo!</p>
<p>I think that, to most people, this does not sound immediately useful.<sup>1</sup> But the idea that your code can turn plain text into code at runtime is pretty powerful, and some of the most R-esque nonsense that R has to offer.</p>
<p>For instance, we can use <code>eval()</code> to actually execute the call created by <code>str2lang()</code> in our global environment:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb29" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb29-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">eval</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str2lang</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2 + 2"</span>))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 4</code></pre>
</div>
</div>
<p>And that string can do anything that regular R code can do – assign variables, manage connections, or run any other procedure:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb31" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb31-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">eval</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str2lang</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"x &lt;- 3"</span>))</span>
<span id="cb31-2">x</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 3</code></pre>
</div>
</div>
<p>We can also use this with <code>with()</code> or <code>local()</code> to execute our code inside of other environments. For instance, if we want to calculate our <code>growth_rate</code> from earlier, we can run that code with the <code>Orange</code> data frame:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb33" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb33-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">with</span>(Orange, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">eval</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str2lang</span>(growth_rate)))</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code> [1] 0.25423729 0.11983471 0.13102410 0.11454183 0.09748172 0.10349854
 [7] 0.09165613 0.27966102 0.14256198 0.16716867 0.15537849 0.13972380
[13] 0.14795918 0.12831858 0.25423729 0.10537190 0.11295181 0.10756972
[19] 0.09341998 0.10131195 0.08849558 0.27118644 0.12809917 0.16867470
[25] 0.16633466 0.14541024 0.15233236 0.13527181 0.25423729 0.10123967
[31] 0.12198795 0.12450199 0.11535337 0.12682216 0.11188369</code></pre>
</div>
</div>
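<p>The <code>with()</code> wrapper is one way to do this, but <code>eval()</code> also takes an <code>envir</code> argument, which can be a data frame. A minimal sketch of the same calculation, redefining <code>growth_rate</code> so the chunk stands on its own:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">growth_rate &lt;- "circumference / age"

# Evaluate the call, looking up variables in the columns of Orange first
direct &lt;- eval(str2lang(growth_rate), envir = Orange)

# Compare against the with() version
identical(direct, with(Orange, eval(str2lang(growth_rate))))</code></pre></div>
</div>
<p>Passing <code>envir</code> directly can be handy inside functions, where the data frame arrives as an argument.</p>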
<p>This can be a powerful way to “import” code from other sources, for instance if you have a CSV of equations you want to run against a data frame. You want to be careful when using this with untrusted inputs, of course – if your input includes a call to <code>system()</code>, it might wind up wrecking your computer!</p>
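<p>Here is a rough sketch of that idea, with a small guard against unexpected function calls. Everything named here is an assumption for the sake of illustration: the <code>equations</code> data frame stands in for a CSV you might load with <code>read.csv()</code>, and <code>run_equation()</code> and <code>allowed_functions</code> are hypothetical. The allow-list is a basic precaution, not a real sandbox.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical table of equations, standing in for a CSV on disk
equations &lt;- data.frame(
  name = c("growth_rate", "log_circumference"),
  equation = c("circumference / age", "log(circumference)")
)

# Functions and operators we are willing to run (an assumption; adjust to taste)
allowed_functions &lt;- c("+", "-", "*", "/", "(", "log", "sqrt")

run_equation &lt;- function(text, data) {
  parsed &lt;- str2lang(text)
  # all.names() lists every function and variable name used in the parsed code;
  # anything that is neither an approved function nor a column is rejected
  unknown &lt;- setdiff(all.names(parsed), c(allowed_functions, names(data)))
  if (length(unknown) &gt; 0) {
    stop("Refusing to run equation using: ", paste(unknown, collapse = ", "))
  }
  # Look up variables in the data frame, and functions in base R only
  eval(parsed, envir = data, enclos = baseenv())
}

results &lt;- lapply(equations$equation, run_equation, data = Orange)
names(results) &lt;- equations$name</code></pre></div>
</div>
<p>With this guard, an equation like <code>system("echo hi")</code> is stopped before it is evaluated, because <code>system</code> is neither in the allow-list nor a column of the data frame.</p>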


</section>


<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>I think, to most people, this barely sounds like English.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <guid>https://mm218.dev/posts/2023-10-24-fun-r-funcs/</guid>
  <pubDate>Tue, 24 Oct 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-10-24-fun-r-funcs/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Cloud-Native Geospatial If You Don’t Speak Snake</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-09-20-cngf/</link>
  <description><![CDATA[ 





<p>I’m over in the <a href="https://cloudnativegeo.org/blog/2023/09/cloud-native-geospatial-if-you-dont-speak-snake">Cloud-Native Geospatial Foundation’s blog today</a> with a post called “Cloud-Native Geospatial If You Don’t Speak Snake”. This post comes from an UnConference session at the 2023 ESIP July Meeting, organized by <a href="https://github.com/ashiklom">Alexey Shiklomanov</a>, focusing on what tools and what gaps exist in the cloud-native geospatial workflow for non-Python users. It was an absolutely fantastic UnConference session, and I’m hoping that my summary of our takeaways is as useful for others as it was for me.</p>
<p>One challenge in the session – and one that I don’t get into in the post – is that “cloud-native” means a lot of different things to different folks! The CNGF themselves have a pretty straightforward definition:</p>
<blockquote class="blockquote">
<p>Cloud-Native data formats are structured to be efficiently retrieved from cloud object storage services which are designed to serve large volumes of data using generic RESTful / HTTP data transfer protocols.</p>
</blockquote>
<p>But our UnConference session was interested in a much broader definition, one that was a bit closer to the CNCF definition of the term:</p>
<blockquote class="blockquote">
<p>Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.</p>
</blockquote>
<p>My post focuses a bit more on tools and gaps relating to the CNGF definition – it’s on their blog, after all! – but our working group was interested in API specifications, serverless runtimes, and all sorts of other topics that I think haven’t seen the same amount of standardization as the data format and retrieval side of geospatial workflows.<sup>1</sup> There’s a lot of activity around geospatial workflows right now, and I’m excited to see what the future has in store!</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>Most of my data storage and retrieval work boils down to “use COGs, use GDAL’s virtual filesystem interface”. I <em>wish</em> my analysis, runtime, and codebase setups could be boiled down to that level.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Spatial</category>
  <category>geospatial data</category>
  <guid>https://mm218.dev/posts/2023-09-20-cngf/</guid>
  <pubDate>Wed, 20 Sep 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-09-20-cngf/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Pre-allocating vectors is for nerds</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-08-29-allocations/</link>
  <description><![CDATA[ 





<p>The second circle of R hell, in <a href="https://www.burns-stat.com/pages/Tutor/R_inferno.pdf">Patrick Burns’ seminal book The R Inferno</a>, is titled “Growing Objects”. This refers to a common antipattern for R users, usually among the first things taught when dealing with iteration: it is extremely inefficient to grow a vector using <code>c()</code>, like so:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">vector_c <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(n) {</span>
<span id="cb1-2">  out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>()</span>
<span id="cb1-3">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (i <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>n) {</span>
<span id="cb1-4">    out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(out, i)</span>
<span id="cb1-5">  }</span>
<span id="cb1-6">  out</span>
<span id="cb1-7">}</span></code></pre></div></div>
</div>
<p>Instead, Burns says, it is better to pre-allocate our vector <code>out</code>, and assign our function’s output to a specific position in <code>out</code> using either <code>[</code> or <code>[[</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">vector_prealloc_one_bracket <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(n) {</span>
<span id="cb2-2">  out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"numeric"</span>, n)</span>
<span id="cb2-3">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (i <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>n) {</span>
<span id="cb2-4">    out[i] <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> i</span>
<span id="cb2-5">  }</span>
<span id="cb2-6">  out</span>
<span id="cb2-7">}</span>
<span id="cb2-8"></span>
<span id="cb2-9">vector_prealloc_two_bracket <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(n) {</span>
<span id="cb2-10">  out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"numeric"</span>, n)</span>
<span id="cb2-11">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (i <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>n) {</span>
<span id="cb2-12">    out[[i]] <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> i</span>
<span id="cb2-13">  }</span>
<span id="cb2-14">  out</span>
<span id="cb2-15">}</span></code></pre></div></div>
</div>
<p>Of course, it would be better yet to avoid our loop entirely, and simply create our final object using the colon operator:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">colon_operator <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(n) {</span>
<span id="cb3-2">  <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>n</span>
<span id="cb3-3">}</span></code></pre></div></div>
</div>
<p>But that’s beside the point right now.</p>
<p>This advice was originally written in 2011, but is even more important today. In Burns’ book, assigning into a pre-allocated vector is roughly 7 times faster than growing one with <code>c()</code> when <code>n</code> is 10,000; on my computer today, it is roughly 200 times faster:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">n <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span></span>
<span id="cb4-2">bench<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mark</span>(</span>
<span id="cb4-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">c =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_c</span>(n),</span>
<span id="cb4-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">one_bracket =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_prealloc_one_bracket</span>(n),</span>
<span id="cb4-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">two_brackets =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_prealloc_two_bracket</span>(n),</span>
<span id="cb4-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colon =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">colon_operator</span>(n),</span>
<span id="cb4-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">filter_gc =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span></span>
<span id="cb4-8">)[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"expression"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"median"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"itr/sec"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mem_alloc"</span>)]</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 4 × 4
  expression     median `itr/sec` mem_alloc
  &lt;bch:expr&gt;   &lt;bch:tm&gt;     &lt;dbl&gt; &lt;bch:byt&gt;
1 c                51ms      19.3   191.2MB
2 one_bracket     277µs    3548.     99.1KB
3 two_brackets    276µs    3538.     96.7KB
4 colon           361ns 2124339.         0B</code></pre>
</div>
</div>
<p>But what if <code>n</code> is unknowable? Well, to quote Burns:</p>
<blockquote class="blockquote">
<p>Often a reasonable upper bound on the size of the final object is known. If so, then create the object with that size and then remove the extra values at the end. If the final size is a mystery, then you can still follow the same scheme, but allow for periodic growth of the object.</p>
</blockquote>
<p>This is still probably a decent approach: over-allocate and trim down, or allocate in chunks and only grow when those chunks are exhausted.</p>
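<p>The “allocate in chunks” idea might look something like the sketch below. This is one possible reading of Burns’ advice rather than code from the book: <code>vector_chunked()</code> and <code>make_counter()</code> are hypothetical helpers, and doubling the allocation each time is an assumption about how to size the chunks.</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Hypothetical helper: collect values from next_value() until it returns NULL,
# growing the output in chunks instead of one element at a time
vector_chunked &lt;- function(next_value, initial_size = 16) {
  out &lt;- vector("numeric", max(initial_size, 1))
  used &lt;- 0
  repeat {
    value &lt;- next_value()
    if (is.null(value)) break
    if (used == length(out)) {
      # Out of room: double the allocation (new slots are filled with NA)
      length(out) &lt;- 2 * length(out)
    }
    used &lt;- used + 1
    out[[used]] &lt;- value
  }
  # Trim the slots we never filled
  out[seq_len(used)]
}

# Hypothetical source of values whose final length is not known to the caller
make_counter &lt;- function(limit) {
  i &lt;- 0
  function() {
    if (i &gt;= limit) return(NULL)
    i &lt;&lt;- i + 1
    i
  }
}

chunked &lt;- vector_chunked(make_counter(100))</code></pre></div>
</div>
<p>Doubling means the vector is only reallocated a handful of times no matter how long it ends up, at the cost of temporarily holding up to twice the memory needed.</p>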
<p>Or… perhaps we might try growing a vector with <code>[</code> or <code>[[</code>, rather than with <code>c()</code>? To anyone raised on R traditions, this might seem like a code smell:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1">vector_unalloc_one_bracket <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(n) {</span>
<span id="cb6-2">  out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>()</span>
<span id="cb6-3">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (i <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>n) {</span>
<span id="cb6-4">    out[i] <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> i</span>
<span id="cb6-5">  }</span>
<span id="cb6-6">  out</span>
<span id="cb6-7">}</span>
<span id="cb6-8"></span>
<span id="cb6-9">vector_unalloc_two_bracket <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(n) {</span>
<span id="cb6-10">  out <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>()</span>
<span id="cb6-11">  <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> (i <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>n) {</span>
<span id="cb6-12">    out[[i]] <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> i</span>
<span id="cb6-13">  }</span>
<span id="cb6-14">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unlist</span>(out)</span>
<span id="cb6-15">}</span></code></pre></div></div>
</div>
<p>But if we test it out:<sup>1</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">bench<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mark</span>(</span>
<span id="cb7-2">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">c =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_c</span>(n),</span>
<span id="cb7-3">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prealloc_one_bracket =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_prealloc_one_bracket</span>(n),</span>
<span id="cb7-4">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">unalloc_one_bracket =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_unalloc_one_bracket</span>(n),</span>
<span id="cb7-5">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">unalloc_two_brackets =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_unalloc_two_bracket</span>(n),</span>
<span id="cb7-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">filter_gc =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span></span>
<span id="cb7-7">)[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"expression"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"median"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"itr/sec"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mem_alloc"</span>)]</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 4 × 4
  expression             median `itr/sec` mem_alloc
  &lt;bch:expr&gt;           &lt;bch:tm&gt;     &lt;dbl&gt; &lt;bch:byt&gt;
1 c                     54.02ms      16.6  191.23MB
2 prealloc_one_bracket 285.52µs    3428.    78.17KB
3 unalloc_one_bracket    1.24ms     710.   871.73KB
4 unalloc_two_brackets   2.76ms     337.     1.72MB</code></pre>
</div>
</div>
<p>Growing a vector via <code>[</code> is still notably slower than assigning values to a pre-allocated vector; here it’s roughly 4 times slower. But that still means it’s more than 40 times faster than growing a vector via <code>c()</code>, and it allocates ~200 times less memory to do so. Growing a vector via <code>[[</code> isn’t quite as efficient – taking roughly twice the time and memory as <code>[</code> here – but still blows <code>c()</code> out of the water.</p>
<p>That’s not too shabby, for a code smell. How does a method like <code>vapply()</code> compare?</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">vapply_lambda <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(n) {</span>
<span id="cb9-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>n, \(i) i, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">numeric</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb9-3">}</span>
<span id="cb9-4"></span>
<span id="cb9-5">bench<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mark</span>(</span>
<span id="cb9-6">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">c =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_c</span>(n),</span>
<span id="cb9-7">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prealloc_one_bracket =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_prealloc_one_bracket</span>(n),</span>
<span id="cb9-8">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">unalloc_one_bracket =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_unalloc_one_bracket</span>(n),</span>
<span id="cb9-9">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">unalloc_two_brackets =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_unalloc_two_bracket</span>(n),</span>
<span id="cb9-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">vapply =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply_lambda</span>(n),</span>
<span id="cb9-11">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">filter_gc =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span></span>
<span id="cb9-12">)[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"expression"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"median"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"itr/sec"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"mem_alloc"</span>)]</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 5 × 4
  expression             median `itr/sec` mem_alloc
  &lt;bch:expr&gt;           &lt;bch:tm&gt;     &lt;dbl&gt; &lt;bch:byt&gt;
1 c                     50.87ms      19.5   191.2MB
2 prealloc_one_bracket 279.79µs    3501.     78.2KB
3 unalloc_one_bracket    1.18ms     649.      853KB
4 unalloc_two_brackets   2.69ms     345.      1.7MB
5 vapply                 3.41ms     272.     78.2KB</code></pre>
</div>
</div>
<p><code>vapply()</code> uses as little memory as our pre-allocation approaches, but is slower than either of our un-allocated methods.<sup>2</sup></p>
<p>It’s worth emphasizing that the differences between these methods are <em>microscopic</em> compared to the difference between them and <code>c()</code> for growing vectors:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1">benchmarks <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> bench<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">press</span>(</span>
<span id="cb11-2">  bench<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mark</span>(</span>
<span id="cb11-3">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">c =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_c</span>(n),</span>
<span id="cb11-4">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prealloc_one_bracket =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_prealloc_one_bracket</span>(n),</span>
<span id="cb11-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">unalloc_one_bracket =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_unalloc_one_bracket</span>(n),</span>
<span id="cb11-6">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">unalloc_two_brackets =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vector_unalloc_two_bracket</span>(n),</span>
<span id="cb11-7">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">vapply =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply_lambda</span>(n),</span>
<span id="cb11-8">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">filter_gc =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span></span>
<span id="cb11-9">  ),</span>
<span id="cb11-10">  <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">n =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100000</span>)</span>
<span id="cb11-11">)</span>
<span id="cb11-12"></span>
<span id="cb11-13"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb11-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(benchmarks, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(n, median, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.character</span>(expression))) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb11-15">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb11-16">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_minimal</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb11-17">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Median execution time (s)"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-29-allocations/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>But as far as execution speed goes, well, maybe growing objects in general isn’t worthy of its own circle of hell anymore:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">benchmarks[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.character</span>(benchmarks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>expression) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"c"</span>, ] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb12-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(n, median, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.character</span>(expression))) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb12-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb12-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_minimal</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb12-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Median execution time (s)"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-29-allocations/index_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Though of course, <code>vapply()</code> and the pre-allocated methods still win out in terms of memory allocation:<sup>3</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1">benchmarks[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.character</span>(benchmarks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>expression) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"c"</span>, ] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb13-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(n, mem_alloc, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.character</span>(expression))) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb13-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb13-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_minimal</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb13-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">labs</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Memory allocation (bytes)"</span>)</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-29-allocations/index_files/figure-html/unnamed-chunk-10-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>So: pre-allocate your vectors when you’re able. But maybe it’s fine to grow an object every once in a while, as a treat. It probably won’t get you sent to hell.</p>
<p>I have no idea when things changed to make growing vectors via <code>[</code> so much more efficient now than in 2011. If you know any more details, please let me know in the comments/<a href="https://fosstodon.org/@MikeMahoney218">Mastodon</a>/<a href="https://bsky.app/profile/mikemahoney218.com">Bluesky</a>.</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>I dropped <code>prealloc_two_brackets</code> from the benchmarks because it was performing ~the same as the one-bracket alternative.↩︎</p></li>
<li id="fn2"><p>Usual disclaimer that this is probably not a type of slowness that matters for your code, that you should look into moving computation to C++/Rust if you care about a few milliseconds of execution time, and that the real benefits of *apply functions come from readability and their potential for parallelization, not speed.↩︎</p></li>
<li id="fn3"><p>The pre-allocated line is hidden by the <code>vapply()</code> line; they’re practically identical, and possibly also literally identical.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <category>Package development</category>
  <guid>https://mm218.dev/posts/2023-08-29-allocations/</guid>
  <pubDate>Tue, 29 Aug 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-08-29-allocations/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>Yes, you can rescale Landsat images after compositing them</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-08-24-landsat-scaling/</link>
  <description><![CDATA[ 





<p>Landsat data are distributed as unsigned 16-bit images, which need to be rescaled to recover the actual band values. The <a href="https://www.usgs.gov/faqs/how-do-i-use-a-scale-factor-landsat-level-2-science-products">rescaling formulas are dependent upon the band type and collection used</a>, but for current Collection 2 data boil down to two equations:</p>
<ul>
<li>For surface reflectance data, the formula is <img src="https://latex.codecogs.com/png.latex?X%20*%200.0000275%20-%200.2"> (where <img src="https://latex.codecogs.com/png.latex?X"> is the scaled band value)</li>
<li>For surface temperature data, the formula is <img src="https://latex.codecogs.com/png.latex?X%20*%200.00341802%20+%20149.0">.</li>
</ul>
<p>For my use-cases, I’m only rarely looking to download (and rescale) a single Landsat image. More often, I want to take all the Landsat images from a given timeframe (for instance, the growing season in the area I care about) and combine them into a composite image, taking the mean or median pixel value for that time period.</p>
<p>And every single time I go to do this, I need to figure out for the umpteenth time whether I can composite the images and <em>then</em> rescale them, or whether I need to rescale each individual image before making my composite. This is analytically solvable, and I think it’s pretty straightforward to solve – and, to spoil the rest of this post, the answer is that it doesn’t matter when you rescale. But I can never remember that, and I never trust my algebra when I try to prove it.</p>
<p>But this is an easy thing to simulate – just run a bunch of replications of compositing some number of 16-bit values, rescaling either before or after making the composite, and test for equality. That’s doable in a few lines of R:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1">landsat_rescale <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(x) x <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0000275</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span></span>
<span id="cb1-2"></span>
<span id="cb1-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vapply</span>(</span>
<span id="cb1-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(mean, median),</span>
<span id="cb1-5">  \(f) {</span>
<span id="cb1-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">replicate</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>, {</span>
<span id="cb1-7">      values <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">replicate</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sample.int</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">65455</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb1-8">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all.equal</span>(</span>
<span id="cb1-9">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">landsat_rescale</span>(values)),</span>
<span id="cb1-10">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">landsat_rescale</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f</span>(values))</span>
<span id="cb1-11">      )</span>
<span id="cb1-12">    }) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-13">      <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all</span>()</span>
<span id="cb1-14">  },</span>
<span id="cb1-15">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">logical</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-16">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb1-17">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">all</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] TRUE</code></pre>
</div>
</div>
<p>It’s fine! You can rescale before or after compositing, whether you’re using a mean or a median composite. Do whatever is easiest for your workflow. Go in peace.</p>
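<p>The reason this works is that both rescaling formulas are linear: the mean of <code>a * X + b</code> is <code>a</code> times the mean of <code>X</code>, plus <code>b</code>, and because a linear rescaling preserves the ordering of values, the same holds for the median. The simulation above only exercises the surface reflectance formula, so here is a minimal sketch of the same check for the surface temperature formula. The names <code>landsat_rescale_st</code> and <code>composite_matches</code> are made up for this example, and the sampling range is simply copied from the simulation above:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># Surface temperature scaling for Collection 2 (formula from the list above)
landsat_rescale_st &lt;- function(x) x * 0.00341802 + 149.0

# Hypothetical helper: TRUE if rescaling before and after compositing agree
composite_matches &lt;- function(f, rescale, n_images = 100) {
  values &lt;- sample.int(65455, n_images, replace = TRUE)
  isTRUE(all.equal(f(rescale(values)), rescale(f(values))))
}

set.seed(123)
results &lt;- vapply(
  c(mean = mean, median = median),
  \(f) all(replicate(1000, composite_matches(f, landsat_rescale_st))),
  logical(1)
)
results</code></pre></div>
</div>
<p>Both entries of <code>results</code> should come back <code>TRUE</code>, up to the floating point tolerance that <code>all.equal()</code> allows. Wrapping the comparison in <code>isTRUE()</code> matters a little here, because <code>all.equal()</code> returns a character description rather than <code>FALSE</code> when the values differ.</p>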



 ]]></description>
  <category>R</category>
  <category>geospatial data</category>
  <category>Tutorials</category>
  <guid>https://mm218.dev/posts/2023-08-24-landsat-scaling/</guid>
  <pubDate>Thu, 24 Aug 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-08-24-landsat-scaling/landsat.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>A long digression about the word ‘vector’</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-08-07-vector/</link>
  <description><![CDATA[ 





<p>In linguistics, there’s this concept called “semantic overload” that refers to when a word has more than one distinct meaning, and the appropriate meaning needs to be inferred from context. The classic example is when someone says they’re “running to the store”: we can guess from context that the speaker isn’t going for a jog, but we’re forced to guess.</p>
<p>Software engineering loves semantic overload. An “agile team” might be a vague way to say that you’re very responsive, or it might mean you work in tightly defined two-week sprints. A “transaction” might be a customer ordering from a website, or a database writing a new row. When you’re developing software that’s tackling new types of problems, there often isn’t existing language that describes exactly what your tool is trying to do, and so instead programmers use existing terms and rely upon metaphors and analogies to adapt them for a new purpose. But because the meaning of these terms changes depending on the context they’re used in, this overload can be a real barrier to learning for new users who don’t yet have the context to understand the overloaded term. For example, I’ve worked with a number of new programmers who were afraid of opening “Issues” on GitHub projects, because in other contexts announcing an issue you have with someone’s work is an aggressive action.<sup>1</sup> Lacking the shared context makes it hard to decode what these terms mean.</p>
<p>And so, the term “vector”. For a bit of context, my undergraduate degree was in ecology – forest ecosystem science specifically, a specialization chosen in order to not need to take courses in Calculus 2 or Organic Chemistry. Which meant that, coming out of my degree, I had one course in physics, a handful in GIS and spatial data, and none in computer science. So when I learned that the base unit of R data, the result of running code like:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1] 1 2</code></pre>
</div>
</div>
<p>was called a “vector”, I was frustrated. At this point, I had been told that:</p>
<ul>
<li><p>In physics, a “vector” was any line with a magnitude (“length”) and a direction.</p></li>
<li><p>In GIS, a “vector” was pretty much any type of data; points, lines, polygons, whatever.<sup>2</sup></p></li>
<li><p>In R and apparently computer science, a “vector” was quite literally any data whatsoever.</p></li>
</ul>
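<p>The R sense of the word, at least, is easy to check at the console. This is a small sketch using only base R, and the variable names are just for illustration; even a single number or a single string counts as a vector of length one:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># The two-element example from above
x &lt;- c(1, 2)
is.vector(x)

# A lone number is still a vector, just one of length 1
single_number &lt;- 1
is.vector(single_number)
length(single_number)

# The same goes for a lone string
single_string &lt;- "a"
is.vector(single_string)</code></pre></div>
</div>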
<p>This felt to me like another case of pointless complexity, of the word “vector” being overloaded beyond the point of usefulness – and I didn’t feel like the word “vector” was particularly useful in the first place.</p>
<p>It turns out, though, that I was just missing the context that linked these three meanings together. I wouldn’t get that context until I was in grad school, procrastinating by watching <a href="https://www.3blue1brown.com/topics/linear-algebra">3Blue1Brown’s excellent videos on the fundamentals of linear algebra</a>. I’m not nearly qualified to teach anything about linear algebra, and I have neither the space nor the inclination to try to do so here – but I’m going to try to share the thing that gave me an “aha!” moment. To do that, I need to start off by getting really abstract.</p>
<p>First off, let’s say we’ve got some 2-dimensional plane that looks like this:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb3-2">df <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">data.frame</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb3-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(df, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(x, y)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb3-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb3-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb3-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-07-vector/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>We’ve got our X axis and our Y axis here, and both meet at the origin – the place where the X coordinate is 0, and the Y coordinate is 0. We’d say that the coordinates at that point are (0, 0).</p>
<p>Now let’s go back to our physics definition of a vector – any line with a known length and direction. We could draw one of those in this coordinate plane – say our line is a bit longer than 70 “units”, going from (0, 0) all the way to (50, 50):</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(df, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(x, y)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb4-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"segment"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"point"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">shape =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">25</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"black"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-07-vector/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
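<p>The “a bit longer than 70 units” figure comes straight from the Pythagorean theorem. As a quick sketch (the variable names here are just for illustration), the length and direction of the arrow from (0, 0) to (50, 50) can be computed like so:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r"><code class="sourceCode r"># End point of the arrow drawn above; the start point is the origin (0, 0)
end_point &lt;- c(x = 50, y = 50)

# Magnitude ("length") via the Pythagorean theorem
magnitude &lt;- sqrt(sum(end_point^2))

# Direction as the angle from the X axis, converted from radians to degrees
direction &lt;- atan2(end_point[["y"]], end_point[["x"]]) * 180 / pi

magnitude  # roughly 70.71
direction  # 45</code></pre></div>
</div>
<p>Those two numbers, a length and an angle, are everything the arrow in the plot encodes.</p>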
<p>This is how I was taught to think about vectors in those physics classes – arrows on some abstract plane. For instance, we could turn this into an acceleration vector by labeling these axes, so that our line is now charting speed over time:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(df, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(x, y)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb5-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Speed"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"segment"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"point"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">shape =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">25</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"black"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-07-vector/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>If you change the axis labels to distance over time, you have a velocity vector instead:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(df, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(x, y)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb6-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Distance"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"segment"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"point"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">shape =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">25</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"black"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-07-vector/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>The meaning of the line is defined by the axis labels – by the actual coordinate plane your vector is in. Looking at our velocity vector, we can tell how far we’ve gone (position on the Y axis) for any given time (position on the X axis).</p>
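<p>As a rough sketch of reading a value off that line: base R’s <code>approx()</code> linearly interpolates between the two endpoints we drew, so we can ask how far we’ve gone at a particular time. The time of 20 is an arbitrary value I picked for illustration, and the units are whatever the axes happen to be in:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># The endpoints of the line from the plot: (0, 0) and (50, 50)
time_points &lt;- c(0, 50)
distance_points &lt;- c(0, 50)
# Assumption: 20 is an arbitrary time chosen for this example
distance_at_20 &lt;- approx(x = time_points, y = distance_points, xout = 20)$y
distance_at_20</code></pre></div>
</div>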
<p>So if we use a different coordinate plane and relabel our axes to show “distance away from the origin”:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(df, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(x, y)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb7-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Distance away from the origin in this direction"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb7-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Distance away from the origin in this direction"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb7-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">annotate</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"segment"</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">xend =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">yend =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb7-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-07-vector/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>We get vectors in a spatial sense. Rather than showing distance at a given time, we’re now showing the position of our data – in this case, a linestring – in one direction when it’s at a given position in the other direction. Just like in physics, the actual meaning of this line depends on the coordinate plane – on the <em>coordinate reference system</em> of the data. The CRS of your data is a standardized way to define <em>where</em> your origin is, <em>what units</em> your distances are measured in, and <em>which direction</em> away from the origin you’re going.</p>
<p>And we can use spatial vector data to replace our physics vector. We just need to define a matrix containing the beginning and end coordinates of our line:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">our_matrix <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">matrix</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">nrow =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">byrow =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>)</span>
<span id="cb8-2">our_matrix</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>     [,1] [,2]
[1,]    0    0
[2,]   50   50</code></pre>
</div>
</div>
<p>And then we can tell sf that it should understand that matrix as being a line:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">our_line <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_linestring</span>(our_matrix)</span>
<span id="cb10-2">our_line</span></code></pre></div></div>
<div class="cell-output cell-output-stderr">
<pre><code>LINESTRING (0 0, 50 50)</code></pre>
</div>
</div>
<p>And voilà, we have a spatial vector:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb12-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Distance away from the origin in this direction"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb12-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Distance away from the origin in this direction"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb12-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(</span>
<span id="cb12-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> our_line</span>
<span id="cb12-6">  )</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-07-vector/index_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>At each position along the X axis, our line sits at a single, known position on the Y axis. If we only have one measurement of XY position – say, a single GPS measurement – then our line would be of length 0. We’d have a point instead:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb13-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Distance away from the origin in this direction"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb13-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Distance away from the origin in this direction"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb13-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_sf</span>(</span>
<span id="cb13-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>))</span>
<span id="cb13-6">  )</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-07-vector/index_files/figure-html/unnamed-chunk-10-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>That point represents a single position, which we’d understand through our coordinate reference system as being a certain distance away from a reference point.</p>
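<p>Here’s a minimal sketch of what that looks like in sf, if we attach a CRS to the same point. The EPSG code 4326 (WGS 84 longitude and latitude) is an assumption I’m making purely for illustration, as is the name <code>our_point</code>; nothing about the coordinates themselves tells us which CRS they’re in:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Assumption: EPSG:4326 is picked only as an example CRS
our_point &lt;- sf::st_sfc(sf::st_point(c(50, 50)), crs = 4326)
# The CRS describes the reference frame these coordinates are measured in
sf::st_crs(our_point)
# TRUE when the coordinates are degrees of longitude and latitude
sf::st_is_longlat(our_point)</code></pre></div>
</div>
<p>With a different CRS, the same pair of numbers would describe a different place on the ground, which is why the coordinates on their own aren’t enough.</p>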
<p>Similarly, this is what vectors in R are abstracting. Imagine that, instead of using sf to make this a spatial vector, we turned our matrix into a data frame and used that with ggplot2 instead:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1">our_matrix <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb14-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as.data.frame</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb14-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">setNames</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"x"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb14-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(x, y)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb14-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_line</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb14-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Distance away from the origin in this direction"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb14-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Distance away from the origin in this direction"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb14-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-07-vector/index_files/figure-html/unnamed-chunk-11-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Just like with sf, we’re representing vectors by the places they start and end. And because we’ve plotted this as a line, we’re able to tell the position of our data at each distance along either the X or Y axis, within this coordinate reference system.</p>
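<p>One way to check this, as a quick sketch that rebuilds the matrix and line from earlier so it runs on its own: <code>sf::st_coordinates()</code> hands back the vertices sf stored for the line, which should be the same start and end positions we put in the matrix. sf also adds an <code>L1</code> column identifying which line each vertex belongs to:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Rebuild the matrix and line from earlier in the post
our_matrix &lt;- matrix(c(0, 0, 50, 50), nrow = 2, byrow = TRUE)
our_line &lt;- sf::st_linestring(our_matrix)
# Extract the vertices of the line as a matrix with X, Y, and L1 columns
line_coords &lt;- sf::st_coordinates(our_line)
line_coords
# The X and Y columns hold the same values as our original matrix
all(line_coords[, c("X", "Y")] == our_matrix)</code></pre></div>
</div>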
<p>Or take for instance the <code>age</code> vector inside the <code>Orange</code> data frame:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb15-1">Orange<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>age <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">head</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>[1]  118  484  664 1004 1231 1372</code></pre>
</div>
</div>
<p>According to <code>?Orange</code>, this vector represents the age of the tree, in units of days since 1968-12-31. Similarly, the <code>circumference</code> vector is the circumference of each tree in millimeters. We can plot those vectors just as easily as our physics and our spatial vectors:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb17-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(Orange, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(age, circumference, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> Tree)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb17-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_x_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1700</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb17-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_continuous</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">limits =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">225</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">expand =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb17-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>()</span></code></pre></div></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://mm218.dev/posts/2023-08-07-vector/index_files/figure-html/unnamed-chunk-13-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This graph is using an abstract coordinate system – instead of “meters away from the origin”, one axis is “distance in time from a reference date”, and the other is “distance in length from not existing at all”. But just like our physics vectors, each of these points represents a magnitude in some direction. Our <code>age</code> vector is a set of magnitudes along a time axis; our <code>circumference</code> vector a set of magnitudes along a length axis.</p>
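<p>To make the “distance in time from a reference date” idea a bit more concrete, here’s a small sketch that moves the <code>age</code> vector along its time axis. The reference date comes from <code>?Orange</code>; the variable names are mine:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"># Reference date (the origin of the time axis), taken from ?Orange
reference_date &lt;- as.Date("1968-12-31")
# Adding a number of days to a Date moves that far along the time axis
measurement_dates &lt;- reference_date + Orange$age
head(measurement_dates)
# Subtracting the origin recovers the original magnitudes, in days
recovered_age &lt;- as.numeric(measurement_dates - reference_date)
head(recovered_age)</code></pre></div>
</div>
<p>The numbers in <code>age</code> only become dates once we know where the origin is and what the units are, which is the same job a CRS does for spatial coordinates.</p>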
<p>Because I didn’t have a ton of formal math education, I never made the connection across these three types of vectors, and never entirely understood that they were all different ways of understanding and representing position on a coordinate plane, under some coordinate reference system. Recognizing that these different versions of “vectors” all share an underlying meaning made it a lot easier for me to understand what “vector data” actually meant, and to understand the semantic difference between vector and raster representations of the same data. Hopefully this digression makes things a bit clearer for someone else, as well.</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>This has been partially addressed by the newer Discussions feature.↩︎</p></li>
<li id="fn2"><p>Don’t worry, Mastodon commenter, I’m aware that rasters exist. The point is that my education here was incomplete.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>Tutorials</category>
  <guid>https://mm218.dev/posts/2023-08-07-vector/</guid>
  <pubDate>Mon, 07 Aug 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-08-07-vector/banner.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>From the inbox: How can I get fold assignments from spatialsample?</title>
  <dc:creator>Mike Mahoney</dc:creator>
  <link>https://mm218.dev/posts/2023-06-06-spatialsample_splits/</link>
  <description><![CDATA[ 





<p>In my inbox,<sup>1</sup> someone asks:<sup>2</sup></p>
<blockquote class="blockquote">
<p>I’m using <code>spatial_clustering_cv()</code> from spatialsample to do cross-validation. How can I get separate data frames with each split created by this function?</p>
</blockquote>
<p>I think this question is decently common, because a lot of the spatialsample documentation is written assuming that you’re familiar with rsample already, which is often not the case for people working with spatial data. The functions to do this sort of thing live in rsample, and aren’t (<a href="https://github.com/tidymodels/spatialsample/issues/143">currently</a>) re-exported by spatialsample, so it can be hard to find the right function.</p>
<p>First and foremost, let’s assume that you’ve got some object called <code>my_folds</code> created by <code>spatial_clustering_cv()</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(spatialsample)</span>
<span id="cb1-2">my_folds <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">spatial_clustering_cv</span>(boston_canopy, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">v =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb1-3">my_folds</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>#  2-fold spatial cross-validation 
# A tibble: 2 × 2
  splits            id   
  &lt;list&gt;            &lt;chr&gt;
1 &lt;split [277/405]&gt; Fold1
2 &lt;split [405/277]&gt; Fold2</code></pre>
</div>
</div>
<p>The <code>my_folds</code> object has a <code>splits</code> column, which is a list. Each element of that list contains the analysis and assessment sets for one fold.<sup>3</sup> To get a single split, use <code>rsample::get_rsplit()</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">rsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_rsplit</span>(my_folds, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>&lt;Analysis/Assess/Total&gt;
&lt;277/405/682&gt;</code></pre>
</div>
</div>
<p>To get just the analysis data for that fold, use <code>rsample::analysis()</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">rsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_rsplit</span>(my_folds, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb5-2">  rsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">analysis</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Simple feature collection with 277 features and 18 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 755424.9 ymin: 2935616 xmax: 812069.7 ymax: 2970073
Projected CRS: NAD83 / Massachusetts Mainland (ftUS)
# A tibble: 277 × 19
   grid_id land_area canopy_gain canopy_loss canopy_no_change canopy_area_2014
   &lt;chr&gt;       &lt;dbl&gt;       &lt;dbl&gt;       &lt;dbl&gt;            &lt;dbl&gt;            &lt;dbl&gt;
 1 AB-4      795045.      15323.       3126.           53676.           56802.
 2 AO-9      270153        6187.       1184.           26930.           28114.
 3 V-7       107890.        219.       3612.             240.            3852.
 4 X-4       848558.       8275.       1760.            6872.            8632.
 5 AC-4     2069814.      82201.      50944.          240161.          291104.
 6 AC-15    1175032.      24517.      24010.          111148.          135158.
 7 U-14     2690727.      69780.      51404.          263796.          315201.
 8 AQ-15     453368.      13971.       3401.          343677.          347077.
 9 Q-10      156688.       9237.       3094.           57327.           60421.
10 T-10      215340.      13984.       3947.           59539.           63487.
# ℹ 267 more rows
# ℹ 13 more variables: canopy_area_2019 &lt;dbl&gt;, change_canopy_area &lt;dbl&gt;,
#   change_canopy_percentage &lt;dbl&gt;, canopy_percentage_2014 &lt;dbl&gt;,
#   canopy_percentage_2019 &lt;dbl&gt;, change_canopy_absolute &lt;dbl&gt;,
#   mean_temp_morning &lt;dbl&gt;, mean_temp_evening &lt;dbl&gt;, mean_temp &lt;dbl&gt;,
#   mean_heat_index_morning &lt;dbl&gt;, mean_heat_index_evening &lt;dbl&gt;,
#   mean_heat_index &lt;dbl&gt;, geometry &lt;MULTIPOLYGON [US_survey_foot]&gt;</code></pre>
</div>
</div>
<p>Similarly, to get just the assessment data for that fold, use <code>rsample::assessment()</code>:</p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">rsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_rsplit</span>(my_folds, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span></span>
<span id="cb7-2">  rsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">assessment</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Simple feature collection with 405 features and 18 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 739826.9 ymin: 2908294 xmax: 781347.5 ymax: 2959751
Projected CRS: NAD83 / Massachusetts Mainland (ftUS)
# A tibble: 405 × 19
   grid_id land_area canopy_gain canopy_loss canopy_no_change canopy_area_2014
   &lt;chr&gt;       &lt;dbl&gt;       &lt;dbl&gt;       &lt;dbl&gt;            &lt;dbl&gt;            &lt;dbl&gt;
 1 I-33      265813.       8849.      11795.           78677.           90472.
 2 H-10     2691490.      73098.      80362.          345823.          426185.
 3 Q-22     2648089.     122211.     154236.         1026632.         1180868.
 4 P-18     2690726.     110928.     113146.          915137.         1028283.
 5 J-29     2574479.      38069.      15530.         2388638.         2404168.
 6 G-28     2641525.      87024.      39246.         1202528.         1241774.
 7 M-23     2690727.      87621.     124032.          748742.          872774.
 8 M-9      2690727.      52443.      53467.          304239.          357706.
 9 S-15     2690728.      93787.     162118.          478257.          640375.
10 Q-21     2690727.      54712.     101816.         1359305.         1461121.
# ℹ 395 more rows
# ℹ 13 more variables: canopy_area_2019 &lt;dbl&gt;, change_canopy_area &lt;dbl&gt;,
#   change_canopy_percentage &lt;dbl&gt;, canopy_percentage_2014 &lt;dbl&gt;,
#   canopy_percentage_2019 &lt;dbl&gt;, change_canopy_absolute &lt;dbl&gt;,
#   mean_temp_morning &lt;dbl&gt;, mean_temp_evening &lt;dbl&gt;, mean_temp &lt;dbl&gt;,
#   mean_heat_index_morning &lt;dbl&gt;, mean_heat_index_evening &lt;dbl&gt;,
#   mean_heat_index &lt;dbl&gt;, geometry &lt;MULTIPOLYGON [US_survey_foot]&gt;</code></pre>
</div>
</div>
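<p>If you want the data frames for every fold at once, rather than one split at a time, one option is to loop over the <code>splits</code> column. This is only a sketch: the names <code>analysis_sets</code> and <code>assessment_sets</code> are made up for illustration, and the folds are recreated so the chunk runs on its own.</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">library(spatialsample)

# Recreate the folds so this chunk is self-contained
my_folds &lt;- spatial_clustering_cv(boston_canopy, v = 2)

# One analysis and one assessment data frame per fold
# (illustrative object names, not part of rsample or spatialsample)
analysis_sets &lt;- lapply(my_folds$splits, rsample::analysis)
assessment_sets &lt;- lapply(my_folds$splits, rsample::assessment)

# Name each list element after its fold ID
names(analysis_sets) &lt;- my_folds$id
names(assessment_sets) &lt;- my_folds$id

# Pull out a single fold's assessment data by name
assessment_sets[["Fold1"]]</code></pre>
</div>
<p>Each list element is an ordinary sf data frame, so you can write the elements to disk or pass them to other functions individually.</p>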
<p>If you’re trying to get your original data with a column indicating which fold each row belongs to, there isn’t a function provided for that. Instead, you can take the assessment set from each split (the assessment set is the fold that a row is assigned to), add a new column to it with the fold name, and then combine those assessment sets into a single data frame. I do this with <code>purrr::map2()</code> and <code>dplyr::bind_rows()</code>:<sup>4</sup></p>
<div class="cell">
<div class="code-copy-outer-scaffold"><div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">purrr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map2</span>(</span>
<span id="cb9-2">  my_folds<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>splits, </span>
<span id="cb9-3">  my_folds<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>id, </span>
<span id="cb9-4">  \(split, id) <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cbind</span>(rsample<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">assessment</span>(split), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fold_name =</span> id)</span>
<span id="cb9-5">) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb9-6">  dplyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">bind_rows</span>()</span></code></pre></div></div>
<div class="cell-output cell-output-stdout">
<pre><code>Simple feature collection with 682 features and 19 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 739826.9 ymin: 2908294 xmax: 812069.7 ymax: 2970073
Projected CRS: NAD83 / Massachusetts Mainland (ftUS)
First 10 features:
   grid_id land_area canopy_gain canopy_loss canopy_no_change canopy_area_2014
1     I-33  265813.3    8848.818    11795.11         78676.56         90471.67
2     H-10 2691489.9   73098.168    80361.85        345823.19        426185.04
3     Q-22 2648088.6  122211.269   154236.43       1026631.85       1180868.27
4     P-18 2690726.1  110927.833   113145.85        915137.00       1028282.85
5     J-29 2574478.7   38068.676    15529.73       2388638.19       2404167.92
6     G-28 2641525.3   87024.318    39246.15       1202527.94       1241774.09
7     M-23 2690727.2   87620.730   124031.79        748742.13        872773.92
8      M-9 2690726.6   52443.164    53466.56        304239.49        357706.04
9     S-15 2690727.8   93786.589   162118.16        478257.33        640375.48
10    Q-21 2690727.2   54711.827   101815.82       1359305.11       1461120.93
   canopy_area_2019 change_canopy_area change_canopy_percentage
1          87525.38          -2946.293               -3.2565923
2         418921.35          -7263.685               -1.7043502
3        1148843.12         -32025.158               -2.7120009
4        1026064.83          -2218.014               -0.2157008
5        2426706.87          22538.944                0.9374946
6        1289552.26          47778.164                3.8475730
7         836362.86         -36411.060               -4.1718776
8         356682.65          -1023.393               -0.2860988
9         572043.92         -68331.566              -10.6705469
10       1414016.94         -47103.991               -3.2238256
   canopy_percentage_2014 canopy_percentage_2019 change_canopy_absolute
1                34.03579               32.92739            -1.10840701
2                15.83454               15.56466            -0.26987600
3                44.59323               43.38386            -1.20936883
4                38.21581               38.13338            -0.08243181
5                93.38465               94.26013             0.87547604
6                47.00974               48.81847             1.80873391
7                32.43636               31.08315            -1.35320518
8                13.29403               13.25600            -0.03803406
9                23.79934               21.25982            -2.53951988
10               54.30208               52.55148            -1.75060448
   mean_temp_morning mean_temp_evening mean_temp mean_heat_index_morning
1           74.26247          83.87540  90.85933                75.63458
2           74.64432          84.96917  91.71625                75.86767
3           73.19889          82.29358  89.70302                74.47757
4           73.77269          84.29003  91.26480                75.03802
5           72.26419          79.77278  88.70229                73.65608
6           73.60919          82.80297  90.33156                74.96955
7           74.24167          83.34713  90.41143                75.66013
8           76.74740          84.69933  91.96502                77.91048
9           75.18260          84.85431  92.00132                76.39949
10          73.37669          82.38064  90.59503                74.63029
   mean_heat_index_evening mean_heat_index fold_name
1                 89.71880        96.70939     Fold1
2                 89.88733        96.19667     Fold1
3                 87.34062        95.53811     Fold1
4                 88.93811        96.43569     Fold1
5                 81.32060        95.56059     Fold1
6                 88.47864        96.82653     Fold1
7                 89.23434        96.05418     Fold1
8                 90.02009        96.14348     Fold1
9                 89.91342        96.92160     Fold1
10                86.90021        96.23439     Fold1
                         geometry
1  MULTIPOLYGON (((752945.6 29...
2  MULTIPOLYGON (((751419.1 29...
3  MULTIPOLYGON (((763631.7 29...
4  MULTIPOLYGON (((763122.9 29...
5  MULTIPOLYGON (((753963.4 29...
6  MULTIPOLYGON (((749383.6 29...
7  MULTIPOLYGON (((758543.1 29...
8  MULTIPOLYGON (((758543.1 29...
9  MULTIPOLYGON (((767702.6 29...
10 MULTIPOLYGON (((764649.4 29...</code></pre>
</div>
</div>
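<p>A quick sanity check on the combined data frame can be reassuring. The sketch below repeats the approach above but saves the result under a hypothetical name, <code>fold_assignments</code>, then confirms that every original row shows up once and counts the rows per fold. The exact counts depend on how the clustering happens to split the data.</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">library(spatialsample)

# Recreate the folds so this chunk is self-contained
my_folds &lt;- spatial_clustering_cv(boston_canopy, v = 2)

# Same approach as above, saved under a made-up name
fold_assignments &lt;- purrr::map2(
  my_folds$splits,
  my_folds$id,
  \(split, id) cbind(rsample::assessment(split), fold_name = id)
) |&gt;
  dplyr::bind_rows()

# Each row is assessed in exactly one fold, so the row counts should match
nrow(fold_assignments) == nrow(boston_canopy)

# Number of rows assigned to each fold
table(fold_assignments$fold_name)</code></pre>
</div>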
<p>I think it would make sense for <code>get_rsplit()</code>, <code>analysis()</code>, and <code>assessment()</code> to get ported over to spatialsample, to make it a bit easier for the folks whose first point-of-entry into tidymodels work is via spatialsample. I’ve got a <a href="https://github.com/tidymodels/spatialsample/issues/143">GitHub issue</a> to remind me to look into that before the package’s next release.</p>




<div id="quarto-appendix" class="default"><section id="footnotes" class="footnotes footnotes-end-of-document"><h2 class="anchored quarto-appendix-heading">Footnotes</h2>

<ol>
<li id="fn1"><p>I want to mention that I include a link to <a href="https://yihui.org/en/2017/08/so-gh-email/">Yihui Xie’s excellent blog post</a> in replies to help questions sent via email. I love seeing people use my packages, and I love helping people use them, but I don’t always have the time to give 1:1 help via email. If you post a question somewhere publicly, then other people might give an even better answer; if no one answers in a day or two, then email me the link, so I can answer it publicly and have a link to send the next person with the same question as you. That’s also why I turned this into a blog post – so that I can send others with the same question a pre-written answer!↩︎</p></li>
<li id="fn2"><p>Anonymized and heavily paraphrased.↩︎</p></li>
<li id="fn3"><p>Sometimes called training and testing, respectively – rsample uses the analysis/assessment terminology to make it clear that all of this data should come from your training set, and that none of it touches your final held-out test set.↩︎</p></li>
<li id="fn4"><p>This is how spatialsample’s <code>autoplot()</code> methods do it, for instance.↩︎</p></li>
</ol>
</section></div> ]]></description>
  <category>R</category>
  <category>spatialsample</category>
  <category>tidymodels</category>
  <category>R packages</category>
  <category>geospatial data</category>
  <category>Tutorials</category>
  <guid>https://mm218.dev/posts/2023-06-06-spatialsample_splits/</guid>
  <pubDate>Tue, 06 Jun 2023 00:00:00 GMT</pubDate>
  <media:content url="https://mm218.dev/posts/2023-06-06-spatialsample_splits/map.jpg" medium="image" type="image/jpeg"/>
</item>
</channel>
</rss>
