For fs – Notes from a data witch

– okay i’m doing it. gonna write a post about the fs package
– but why tho?
– because it’s cool and fun
– girl it’s literally a tool for interacting with the file system within a statistical programming language. in no possible universe is this cool, fun, or exciting. you are the absolute worst engagement farmer in the history of engagement farming
– pfft, everyone’s gonna love it babe, you’ll see. this shit is girt af
– hon “girt” died the moment the olympics ended. it is as passé as a raygun meme. we’re all doing moo deng now, so all you’re accomplishing here is giving away just how long this post has been in drafts. but hey you do you. i just wanna see how this ends…

Um. So.

Now that my alter ego has brutally shattered all my hopes and dreams for this blog post, I suppose I should introduce it properly? This is a post about the fs R package, and as I tried to explain to that bitch who lives in the back of my head and kills every source of love and joy in my world, it provides a cross-platform, uniform interface to file system operations. It is, in a sense, a drop-in replacement for base R functions like file.path(), list.files() and so on. It supplies a cleaner and more consistent interface to these functions, and provides a few other handy tools that aren’t available in the base R equivalents. It’s a

– no sorry gonna have to step in here. where are you going with this? you just lost like 99% of your possible readership the moment you plagiarised from the package documentation. wtf girl

lovely little package. Sheesh. She’s not very nice, is she?

Why should anyone care about this?

A package like fs is hard to write about. It’s not a sparkly shiny thing, it’s not the new hotness, and low-level utility tooling is… well, it’s inherently boring. In everyday life, about 90% of my usage of the fs package is just me using the path() function to specify a file path, and… okay, let’s load the package and see what that looks like, yeah?

library(fs)
path("hello", "world.txt")

hello/world.txt

In this example I’m passing two input strings, "hello" and "world.txt", and the path() function has concatenated them using an appropriate separator character so that the output can be interpreted as a (relative) path to a file. Even in the world of statistical programming it is hard to think of anything you could do with a computer that is less exciting than this. It is undeniably, unequivocally, unbelievably dull.

– no shit
– do you mind? i’m trying to set the stage here
– fine. please, do continue. this should be good

Interruptions notwithstanding, the thing about boring tasks is that they’re often neglected. Especially if those boring things look simple. And to be fair, neglecting a thing that looks simple is probably an okay thing to do if it actually is simple, but it can create problems if there are hidden nuances underneath the apparently-simple appearance. Interacting with the file system is one of those things. I cannot count the number of times when I’ve encountered code that looks like this:

# I am setting up my project now...
dir1 <- "project_root/a_folder/another_folder"
file <- "file.ext"

# ...and 3000 lines of code later I do this
paste(dir1, file, sep = "")

[1] "project_root/a_folder/another_folderfile.ext"

Obviously, this is not the result we wanted. Unlike path(), the paste() function is not specifically designed to construct file paths: it’s a tool for concatenating strings and it makes no distinction between strings that look like file paths and strings that don’t.

When looking at a code extract like the one above, it’s so very easy to think “well I would never be that dumb”, but the “3000 lines later” part is crucial here. The moment any project starts to grow in scope – be it a developer project or an analysis project – you reach a point where the code base is large enough that you can’t remember what you did earlier, and it is horrifyingly easy to be exactly that stupid.

On top of that, if you happened to be lucky enough not to make the error above, there’s a pretty good chance you’ll mess up the other way and write code like this:

# this time I terminate my folder path with a slash...
dir2 <- "project_root/a_folder/"

# ...and 3000 lines later I forget I did that
paste(dir2, file, sep = "/")

[1] "project_root/a_folder//file.ext"

In a way, this is a worse mistake. In the first example you’ll definitely notice the problem because your code breaks the moment you try to work with a file that doesn’t exist, and R will throw an error. But the second one won’t do that. It’s a valid path to the relevant file, so your code will work just fine… right up to the point that you try to write a regular expression that operates on paths and that extra slash breaks something. Worse yet, if your project has expanded to the point that you’re writing a regex to operate on vectors of paths you can be entirely certain you’ve lost track of the initial mistake that created the problem, and you’re trapped in debugging mode for an hour and a half trying to work out where you went wrong.

ASK. ME. HOW. I. KNOW.

In any case, the point in all this is that human working memory capacity is about 7 plus or minus 2 “chunks” of meaningful information:¹ we literally do not have the ability to hold a lot of code in our mind at once. So if you manage your file paths using paste() I guarantee you that you will mess this up eventually. Not because you’re stupid, but because you are human.

Wouldn’t it be nice if we had a function… let’s call it path()… that protects us from this particular kind of human frailty? Of course it would.
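To make that concrete, here is a minimal sketch that reuses the two folder strings from the paste() examples above and hands them to path() instead. The variable names are the same ones used earlier; nothing here touches the file system, it only builds strings.

```r
library(fs)

dir1 <- "project_root/a_folder/another_folder"  # no trailing slash
dir2 <- "project_root/a_folder/"                # trailing slash
file <- "file.ext"

# path() puts exactly one separator between the folder and the file,
# whether or not the folder string already ends in a slash
path1 <- path(dir1, file)
path2 <- path(dir2, file)
path1
path2
```

Neither the missing-slash mistake nor the doubled-slash mistake shows up: the first call gives project_root/a_folder/another_folder/file.ext and the second gives project_root/a_folder/file.ext.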

Lorem ipsum text

Now that we’ve established some motivation for caring about this topic

– lol. lmao even
– oh hush

it will be convenient to have a little tool that generates lorem ipsum text that we can write to files that we’ll then manipulate using fs. To that end, I’ll define a lorem_text() function that uses the lorem and withr packages to reproducibly generate paragraphs of lorem ipsum text:

lorem_text <- function(seed, paragraphs, digit_pad = 3) {
  lorem <- withr::with_seed(seed, lorem::ipsum(paragraphs))
  names(lorem) <- purrr::imap_chr(lorem, \(x, y) paste(
    stringr::str_pad(seed, width = digit_pad, pad = "0"),
    stringr::str_pad(y, width = digit_pad, pad = "0"),
    stringr::str_to_lower(stringr::str_remove_all(x, " .*$")),
    sep = "_"
  ))
  lorem
}

To call this function we pass the seed value for the random number generator to use, and the number of paragraphs of lorem ipsum text to create:²

lorem_text(seed = 999, paragraphs = 3)

Amet arcu suscipit donec cras inceptos rhoncus varius hac? Sociosqu luctus iaculis; ut sociosqu porta risus tristique phasellus? Duis porta in placerat phasellus class. Dapibus nostra ac, aptent nam tempus mattis eleifend metus.

Lorem molestie in elementum nascetur scelerisque cum pulvinar felis massa. Fringilla est tortor auctor nulla tristique ac mi commodo vitae. Torquent sodales eget lacinia quam elementum sodales. Cras porttitor iaculis curae eleifend fringilla pellentesque nascetur. Integer ligula per dignissim, dapibus feugiat nullam urna tristique viverra sociis felis etiam.

Sit posuere suscipit accumsan mus curabitur nullam, etiam scelerisque donec justo libero posuere. Mattis curae et litora.

Though not obvious from the printed output, the data structure that this function returns is a named list under the hood:

names(lorem_text(seed = 999, paragraphs = 3))

[1] "999_001_amet"  "999_002_lorem" "999_003_sit"

The names here follow a regular pattern: they contain the seed number, the paragraph number, and the first word in the paragraph, separated by underscores. In the examples below, these names become file names, and the text in each paragraph becomes the content written to the various files.

– thrilling
– so you’re just set on doing this? you’re going to snipe at me the whole way through?
–
–
–
–
– yes.

Building paths

So as I was saying earlier, about 90% of my usage of the fs package is via the path() function used to construct file paths, so it’s the natural place to start. Here’s a very simple example that specifies the path from my blog root to this quarto document:

path("posts", "2024-09-15_fs", "index.qmd")

posts/2024-09-15_fs/index.qmd

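I’m building this post on linux, where "/" is the native path separator. As far as I can tell from the fs documentation, though, the tidy paths that fs returns use "/" on every platform, so I’d expect the same result if I were building on windows.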

The path() function is vectorised and doesn’t require that the paths in question actually exist on the machine, so I can do something like this to define a vector of paths that I can work with later on:

lorem <- lorem_text(seed = 1, paragraphs = 20)
lorem_paths <- path("lorem", names(lorem))
lorem_paths

lorem/001_001_consectetur lorem/001_002_ipsum       
lorem/001_003_elit        lorem/001_004_lorem       
lorem/001_005_consectetur lorem/001_006_sit         
lorem/001_007_sit         lorem/001_008_adipiscing  
lorem/001_009_adipiscing  lorem/001_010_lorem       
lorem/001_011_lorem       lorem/001_012_adipiscing  
lorem/001_013_amet        lorem/001_014_sit         
lorem/001_015_elit        lorem/001_016_ipsum       
lorem/001_017_adipiscing  lorem/001_018_consectetur 
lorem/001_019_amet        lorem/001_020_consectetur

These are relative paths, and since (by default) quarto blog posts are executed with the working directory set to the folder that contains the document, these paths are implicitly taken relative to this folder.
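As a small aside on the vectorisation, here is a sketch of how path() recycles short inputs against longer ones. The folder name "reports" and the file stems are made up for illustration, and I’m also using the ext argument of path() to append a file extension, which the examples above don’t use.

```r
library(fs)

# hypothetical file stems, purely for illustration
stems <- c("alpha", "beta", "gamma")

# the single folder name and the single extension are recycled,
# so we get one path per stem
report_paths <- path("reports", stems, ext = "csv")
report_paths
```

The result is a vector of three relative paths, reports/alpha.csv, reports/beta.csv and reports/gamma.csv, and as before none of them need to exist on disk.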

– wow how exci…
– shut up, nobody wants to hear from you
– uh huh

File system operations

The second most common thing I find myself using the fs package for is basic file system operations: creating files and folders, copying, deleting, and moving files, etc. For example, the paths I specified in the previous section all refer to a folder called “lorem”, but that folder doesn’t currently exist. Indeed, I can verify that no such folder exists using the dir_exists() function:

dir_exists("lorem")

lorem 
FALSE

That’s handy to know, because I actually do want this folder to exist, and fortunately I can create the folder I want from R by using dir_create(), and then verify that it now exists:

dir_create("lorem")
dir_exists("lorem")

lorem 
 TRUE

Like all functions in fs, these are vectorised operations. For example, I can test for the existence of multiple folders at once like this:

dir_exists(c("lorem", "ipsum"))

lorem ipsum 
 TRUE FALSE

In any case, now that the “lorem” directory exists, I can use file_create() to create the files listed in the lorem_paths vector I defined earlier. Again, file_create() is vectorised, so I can pass the vector of file names directly with no need to write a loop:

file_create(lorem_paths)

Though there is no output printed to the console, all the files I requested have now been created. To see this, I can use dir_ls() to return a vector containing all the file names within a specified folder:

dir_ls("lorem")

lorem/001_001_consectetur lorem/001_002_ipsum       
lorem/001_003_elit        lorem/001_004_lorem       
lorem/001_005_consectetur lorem/001_006_sit         
lorem/001_007_sit         lorem/001_008_adipiscing  
lorem/001_009_adipiscing  lorem/001_010_lorem       
lorem/001_011_lorem       lorem/001_012_adipiscing  
lorem/001_013_amet        lorem/001_014_sit         
lorem/001_015_elit        lorem/001_016_ipsum       
lorem/001_017_adipiscing  lorem/001_018_consectetur 
lorem/001_019_amet        lorem/001_020_consectetur

Okay, that’s nice, but I don’t actually want a folder full of empty files. So let’s delete the folder and everything in it. That’s easy enough to do with dir_delete():

dir_delete("lorem")

And just like that, the files and folder are gone. Alternatively, if I had wanted only to delete some of the files I could have used file_delete() to be a little more selective!
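Here’s a rough sketch of what the more selective version might look like. To keep it self-contained it works in a throwaway folder under the R temp directory rather than the “lorem” folder, the file names are invented, and it leans on the glob argument of dir_ls() to pick out the files to delete.

```r
library(fs)

# work inside a throwaway folder so nothing in the project is touched
scratch <- file_temp("scratch_")
dir_create(scratch)
file_create(path(scratch, c("keep_1.txt", "keep_2.txt", "junk_1.tmp", "junk_2.tmp")))

# pick out only the .tmp files, then delete just those
junk <- dir_ls(scratch, glob = "*.tmp")
file_delete(junk)

# the .txt files are still there
remaining <- path_file(dir_ls(scratch))
remaining

# tidy up the whole folder once we're done
dir_delete(scratch)
```

Because file_delete() is vectorised like everything else, the vector returned by dir_ls() can be passed straight to it.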

File trees

Okay that’s handy. As a slightly fancier example, though, let’s try creating files with a little more structure to them. Rather than write each of the lorem files to the same directory, we can place them in subfolders based on the first word in the lorem text. To do that, I’ll need to create these directories:

lorem_dirs <- unique(stringr::str_remove(names(lorem), "^.*_"))
lorem_dirs

[1] "consectetur" "ipsum"       "elit"        "lorem"       "sit"        
[6] "adipiscing"  "amet"

However, I don’t want to create these as top level folders: my file structure could become a mess if I do that. Instead, I’ll create them as subfolders of a “nonsense” folder. I can do this with a single call to dir_create():

dir_create(path("nonsense", lorem_dirs))

This command creates the “nonsense” folder itself, and populates it with all the subfolders listed in lorem_dirs. To see this displayed as a nice file tree, I’ll use the dir_tree() function:

dir_tree("nonsense")

nonsense
├── adipiscing
├── amet
├── consectetur
├── elit
├── ipsum
├── lorem
└── sit

Having created a nested directory structure, I can now define the paths to which I want to write files:

lorem_paths <- path(
  "nonsense", 
  stringr::str_remove(names(lorem), "^.*_"), 
  names(lorem)
)
lorem_paths

nonsense/consectetur/001_001_consectetur
nonsense/ipsum/001_002_ipsum
nonsense/elit/001_003_elit
nonsense/lorem/001_004_lorem
nonsense/consectetur/001_005_consectetur
nonsense/sit/001_006_sit
nonsense/sit/001_007_sit
nonsense/adipiscing/001_008_adipiscing
nonsense/adipiscing/001_009_adipiscing
nonsense/lorem/001_010_lorem
nonsense/lorem/001_011_lorem
nonsense/adipiscing/001_012_adipiscing
nonsense/amet/001_013_amet
nonsense/sit/001_014_sit
nonsense/elit/001_015_elit
nonsense/ipsum/001_016_ipsum
nonsense/adipiscing/001_017_adipiscing
nonsense/consectetur/001_018_consectetur
nonsense/amet/001_019_amet
nonsense/consectetur/001_020_consectetur

For each path in the lorem_paths vector, and each passage of text in the lorem object, we can write the text to the corresponding file like this:

purrr::walk(
  seq_along(lorem),
  \(x) brio::write_lines(
    text = lorem[[x]],
    path = lorem_paths[x]
  )
)

The file tree now looks like this:

dir_tree("nonsense")

nonsense
├── adipiscing
│   ├── 001_008_adipiscing
│   ├── 001_009_adipiscing
│   ├── 001_012_adipiscing
│   └── 001_017_adipiscing
├── amet
│   ├── 001_013_amet
│   └── 001_019_amet
├── consectetur
│   ├── 001_001_consectetur
│   ├── 001_005_consectetur
│   ├── 001_018_consectetur
│   └── 001_020_consectetur
├── elit
│   ├── 001_003_elit
│   └── 001_015_elit
├── ipsum
│   ├── 001_002_ipsum
│   └── 001_016_ipsum
├── lorem
│   ├── 001_004_lorem
│   ├── 001_010_lorem
│   └── 001_011_lorem
└── sit
    ├── 001_006_sit
    ├── 001_007_sit
    └── 001_014_sit

File information

Sometimes it is useful to retrieve information about a file, analogous to the stat command on linux systems. From the terminal, you’d get output that looks like this:³

system("stat nonsense/lorem/001_004_lorem")

  File: nonsense/lorem/001_004_lorem
  Size: 474         Blocks: 8          IO Block: 4096   regular file
Device: 252,1   Inode: 1586402     Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/danielle)   Gid: ( 1000/danielle)
Access: 2024-10-06 14:10:44.468497699 +1100
Modify: 2024-10-06 14:10:44.468497699 +1100
Change: 2024-10-06 14:10:44.468497699 +1100
 Birth: 2024-10-06 14:10:44.468497699 +1100

The file_info() function in the fs package mirrors this behaviour, all nicely vectorised so you can pass a vector of file paths, and with output organised into a tidy little tibble to make it easy to work with programmatically:

file_info(lorem_paths[1:4])

# A tibble: 4 × 18
  path                                     type         size permissions modification_time   user     group    device_id hard_links special_device_id   inode block_size blocks flags generation access_time         change_time         birth_time         
  <fs::path>                               <fct> <fs::bytes> <fs::perms> <dttm>              <chr>    <chr>        <dbl>      <dbl>             <dbl>   <dbl>      <dbl>  <dbl> <int>      <dbl> <dttm>              <dttm>              <dttm>             
1 nonsense/consectetur/001_001_consectetur file          307 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1586371       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
2 nonsense/ipsum/001_002_ipsum             file          322 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1586372       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
3 nonsense/elit/001_003_elit               file          397 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1586401       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
4 nonsense/lorem/001_004_lorem             file          474 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1586402       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44

There is an analogous function dir_info() that can be applied to a directory, and the output is structured the same way:

dir_info("nonsense", recurse = TRUE)

# A tibble: 27 × 18
   path                                     type             size permissions modification_time   user     group    device_id hard_links special_device_id   inode block_size blocks flags generation access_time         change_time         birth_time         
   <fs::path>                               <fct>     <fs::bytes> <fs::perms> <dttm>              <chr>    <chr>        <dbl>      <dbl>             <dbl>   <dbl>      <dbl>  <dbl> <int>      <dbl> <dttm>              <dttm>              <dttm>             
 1 nonsense/adipiscing                      directory          4K rwxr-xr-x   2024-10-06 14:10:44 danielle danielle     64513          2                 0 1586368       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
 2 nonsense/adipiscing/001_008_adipiscing   file              415 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1586419       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
 3 nonsense/adipiscing/001_009_adipiscing   file              370 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1586420       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
 4 nonsense/adipiscing/001_012_adipiscing   file              254 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1586514       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
 5 nonsense/adipiscing/001_017_adipiscing   file              393 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1589481       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
 6 nonsense/amet                            directory          4K rwxr-xr-x   2024-10-06 14:10:44 danielle danielle     64513          2                 0 1586369       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
 7 nonsense/amet/001_013_amet               file              367 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1586515       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
 8 nonsense/amet/001_019_amet               file              313 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1589484       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
 9 nonsense/consectetur                     directory          4K rwxr-xr-x   2024-10-06 14:10:44 danielle danielle     64513          2                 0 1586360       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
10 nonsense/consectetur/001_001_consectetur file              307 rw-rw-r--   2024-10-06 14:10:44 danielle danielle     64513          1                 0 1586371       4096      8     0          0 2024-10-06 14:10:44 2024-10-06 14:10:44 2024-10-06 14:10:44
# ℹ 17 more rows

That being said, it’s pretty uncommon in my experience to need all that information. Often the only thing you’re interested in is the file size column, and the fs package provides a convenience function that extracts that information for you:

file_size(lorem_paths)

307 322 397 474 196 599 501 415 370 135 228 254 367 360 357 375 393 552 313 523
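Since dir_info() hands back an ordinary data frame, the size column is easy to work with programmatically. As a minimal sketch, assuming a scratch folder with two made-up files in it, this finds the largest file in a folder:

```r
library(fs)

# a throwaway folder containing one small file and one larger one
scratch <- file_temp("sizes_")
dir_create(scratch)
writeLines("short", path(scratch, "small.txt"))
writeLines(strrep("long ", 200), path(scratch, "big.txt"))

# dir_info() gives one row per file; pick the row with the biggest size
info <- dir_info(scratch)
largest <- info$path[which.max(as.numeric(info$size))]
path_file(largest)

# tidy up
dir_delete(scratch)
```

The as.numeric() call is just me being cautious about the fs::bytes class; it converts the sizes to plain numbers before comparing them.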

– wha… oh, huh, i guess i lost consciousness there. so you’re still talking huh?
– sweetie you’re literally a figment of my imagination you don’t have a consciousness to lose
– bitch
– ¯\_(ツ)_/¯

Path arithmetic

My absolute favourite collection of functions within fs are the ones that can be used to perform “path arithmetic”, by which I really mean handy string manipulations for common tasks that save me from the horror of writing a regular expression. Because, like all right-thinking people, I loathe regular expressions with a passion I usually reserve for real estate agents and people who don’t pick up after their dogs.

To illustrate the idea, let’s think about some common tasks we might need to perform using the lorem_paths vector:

lorem_paths

nonsense/consectetur/001_001_consectetur
nonsense/ipsum/001_002_ipsum
nonsense/elit/001_003_elit
nonsense/lorem/001_004_lorem
nonsense/consectetur/001_005_consectetur
nonsense/sit/001_006_sit
nonsense/sit/001_007_sit
nonsense/adipiscing/001_008_adipiscing
nonsense/adipiscing/001_009_adipiscing
nonsense/lorem/001_010_lorem
nonsense/lorem/001_011_lorem
nonsense/adipiscing/001_012_adipiscing
nonsense/amet/001_013_amet
nonsense/sit/001_014_sit
nonsense/elit/001_015_elit
nonsense/ipsum/001_016_ipsum
nonsense/adipiscing/001_017_adipiscing
nonsense/consectetur/001_018_consectetur
nonsense/amet/001_019_amet
nonsense/consectetur/001_020_consectetur

The most common task that I have to do regularly with paths like this is extract the file name. Under other circumstances I’d have to spend time asking myself “are these paths correctly formatted?” and “god, how do I write a basic regex again????” but thankfully the path_file() function saves me from this terrible fate:

path_file(lorem_paths)

 [1] "001_001_consectetur" "001_002_ipsum"       "001_003_elit"       
 [4] "001_004_lorem"       "001_005_consectetur" "001_006_sit"        
 [7] "001_007_sit"         "001_008_adipiscing"  "001_009_adipiscing" 
[10] "001_010_lorem"       "001_011_lorem"       "001_012_adipiscing" 
[13] "001_013_amet"        "001_014_sit"         "001_015_elit"       
[16] "001_016_ipsum"       "001_017_adipiscing"  "001_018_consectetur"
[19] "001_019_amet"        "001_020_consectetur"

Analogously, if I need to extract the directory name and ignore the file name, I could waste precious seconds of my life thinking about this tedious task using first principles, or I could simply use path_dir() to do this:

path_dir(lorem_paths)

 [1] "nonsense/consectetur" "nonsense/ipsum"       "nonsense/elit"       
 [4] "nonsense/lorem"       "nonsense/consectetur" "nonsense/sit"        
 [7] "nonsense/sit"         "nonsense/adipiscing"  "nonsense/adipiscing" 
[10] "nonsense/lorem"       "nonsense/lorem"       "nonsense/adipiscing" 
[13] "nonsense/amet"        "nonsense/sit"         "nonsense/elit"       
[16] "nonsense/ipsum"       "nonsense/adipiscing"  "nonsense/consectetur"
[19] "nonsense/amet"        "nonsense/consectetur"

Much easier, and frankly more reliable, than trying to do the job myself.
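One way to convince yourself that path_dir() and path_file() really do carve a path into two complementary pieces is to glue the pieces back together with path(). This is only a small sanity check on two of the paths from the vector above, typed out by hand so the snippet stands alone:

```r
library(fs)

p <- path("nonsense", c("sit", "amet"), c("001_006_sit", "001_013_amet"))

# split each path into its folder part and its file part...
dirs <- path_dir(p)
files <- path_file(p)

# ...and glue them back together with path()
rebuilt <- path(dirs, files)
all(rebuilt == p)
```

For tidy relative paths like these, the rebuilt paths match the originals.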

There’s even a path_common() function that returns the part of the path that is shared by all paths in the vector. I’ll admit I don’t use that one as often, but it’s kind of nice that the package supplies this. I appreciate the attention to detail involved in recognising that sometimes you do actually need this:

path_common(lorem_paths)

nonsense

Sure, I already knew that “nonsense” is the folder containing all these files because I designed this little exercise that way, but it’s still pretty handy, especially when you combine it with path_abs(), which converts a relative path to an absolute path, to find the actual location on my machine that contains all these files:

lorem_paths |> 
  path_common() |> 
  path_abs()

/home/danielle/GitHub/djnavarro/blog/posts/2024-10-06_fs/nonsense
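Going the other direction is possible too. The path_rel() function, which isn’t used elsewhere in this post, expresses a path relative to some starting folder, so pairing it with path_common() strips the shared prefix. A minimal sketch using two of the paths from above:

```r
library(fs)

p <- path(
  "nonsense",
  c("consectetur", "ipsum"),
  c("001_001_consectetur", "001_002_ipsum")
)

# find the folder shared by every path, then express each path relative to it
root <- path_common(p)
relative <- path_rel(p, start = root)
relative
```

Here the shared root is “nonsense”, so what’s left is consectetur/001_001_consectetur and ipsum/001_002_ipsum.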

You can also call path_split() to split paths into a list of character vectors, where each such vector contains one element per level in the file hierarchy. This behaves analogously to base strsplit() or stringr::str_split(), but automatically splits using the relevant file separator character on your operating system:

path_split(lorem_paths[1:2])

[[1]]
[1] "nonsense"            "consectetur"         "001_001_consectetur"

[[2]]
[1] "nonsense"      "ipsum"         "001_002_ipsum"

It’s not the prettiest of outputs, but notice that you can use this as the basis for a list column in a data frame that you can then unnest with the assistance of tidyr:

lorem_paths |> 
  path_split() |> 
  tibble::tibble(level = _) |> 
  tidyr::unnest_wider(col = "level", names_sep = "_")

# A tibble: 20 × 3
   level_1  level_2     level_3            
   <chr>    <chr>       <chr>              
 1 nonsense consectetur 001_001_consectetur
 2 nonsense ipsum       001_002_ipsum      
 3 nonsense elit        001_003_elit       
 4 nonsense lorem       001_004_lorem      
 5 nonsense consectetur 001_005_consectetur
 6 nonsense sit         001_006_sit        
 7 nonsense sit         001_007_sit        
 8 nonsense adipiscing  001_008_adipiscing 
 9 nonsense adipiscing  001_009_adipiscing 
10 nonsense lorem       001_010_lorem      
11 nonsense lorem       001_011_lorem      
12 nonsense adipiscing  001_012_adipiscing 
13 nonsense amet        001_013_amet       
14 nonsense sit         001_014_sit        
15 nonsense elit        001_015_elit       
16 nonsense ipsum       001_016_ipsum      
17 nonsense adipiscing  001_017_adipiscing 
18 nonsense consectetur 001_018_consectetur
19 nonsense amet        001_019_amet       
20 nonsense consectetur 001_020_consectetur

Note that this trick also works when the paths are of different lengths. For example, suppose I were to use dir_ls() with recurse = TRUE to return the complete list of all files and folders contained within the “nonsense” folder. Some of the paths will be length 2 rather than length 3, because the folder paths are also included in the output. Because unnest_wider() is able to handle ragged list columns, you get this as the output:

dir_ls("nonsense", recurse = TRUE) |> 
  path_split() |> 
  tibble::tibble(level = _) |> 
  tidyr::unnest_wider(col = "level", names_sep = "_")

# A tibble: 27 × 3
   level_1  level_2     level_3            
   <chr>    <chr>       <chr>              
 1 nonsense adipiscing  <NA>               
 2 nonsense adipiscing  001_008_adipiscing 
 3 nonsense adipiscing  001_009_adipiscing 
 4 nonsense adipiscing  001_012_adipiscing 
 5 nonsense adipiscing  001_017_adipiscing 
 6 nonsense amet        <NA>               
 7 nonsense amet        001_013_amet       
 8 nonsense amet        001_019_amet       
 9 nonsense consectetur <NA>               
10 nonsense consectetur 001_001_consectetur
# ℹ 17 more rows
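If you’d rather stay out of data frame land, the list returned by path_split() is also easy to poke at directly, and path_join() (the inverse operation, not otherwise used in this post) puts the pieces back together. A small sketch, again with two hand-typed paths:

```r
library(fs)

p <- path("nonsense", c("sit", "amet"), c("001_006_sit", "001_013_amet"))

# one character vector per path, one element per level
parts <- path_split(p)

# pull out the second level (the subfolder name) from each path
subfolders <- vapply(parts, function(x) x[2], character(1))
subfolders

# path_join() reverses the split
rejoined <- path_join(parts)
all(rejoined == p)
```

The vapply() call is plain base R; the only fs functions involved are path(), path_split() and path_join().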

Epilogue

So anyway, that’s about everything I wanted to talk about. It’s not an exhaustive listing of course, and there are a variety of other helper functions in fs, some of which I very occasionally make use of. For instance, you can use file_chmod() to change file permissions, file_touch() to change file access and modification time metadata, file_temp() to generate a path for a temporary file, and so on. I find I don’t use these as often, but I’m glad they exist.
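For what it’s worth, here is a quick sketch of a few of those helpers working together. One detail worth flagging: as I understand it, file_temp() only hands back a path inside the R temp directory, so the file itself doesn’t exist until something like file_create() makes it. The "644" mode is an arbitrary choice for illustration.

```r
library(fs)

# file_temp() returns a path in the R temp directory
tmp <- file_temp(ext = "txt")
existed_before <- file_exists(tmp)

# create the file, then update its access and modification times
file_create(tmp)
file_touch(tmp)
existed_after <- file_exists(tmp)

# set the permissions and take a look at them
file_chmod(tmp, "644")
file_info(tmp)$permissions

# clean up
file_delete(tmp)
```

File permissions behave differently across operating systems, so the permissions printed at the end may not look the same on every machine.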

– that’s nice hon but seriously, why write any of this? i don’t see the point
– isn’t it enough that i wanted to write it? i mean, why else do we even have a blog, if not to write about whatever we feel like writing about? if other people want to read it, good for them, and if they don’t… also good for them. ffs, the whole idea of “blogging as thought leadership” needs to die in a fire
– so i guess we really are here to fuck spiders huh?
– always, babe. always

Footnotes

¹ No I absolutely will not be going into details about the subtle differences in working memory capacity as a function of modality and age, or the nuances about what precisely comprises a chunk, or whatever in the well-actually fuck you want to nitpick. Do I look like a cognitive scientist to you?
² In the interests of transparency I should mention that if you tried this code as-is within a quarto or R markdown document, it wouldn’t necessarily be displayed in italics like this. That’s a personal affectation I added in this post to more clearly delineate the end of the R output from the start of the markdown text.
³ Sigh. I’m hiding something. If you do this command at the R console, you will indeed get the output shown below. However, if you try to do this from within R markdown or quarto you will not. This is because the output you see here reflects the system stdout, which is different from the R console stdout, and if you want to capture this within the HTML document you have to do something a little fancier, setting intern = TRUE to ensure system() returns the terminal output as a character vector that you can then print to the R console in the usual way with cat(). See this discussion on stackoverflow.

Reuse

CC BY 4.0

Citation

BibTeX citation:

@online{navarro2024,
  author = {Navarro, Danielle},
  title = {For Fs},
  date = {2024-10-06},
  url = {https://blog.djnavarro.net/posts/2024-10-06_fs/},
  langid = {en}
}

For attribution, please cite this work as:

Navarro, Danielle. 2024. “For Fs.” October 6, 2024. https://blog.djnavarro.net/posts/2024-10-06_fs/.