Saturday, September 5, 2020

Some useful Unix utility replacements written in Rust

As I continue hurtling towards my Draft Thesis Review on the 23rd, frantically working on papers and my thesis to have ready for review, I thought I'd take a little time to show off some nifty command line utilities I came across recently. I learned about them from this useful blog post, and I'm only going to cover a few of the things listed there, so be sure to check it out for yourself. I mentioned nearly a year ago that I was trying out the Xonsh shell ("konsh", like the snail), and it turns out I'm still using it (it's got some really helpful features, like suggesting commands based on what you're typing and what you've used in the past), but these utilities work just as well in Bash, and presumably in other shells too.

All of these utilities are written in Rust, a programming language which has been out for a decade at this point but which I only heard about for the first time in the early part of this year. It's a compiled, statically-typed language which bills itself as being like the venerable C language, but with a bunch of memory-safety features built directly into it. It's not exactly widespread in use at this point, but the people who use it apparently love it, and I've been somewhat interested in learning it for a while (maybe when I have time again). Anyway, let's get to the cool new utilities, which tend to be advertised as smarter, updated replacements for traditional Unix utilities you might be familiar with already. Today I'll briefly review two of them: fd, an updated find, and sd, an updated sed.

fd is first because it's the one I've found most useful personally so far. If I were to sum it up in one sentence, it'd be: "A utility that works like I always expect find to." find is an incredibly powerful utility, there's no doubt about that, but that comes at a cost of complexity. Let's say I'm in a directory, and I know that somewhere in the directories contained within this one is a file named “add_actions.lua”. I don't remember where, though, so I try to use find to locate it:

$ find add_actions.lua
find: ‘add_actions.lua’: No such file or directory

Well, that's not very helpful. The correct way to do what I want is to add a -name flag before the name of the file I'm looking for; this finds the file correctly:

$ find -name add_actions.lua
./files/scripts/add_actions.lua

I never remember this, however; just figuring this out for this example took me a few minutes of trying and reading the manual for find. I thought maybe you need to specify that you want to start searching in the directory you're currently in, by adding a period after find; this, it turns out, is unnecessary, as that's the default action, at least in the GNU version of find (so now I've wasted time remembering something superfluous). I also thought maybe I needed to specify that I was searching for things of type ‘file’ (and not, say, ‘directory’), which ultimately also turned out to be unnecessary but took me extra time to verify. Now, the fact that the correct version doesn't require those additions does make the comparison slightly less impressive, but let's see how you would do this using fd:

$ fd add_actions.lua
files/scripts/add_actions.lua

Boom. No need to add additional flags; it just intelligently assumes I'm giving the file name (or technically a regular expression to search against) if there are no flags or other arguments, and finds what I'm looking for by searching recursively starting from the current working directory. No looking up manuals or reading help files needed. Now, you might argue that this is a very simple example, and that's the point. I just want my computer to do what I want it to do, quickly, so I can get back to doing whatever it was that caused me to need to find this file in the first place: fast, simple, easy, done. Much like find, fd comes with a host of options and flags which you can use to modify and specify your finding operation. I haven't looked into them deeply, and it's possible there are some use cases which find can handle which fd can't. And that's perfectly fine; computers have enough storage these days to hold both of them at once.

You might also say (in fact, I'll say it) that if I used find more frequently I'd memorize its idiosyncrasies and not have this problem. And that's true: if I used it a few times per day I'd probably memorize it in a few days at most. But the fact is, I don't. I use find sporadically, perhaps every few weeks or even months, at just long enough intervals that I forget how to use it in between. (Especially if I were actually trying to perform a more complicated operation, such as only searching for files between two and four levels down, created more than three weeks ago, and larger than 5 MB in size, for example. find can do all of that. I definitely don't remember how.)
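For the record, I did eventually look it up. Here's a rough sketch of what that more complicated search might look like, assuming GNU find. The function name big_old_files is just something I made up for illustration, and since find doesn't have a portable test for when a file was created, this uses the last-modified time as a stand-in:

```shell
# Hypothetical helper (my own name, not part of find or fd): list large,
# old files sitting two to four levels below a starting directory.
# Assumes GNU find.
big_old_files() {
    # -mindepth 2 -maxdepth 4 : only look two to four levels down
    # -type f                 : regular files only, no directories
    # -mtime +21              : last modified more than three weeks ago
    #                           (find counts in whole days; this stands in
    #                           for "created", which find can't test portably)
    # -size +5M               : larger than 5 MiB
    find "${1:-.}" -mindepth 2 -maxdepth 4 -type f -mtime +21 -size +5M
}
```

Calling big_old_files with no argument searches from the current directory; passing a directory name searches from there instead. I would absolutely have to look all of this up again next month.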

Now that I've written it, I'm not quite sure whom this slightly long-winded apologia is directed against; die-hard find users who oppose making computer usage “too easy” for other people? (I mean, I know such people exist, but I doubt many of them read this blog.) Anyway, the basic point is that fd uses intelligent defaults to simplify your ability to find files using the command line and keep you from having to memorize specific details which serve to slow you down if you haven't. Let's look at a slightly more complicated example, using sd. Suppose I have a text file, ‘test.txt’, with the phrase “The rine in Spine falls minely on the pline,” and I'd like to correct it to a more Received English pronunciation. With sed, you could do:

sed -i s/ine/ain/g test.txt

This will change the text in the file to “The rain in Spain falls mainly on the plain.” (Which seems hydrologically unlikely if there are mountains nearby due to the rain-shadow effect, but I digress.) There are a few things of note in this command: the -i flag causes the substitutions to happen in the file, rather than merely printing the changed output to the terminal. The ‘s’ at the beginning tells sed this is a substitution. The ‘g’ at the end makes the substitution happen everywhere the pattern ‘ine’ is found, instead of just at the first location per line in the file. And the slashes work because there aren't any slashes in the text, but if there were I'd have to get creative with the symbols used to separate the before and after patterns. Contrast this with the equivalent command using sd:

sd ine ain test.txt

No flag to remember to add to make it have an actual effect on disk. No need to tell it yes, you want to change this in all locations rather than just the first one per line. And no arcane symbols separating patterns, making the whole thing much more readable, especially when you start using more complicated regular expressions with symbols in them. Now, sd is even less of a replacement for sed than fd was for find, because sed can actually do a lot more than just simple substitution; but sd acknowledges this on its GitHub page and says that it's just intended to focus on doing one thing, and doing it well (which is the Unix way, at heart!).
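Going back to the slash problem for a moment: sed will accept almost any character as the separator after the ‘s’, so a pattern that itself contains slashes can be written with, say, pipes instead. Here's a minimal sketch, assuming GNU sed (as far as I know, the BSD/macOS version wants -i '' rather than a bare -i). The file and the paths in it are made up purely for illustration:

```shell
# Throwaway demo file with made-up contents (not from anything above).
demo=$(mktemp)
printf '%s\n' '/usr/local/bin/tool' '/usr/local/lib/libtool.so' > "$demo"

# Using '|' as the separator means the slashes in the pattern
# don't need to be escaped with backslashes.
sed -i 's|/usr/local|/opt|g' "$demo"

# Show the edited file.
cat "$demo"
```

Afterwards the two lines read /opt/bin/tool and /opt/lib/libtool.so. It works, but it's exactly the kind of trick I'd have forgotten by the next time I needed it.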

Anyway, that's enough to give you a taste of what these utilities can do and how they do it. I definitely suggest you check out the post where I found these and see which ones you might want to use for yourself, as there are quite a few covering a lot of different use cases. I've only actually used a few of them so far, but I've really enjoyed the ones I have, so hopefully you find something here to spice up your command line usage. (If nothing else, fd makes me actually eager to search for files using the command line, instead of reluctant, the way find does.) Happy computing! A hui hou!
