One of the really fun things about switching to Linux is realizing just how much you can accomplish with a few lines of Bash. Especially when those lines take advantage not just of built-in programs, but of other things you made before.
Today I published a couple of tiny tools I use to make my life easier, and I'd like to talk about them a little bit.
Basic functionality
Let me set the stage first. Let's imagine that you have access to some magic-tool that checks the current folder name, decides on some content that folder should have, reaches out to the Internet and retrieves it. (Perhaps that is, itself, a wrapper for a more simple-minded application that has to be given a URL directly.) And let's suppose there are a ton of such folders you want to populate.
You duly start out:
$ mkdir first-folder && cd first-folder
# Good, it worked; so now I should...
$ magic-tool
$ cd ..
$ mkdir second-folder && cd second-folder
# I think I have the hang of this...
$ magic-tool
$ cd ..
Hmm, that's going to get repetitive. Better if we could do it in a single step. Thinking a bit more, you come up with a subshell that runs a single && chain:
$ (mkdir third-folder && cd third-folder && magic-tool)
Nice; magic-tool only runs when the folder creation is successful, and we don't need to cd .. — because only the subshell's working directory was affected.
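To see both properties in action, here is a small self-contained sketch. It works in a scratch directory from mktemp and uses echo as a stand-in for magic-tool; the folder name demo-folder and the variable names are only for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
before=$PWD

# First run: mkdir succeeds, so the chain continues inside the subshell.
first=$( (mkdir demo-folder && cd demo-folder && echo "ran in ${PWD##*/}") )
echo "$first"

# The parent shell never moved, even though the subshell did a cd.
after=$PWD

# Second run: mkdir fails (the folder already exists), so cd and echo are
# skipped and the subshell as a whole reports failure.
if (mkdir demo-folder 2>/dev/null && cd demo-folder && echo "not reached"); then
    second="chain completed"
else
    second="chain stopped at mkdir"
fi
echo "$second"
```

Comparing $before and $after afterwards shows the same path, which is exactly why no cd .. is needed.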
But now to set up fourth-folder, fifth-folder etc. we'll need to up-arrow to the previous command and edit the folder name in two places before running again. Better if we only had to do our typing once, which is easy enough with a for loop if we know the folder names ahead of time:
$ for folder in fourth-folder fifth-folder; do
> (mkdir $folder && cd $folder && magic-tool)
> done
Okay, but what if we're coming up with folder names on the fly? One way is to make a function:
$ get-content() { (mkdir $1 && cd $1 && magic-tool) }
$ get-content sixth-folder
# Ah, that's much nicer.
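The function above is fine for simple names. A slightly more defensive variant, sketched here as a suggestion rather than as what the published tools do, quotes the argument so that folder names containing spaces survive. Since magic-tool is imaginary, this sketch defines a stand-in that just records the folder it ran in; the file name fetched.txt is made up for the demonstration.

```shell
# Stand-in for the imaginary magic-tool: it writes the name of the folder
# it was run in to a file, so we can see where it ran.
magic-tool() { basename "$PWD" > fetched.txt; }

get-content() {
    # Quoting "$1" keeps a name like "my folder" as one word, and "--"
    # stops a name that starts with "-" from being read as an option.
    (mkdir -- "$1" && cd -- "$1" && magic-tool)
}
```

With this version, get-content "seventh folder" creates one folder with a space in its name instead of two separate folders.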
Great; we learned something and made something useful for the future...
Persistence
Except, suppose a few minutes later we close the terminal window (deliberately or not). Or maybe we were working in a nested shell (for example, one spawned while browsing around with ncdu, which I highly recommend by the way) and exited out of that. Our get-content function is gone! And it also temporarily disappears, by default, if we start a new Bash instance from the current one. (A true ( ... ) subshell does inherit functions; a freshly launched bash does not. We can fix that with export -f, but it doesn't solve the other problems.)
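Here is a quick way to check where a function is and isn't visible. The function name greet and the variable names are just for the demonstration.

```shell
greet() { echo "hello from greet"; }

# A ( ... ) subshell is a copy of the current shell, so the function is there.
in_subshell=$( (type -t greet) )

# A new bash process starts fresh and knows nothing about greet.
in_child_before=$(bash -c 'type -t greet || echo missing')

# export -f passes the definition through the environment to child bash
# processes started from this shell.
export -f greet
in_child_after=$(bash -c 'greet')

printf '%s\n' \
    "subshell: $in_subshell" \
    "child before export: $in_child_before" \
    "child after export: $in_child_after"
```

Note that the export only reaches processes started from this shell; a new terminal window still starts without the function.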
The obvious fix is to add the function to a ~/.bashrc, ~/.bash_aliases or similar file, so that it gets re-created as needed. But, well, hindsight is 20/20, and at the time, that would have been work. In some simple cases we might be able to get the quoting right and do it all on the command line:
$ echo 'get-content() { (mkdir $1 && cd $1 && magic-tool) }' >> ~/.bashrc
... At least, as long as the file already ended with a newline. And as long as the file is writable (I like to leave mine read-only by default, as a security measure; it doesn't do much, but lazily-written malware might not check, and therefore might fail to write some persistent startup hook to ~/.bashrc while I'm not looking because it didn't chmod u+w first).
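Both caveats can be checked before appending. The helper below is a hypothetical sketch (the name append_snippet is made up); it refuses to touch a file that isn't writable instead of changing permissions, and adds a newline first if the file doesn't already end with one.

```shell
append_snippet() {
    local file=$1 snippet=$2

    # Leave read-only files alone rather than silently chmod-ing them.
    if [ ! -w "$file" ]; then
        echo "append_snippet: $file is not writable" >&2
        return 1
    fi

    # Command substitution strips a trailing newline, so this is non-empty
    # only when the last byte of a non-empty file is something else.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi

    printf '%s\n' "$snippet" >> "$file"
}
```

Usage would look like append_snippet ~/.bashrc 'get-content() { (mkdir $1 && cd $1 && magic-tool) }', with single quotes so that $1 is written literally.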
More generally, we might want to open up the file in an editor, having copied the command to the clipboard, and then try to remember how to paste it in Vim again ("+p, just for reference), and then do a bit more editing... and saving... and re-chmodding...
... With that much friction, I ended up forgetting about it a lot of the time. And late last year, I had gotten fed up with getting burned by that.
Just type it for me, please
To fix the problem, I decided to make a Python script to automate the writing and saving step. The idea was, I could use the type builtin to output the function (with a bit of decoration) and pipe that to a script, which would parse it a bit and then write out a Bash script file. I already had ~/.local/bin set up on my PATH, so that was a natural place both for the Python script and its output.
For those unfamiliar with type, let's go back in time a bit and see how it works:
$ type get-content
get-content is a function
get-content ()
{
    ( mkdir $1 && cd $1 && magic-tool )
}
So, there's plenty of information we can use to verify that the input came from typeing a function, and it's also straightforward to extract the function body. So now I could type get-content | func2cmd, and get an executable script at ~/.local/bin/get-content like:
#!/bin/bash
( mkdir $1 && cd $1 && magic-tool )
Now it's not just persistent; I didn't have to bypass my own write-protection, and I have an independent file I can easily show off to (or share with) others. Actually, since this will run in a separate Bash instance anyway, the parentheses are unnecessary; they'll waste a couple milliseconds, but I could also just edit them out of the file later. More importantly, I won't be punished for forgetting them by ending up with an unexpected current working directory.
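For a rough idea of the parsing involved, here is a minimal sketch in Bash. It is not the published func2cmd, only an illustration under some assumptions: type prints its English "is a function" message, the layout is exactly the one Bash produces, and the body is indented by four spaces. The name func2cmd_sketch and the optional output-directory argument are hypothetical.

```shell
func2cmd_sketch() {
    local outdir=${1:-"$HOME/.local/bin"}
    local -a lines=()
    local line

    # Read all of stdin, one array element per line.
    while IFS= read -r line; do
        lines+=("$line")
    done
    local n=${#lines[@]}

    # Expected layout: "NAME is a function", "NAME () ", "{", body..., "}".
    if (( n < 5 )) || [[ ${lines[0]} != *" is a function" ]]; then
        echo "func2cmd_sketch: input is not 'type' output for a function" >&2
        return 1
    fi
    local name=${lines[0]%" is a function"}
    if [[ ${lines[1]} != "$name ()"* || ${lines[2]} != "{"* || ${lines[n-1]} != "}" ]]; then
        echo "func2cmd_sketch: unexpected function layout" >&2
        return 1
    fi

    local target=$outdir/$name
    {
        echo '#!/bin/bash'
        # The body sits between "{" and "}"; drop the indent that type adds.
        printf '%s\n' "${lines[@]:3:n-4}" | sed 's/^    //'
    } > "$target" && chmod +x "$target"
}
```

One known rough edge of this sketch: stripping the indent would also alter here-document lines that genuinely start with four spaces.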
Tariffs
The ergonomics of the above are still not optimal. So I thought of using func2cmd itself to make a make-command command that just does type $1 | func2cmd. But that doesn't really work, because that command runs in its own instance, like I was just saying, so of course when it invokes type it can't see the function. I would need to export -f before using make-command, and of course it doesn't work to add that to the make-command code (I think I actually did try that at some point). Catch-22. So there's a bit of friction with our exporting.
After an embarrassingly long period of confusion and ugly half-working hacks, I realized that the solution was to relent and put this part, as a function, in ~/.bash_aliases. (But there's also a lot of older stuff in there that I really ought to dump out into scripts instead...)
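As a function, make-command runs inside the interactive shell itself, so type can see functions that were never exported. A sketch of what such a definition might look like follows; the guard using type -t is my own addition for illustration, and func2cmd is assumed to be available as a command.

```shell
make-command() {
    # type -t prints "function" only when the name is a function
    # defined in this very shell.
    if [[ $(type -t "$1") != function ]]; then
        echo "make-command: '$1' is not a function in this shell" >&2
        return 1
    fi
    type "$1" | func2cmd
}
```

After defining get-content at the prompt, make-command get-content would then do the whole job in one step.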
Adding Hubris to Laziness and Impatience
After a while of using this script, I decided to add a /usr/local/bin/share-command which would copy my user-level scripts into /usr/local/bin. The idea is, I have multiple user accounts on my computer, so that I can associate a suite of Internet accounts (email, chat clients, websites...) with each and thus readily assume a different "identity". (No, I'm not telling you any other account names. The point is to be able to use the Internet without association to my real name, like this username has.) By making a system-wide installation, I could switch accounts without losing access to my most useful commands.
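The copying step could be as small as the sketch below. This is a guess at the shape of such a tool, not the script from the repository; the name share_command_sketch and the optional source and destination arguments are hypothetical, and writing to the real /usr/local/bin would normally need root.

```shell
share_command_sketch() {
    local name=$1
    local src_dir=${2:-"$HOME/.local/bin"}
    local dest_dir=${3:-/usr/local/bin}
    local src=$src_dir/$name

    if [ ! -f "$src" ] || [ ! -x "$src" ]; then
        echo "share_command_sketch: $src is not an executable file" >&2
        return 1
    fi

    # install copies the file and sets its mode in one step.
    install -m 755 "$src" "$dest_dir/$name"
}
```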
Then, a few days ago, I got the urge to rewrite func2cmd in Bash, and then publish it all on GitHub and tell you about it here (and on Hacker News). It's all under the Unlicense because I can't even really see a point in asserting copyright; really anyone could have come up with this and it's... honestly pretty rough. Hopefully you won't have too much difficulty adapting anything in the repository to your own purposes.
And Now for Something Completely Different
I'd like to take a moment to plug the excellent (and fairly popular) pyp tool, by Python core developer Shantanu Jain. (I wish I were so cool as to have thought of an anagram of my real name to use as a username. Alas.)
Part of what motivated the Bash rewrite of func2cmd was the most recent time I used it. Which started out with an HN post a week ago. Someone had posted some interesting-looking Unicode and I wanted to know what character I was looking at:
$ python -c 'import unicodedata; print(unicodedata.name("ന്ന"))'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
TypeError: name() argument 1 must be a unicode character, not str
Did I say character? I meant characters, apparently. (More about "grapheme clustering" in a future post, I'm sure.)
$ python -c 'import unicodedata; for c in "ന്ന": print(unicodedata.name(c))'
  File "<string>", line 1
    import unicodedata; for c in "ന്ന": print(unicodedata.name(c))
                        ^^^
SyntaxError: invalid syntax
Right, Python's semicolon usage isn't that flexible.
$ python -c 'import unicodedata' 'for c in "ന്ന": print(unicodedata.name(c))'
Right, python -c only takes a single command string; anything after it is passed along in sys.argv rather than run as code. As it turns out, a multi-line string works fine, and is easy to input in Bash given that you've already opened a single-quoted Bash string:
$ python -c 'import unicodedata
> for c in "ന്ന": print(unicodedata.name(c))'
MALAYALAM LETTER NA
MALAYALAM SIGN VIRAMA
MALAYALAM LETTER NA
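Two related details, sketched from the shell side. This assumes a python3 executable on PATH (the session above uses python), and writes the three code points as \u escapes so the result doesn't depend on the terminal's encoding. The first command shows where extra arguments after the -c string end up; the second uses a quoted here-document, another way to hand Python a multi-line program.

```shell
# Arguments after the -c string are not executed; they land in sys.argv.
argv_out=$(python3 -c 'import sys; print(sys.argv[1:])' first second)
echo "$argv_out"

# "python3 -" reads the program from stdin; quoting 'EOF' stops the shell
# from expanding anything inside the here-document.
names=$(python3 - <<'EOF'
import unicodedata
for c in "\u0d28\u0d4d\u0d28":
    print(unicodedata.name(c))
EOF
)
echo "$names"
```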
But as you can see, for the actual post I did a bit of code golf instead.
And then I remembered about pyp and re-installed it, and was amazed how much easier the task could be, and edited the post to include that example.
But then later, I read the pyp documentation and realized that you can get runnable scripts out of it:
$ pyp --explain 'map(unicodedata.name, "ന്ന")'
#!/usr/bin/env python3
import unicodedata
import sys
from pyp import pypprint
assert sys.stdin.isatty() or not sys.stdin.read(), "The command doesn't process input, but input is present. Maybe you meant to use a magic variable like `stdin` or `x`?"
output = map(unicodedata.name, 'ന്ന')
if output is not None:
    pypprint(output)
Which is really amazing. Well, except for the fact that the pypprint import will only work within pyp's environment. And if you use pipx or uv to install it, a script run with the system Python won't normally be in that environment; what you'd want is to run it with the Python from the virtual environment where pyp (really the pypyp package) was installed. Which, as we all know (we do all know this, right? Well, I have a post lined up...), doesn't require activating that environment; it just requires using the path to the environment's Python symlink in the shebang.
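Finding that path can be sketched as a small helper. It assumes, as with pipx-style installs, that the command on PATH is a symlink into the environment's bin directory, next to that environment's python; the name venv_python_for is hypothetical.

```shell
venv_python_for() {
    local tool link

    tool=$(command -v "$1") || { echo "not found: $1" >&2; return 1; }
    link=$(readlink "$tool") || { echo "$tool is not a symlink" >&2; return 1; }

    # A relative symlink target is relative to the directory of the link.
    case $link in
        /*) ;;
        *) link=$(dirname "$tool")/$link ;;
    esac

    # The environment's interpreter lives alongside the real executable.
    printf '%s/python\n' "$(dirname "$link")"
}
```

The printed path is what would go after #! on the first line of the generated script.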
After a bit of hammering, I had a function which could take the output of pyp --explain and write it to a new file, but replacing the shebang line with one pointing to the correct Python. It must have been something like:
$ make-pyp() {
> file=~/.local/bin/"$1";
> env="$(dirname $(readlink $(which pyp)))";
> echo "#!""$env/python" > "$file";
> pyp --explain "$2" | tail -n +2 >> "$file";
> chmod +x "$file"
> }
which func2cmd promptly hammered into shape:
#!/bin/bash
file=~/.local/bin/"$1";
env="$(dirname $(readlink $(which pyp)))";
echo "#!""$env/python" > "$file";
pyp --explain "$2" | tail -n +2 >> "$file";
chmod +x "$file"
Which then allowed me to do:
$ make-pyp chars 'map(unicodedata.name, " ".join(sys.argv[1:]))'
Which produced a working script where now I can do:
$ chars čeština 日本語 ❤️
LATIN SMALL LETTER C WITH CARON
LATIN SMALL LETTER E
LATIN SMALL LETTER S WITH CARON
LATIN SMALL LETTER T
LATIN SMALL LETTER I
LATIN SMALL LETTER N
LATIN SMALL LETTER A
SPACE
CJK UNIFIED IDEOGRAPH-65E5
CJK UNIFIED IDEOGRAPH-672C
CJK UNIFIED IDEOGRAPH-8A9E
SPACE
HEAVY BLACK HEART
VARIATION SELECTOR-16
Which, for my particular combination of interests, is really handy.
Meta
I've been using a template for my blog posts where I put a "meta" section (including a hit counter) right after the teaser, before getting into the main content. There have been occasions where that made sense, but I think that going forward I'll put this section at the end, and probably update my previous posts as well. If I really need to call attention to something meta, it's probably about the entire blog more than it is about the post, and as such it would deserve a separate post entirely.