ropen: Remote "open" command for opening remote files locally on OS X

The Problem
-----------

Most Mac OS X power users know about the [“open”](http://tuvix.apple.com/documentation/Darwin/Reference/ManPages/man1/open.1.html) command line tool which opens the files specified as arguments in their default (or a specified) OS X application. Additionally, many OS X text editors, such as TextMate (“mate”) and SubEthaEdit (“see”), come with command line tools which can be used to open files.

These are great when working locally, but obviously do not work remotely. Often when working on remote servers you end up using command line editors which you may not be as familiar with.

ropen’s Solution
----------------

The [ropen](http://github.com/tlrobinson/ropen) tool solves this problem using two simple shell scripts, which make use of MacFuse’s sshfs. You run the “ropen” program on your remote machine(s) when you want to open a remote file locally (this is equivalent to the OS X “open” command). The “ropend” daemon runs on your local OS X machine waiting for open requests, and the “ropen.php” PHP script proxies requests from ropen to ropend.

How it works
------------

1. When ropen is executed it makes an HTTP request to ropen.php with the paths to be opened and application to open them with, if any, as well as the SSH user, host, and port of the remote machine.
2. ropen.php stores this open request in a queue that is tied to ROPEN_SECRET via PHP’s sessions.
3. ropend polls ropen.php every 1 second waiting for open requests. When it receives one it mounts the remote filesystem using sshfs (if it’s not already mounted) and opens the files or directories specified.
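To make step 3 more concrete, here is a minimal sketch of what a ropend-style polling loop could look like. It is not the actual ropend script: the request format (one `USER HOST PORT PATH` line per request), the `ROPEN_URL` variable, the `secret` query parameter, and the `MOUNT_ROOT` directory are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a ropend-style loop; NOT the real ropend.
# Assumed request format, one per line: USER HOST PORT PATH
MOUNT_ROOT=${MOUNT_ROOT:-"$HOME/ropen-mounts"}   # assumed mount location

# Ask the proxy for pending requests (endpoint and parameter name are assumed).
fetch_requests() {
    curl -fsS "$ROPEN_URL?secret=$ROPEN_SECRET"
}

# Mount the remote filesystem if needed, then open the requested path.
handle_request() (
    user=$1 host=$2 port=$3 path=$4
    mnt=$MOUNT_ROOT/$host
    # Only mount if nothing is mounted at the mount point yet.
    if ! mount | grep -qF " $mnt "; then
        mkdir -p "$mnt" && sshfs -p "$port" "$user@$host:/" "$mnt" || exit 1
    fi
    open "$mnt$path"
)

# Poll once per second, handling every request the proxy returns.
ropend_loop() {
    while :; do
        fetch_requests | while read -r user host port path; do
            handle_request "$user" "$host" "$port" "$path" < /dev/null
        done
        sleep 1
    done
}
```

Calling `ropend_loop` starts the polling. Mounting the remote root once per host means later requests for the same machine only need the `open` call.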

More information
----------------

See more information about ropen on the [ropen project page](http://github.com/tlrobinson/ropen).

Determining the absolute path of a shell script

In the course of working on projects like server-side Objective-J, jack, and now narwhal, I’ve often had to write shell scripts that needed to know their location in the filesystem. Rather than hardcoding it, I prefer to infer it automatically at runtime. Unfortunately this isn’t as easy as you would expect.

If the script is invoked with an absolute path (“/foo/bar/baz”) or from your PATH (“baz”), then “$0” in the script will contain the absolute path of the script (“/foo/bar/baz”). However, if it is invoked using a relative path (“./bar/baz” from “/foo”) then $0 will contain the relative path (“./bar/baz”). Furthermore, if the path to the script is actually a symbolic link, you’ll get the symlink’s path instead of the original.

Surprisingly, I couldn’t find a definitive solution that handles all these cases, so I took the various ones I did find and created one which I think handles all the cases I’m aware of:

If you don’t want to resolve the symlinks remove the second half.
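As an illustration of the two cases described above, here is a minimal sketch of one way to do it. It is not necessarily the original snippet; `resolve_script_path` is a hypothetical helper name. The first half makes a relative path absolute, and the second half follows symlinks.

```shell
#!/bin/sh
# Sketch only: resolve_script_path is a hypothetical helper, not the original code.
resolve_script_path() (
    path=$1

    # First half: turn a relative path into an absolute one.
    case $path in
        /*) ;;
        *)  path=$PWD/$path ;;
    esac

    # Second half: follow symlinks until we reach a real file.
    while [ -h "$path" ]; do
        dir=$(dirname "$path")
        target=$(readlink "$path")
        case $target in
            /*) path=$target ;;          # absolute link target
            *)  path=$dir/$target ;;     # target relative to the link's directory
        esac
    done

    # Normalize "." and ".." components (this also resolves symlinked directories).
    dir=$(cd -P "$(dirname "$path")" && pwd -P) || exit 1
    printf '%s/%s\n' "$dir" "$(basename "$path")"
)
```

Inside a script this would typically be used as `SCRIPT_PATH=$(resolve_script_path "$0")`.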

Using command line tools to detect the most frequent words in a file

Antonio Cangiano wrote a post about “[Using Python to detect the most frequent words in a file](http://antoniocangiano.com/2008/03/18/use-python-to-detect-the-most-frequent-words-in-a-file/)”. It’s a nice summary of how to do it in Python, but (nearly) the same thing can be accomplished by stringing together a few standard command line tools.

I’m no command line ninja, but I’d like to think I have basic command of most of the standard filters. Here’s my solution:

cat test.txt | tr -s '[:space:]' '\n' | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -n | tail -10

I’ll explain it blow-by-blow:

cat test.txt

If you don’t know what this does you’ve got a lot to learn. “cat” simply reads files and prints them to standard output (concatenates), for use by subsequent filters.

tr -s '[:space:]' '\n'

“tr” is a handy tool that simply translates matching characters from the first set to the corresponding character of the second set. The first instance turns all whitespace characters (spaces, tabs, newlines) into newlines (“\n”) so that each word is on a separate line (the -s option “squeezes” multiple runs of newlines into a single newline).

tr '[:upper:]' '[:lower:]'

The second instance translates all uppercase characters into lowercase (note: the two “tr”s are separate for clarity, but they could be combined into a single one).

sort | uniq -c

“sort” and “uniq” do exactly as their names imply, but “uniq” only removes adjacent duplicates, so you often want to sort the input first. The “-c” option for “uniq” prepends each line with the number of occurrences.

sort -n

We sort the result of “uniq”, this time by numerical order (“-n”) to get the list of words in order of the number of occurrences.

tail -10

Finally, we get the 10 most frequently occurring words by using “tail” to take only the last 10 lines (since the “sort -n” puts the list in ascending order).

It’s not perfect, especially since punctuation is included in the words, but the “tr” commands can be tweaked as needed.
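One possible tweak, sketched below, is to split on anything that is not a letter, so punctuation never ends up in the words. The `top_words` function name and its optional count argument are made up for this example. Note that this variant also splits contractions such as “don't” into “don” and “t”.

```shell
#!/bin/sh
# Sketch: top_words FILE [N] prints the N most frequent words (default 10).
# top_words is a hypothetical helper name, not part of the original pipeline.
top_words() (
    file=$1
    count=${2:-10}
    tr -cs '[:alpha:]' '\n' < "$file" |   # every run of non-letters becomes one newline
        tr '[:upper:]' '[:lower:]' |      # fold everything to lowercase
        grep -v '^$' |                    # drop the empty line a leading non-letter leaves
        sort | uniq -c |                  # count identical adjacent lines
        sort -rn |                        # most frequent first
        head -n "$count"
)
```

Sorting in descending order and using `head` gives the same words as `sort -n | tail`, just with the most frequent word on the first line.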

Presenting GCCalc: a horrible abuse of GCC

Following an [interesting discussion on Reddit](http://programming.reddit.com/info/62v70/comments) about [first class functions](http://en.wikipedia.org/wiki/First-class_function) in C, I was inspired to see what I could do with this new-found knowledge. The result is what I affectionately call “GCCalc”, for reasons that will become clear below.

GCCalc is a simple command line calculator, much like the common [bc](http://en.wikipedia.org/wiki/Bc_programming_language) calculator on many Unix systems. Its implementation, however, is *very* different from that of most calculators. While bc is said to have “C-like syntax”, GCCalc’s syntax *is* C. Whatever you enter on the command line automatically gets compiled, loaded, and executed, and the result is returned (as a double) and printed to the screen.

You can either enter expressions like:

round(46.95886*sqrt(1+2/9.99*sin((21%5)*pow(2,8))))

or you can enter whole C statements (as long as they’re on one line, for now) like:

int i; for (i=0;i<10;i++) { printf("hello world!\n"); } printf("goodbye\n");

Unfortunately variables are scoped to the function that wraps them, so they don't persist across multiple entries. However, you can access the last result using the "last" variable (a double).

It's been tested on Mac OS X (Leopard) and Linux (Ubuntu Gutsy), with GCC 4. Compile with "gcc -o gccalc gccalc.c" on OS X, or "gcc -o gccalc gccalc.c -ldl" on Linux.

[Here's the source file](http://tlrobinson.net/projects/gccalc/gccalc.c), and here's a syntax highlighted version:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dlfcn.h>
#include <unistd.h>

#ifdef __ELF__
#define GCC_FLAGS "-fPIC -shared"
#define EXTENSION "so"
#else
#define GCC_FLAGS "-dynamiclib"
#define EXTENSION "dylib"
#endif

#define HEADERS "#include <stdio.h>\n#include<math.h>"

typedef double(func_return_double)(double);

unsigned count = 0;
char *cwd;
char tmp_path[1024] = {'\0'};

void *lib = NULL;

int main(int argc, char **argv)
{
    double result = 0.0;
    char input_buffer[1024], code_buffer[2048], function_name[32], command_buffer[1024];
    
    // get our current directory, which we'll use for tmp files (dlopen seems to need absolute paths)
    cwd = getcwd(NULL, 0);
    
    while (1)
    {
        // for unique function and file names (needed for dlopen/dlsym to work correctly)
        count++;
        
        // read in the next line
        printf(">> ");
        if (fgets(input_buffer, sizeof(input_buffer), stdin) == NULL)
            break; // stop on EOF (e.g. Ctrl-D) instead of looping forever
    
        // format the function name
        sprintf(function_name, "f%u", count);
        
        // format the code string: if it doesn’t contain a semicolon, assume it is just an expression
        if (strchr(input_buffer, ';'))
            sprintf(code_buffer, "%s\ndouble %s(double last) { %s\nreturn 0; }", HEADERS, function_name, input_buffer);
        else
            sprintf(code_buffer, "%s\ndouble %s(double last) { return (%s); }", HEADERS, function_name, input_buffer);
            
        // format the filename string, delete the file if it exists
        sprintf(tmp_path, "%s/libtmp%u.%s", cwd, count, EXTENSION);
        unlink(tmp_path);
        
        // format the gcc command string
        sprintf(command_buffer, "gcc -Wall %s -x c - -o %s", GCC_FLAGS, tmp_path);
        
        // execute gcc command, write out the code
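        FILE *fp = popen(command_buffer, "w");
        if (fp == NULL) {
            // could not start gcc; report it and wait for the next line
            perror("popen");
            continue;
        }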
        fwrite(code_buffer, 1, strlen(code_buffer), fp);
        fprintf(fp, "\n");
    
        // pclose waits for gcc to terminate (fclose/close do NOT thus compilation will sometimes not finish prior to the dlopen)
        pclose(fp);

        void *ptr = NULL;
        
        // open the just-compiled dynamic library
        if ((lib = dlopen(tmp_path, RTLD_NOW|RTLD_LOCAL)) == NULL) {
            puts(dlerror());
        }
        // get the function pointer
        else if ((ptr = dlsym(lib, function_name)) == NULL) {
            puts(dlerror());
        }
        
        // execute it
        if (ptr != NULL)
        {
            func_return_double *func = (func_return_double*)ptr;
            result = (*func)(result);
            // print the result
            printf("=> %.*lf\n", (result/((int)result)>1.0)?5:0, result);
        }

        // clean up: close the library, delete the temp file
        if (lib != NULL)
            dlclose(lib);
        unlink(tmp_path);
    }

    return 0;
}
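To see what GCCalc actually hands to gcc, the small shell sketch below mimics the two `sprintf` branches from the listing: input containing a semicolon is treated as statements, anything else as a single expression. `wrap_input` is a hypothetical helper written for this illustration and is not part of gccalc.c.

```shell
#!/bin/sh
# Sketch: print the C source GCCalc would generate for one line of input.
# wrap_input NAME INPUT is a hypothetical helper, not part of gccalc.c.
wrap_input() (
    name=$1
    input=$2
    # Same headers as the HEADERS macro in the listing.
    printf '#include <stdio.h>\n#include <math.h>\n'
    case $input in
        *\;*)  # contains a semicolon: treat as statements, return 0
            printf 'double %s(double last) { %s\nreturn 0; }\n' "$name" "$input" ;;
        *)     # no semicolon: treat as an expression and return its value
            printf 'double %s(double last) { return (%s); }\n' "$name" "$input" ;;
    esac
)

wrap_input f1 'sqrt(2)*last'
wrap_input f2 'int i; for (i=0;i<3;i++) puts("hi");'
```

Piping either output into the gcc command from the listing (for example `gcc -Wall -fPIC -shared -x c - -o libtmp1.so` on Linux) should produce the shared library that GCCalc then loads with dlopen.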

Thanks to jbert on Reddit for the initial code and inspiration.

If only I had known about this back when The Daily WTF was having their [OMG WTF](http://omg.thedailywtf.com/) crazy calculator programming contest…

multiwhich

The “which” Unix command lists the location of the first matching executable in your PATH. The GNU version of “which” has several extra features including the ability to display all matching executables in your PATH, not just the first. This is useful for finding duplicates, etc. Unfortunately, whatever version of “which” is included in Mac OS X (and MacPorts) doesn’t have these extra features.

A quick Google search didn’t turn up anything, and I was in a shell scripting mood when I needed it, so rather than downloading and compiling GNU which I whipped up my own, “multiwhich”:

#!/bin/sh

for PATHDIR in `echo $PATH | tr ":" " "`
do
    sh -c "ls -1 $PATHDIR/$1" 2> /dev/null
done

Simply put this somewhere in your PATH with execute permissions, and type “multiwhich command”.

One other accidental “feature” of this script is the ability to list every executable in your PATH. This is great for finding duplicates:

multiwhich | sort | uniq -c | sort -n

It’s probably not the most elegant way to do it, but it serves its purpose. Perhaps someone will find it useful…

Update: I modified the multiwhich script slightly to support wildcards like “*” and “?”. You can now do things like “multiwhich x*” to get all binaries beginning with “x”, etc.
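For comparison, here is a sketch of a variant that avoids spawning `sh` and `ls` for every directory and copes with spaces in PATH entries. It is an alternative written for illustration, not the updated script mentioned above. Unlike the original, it prints full paths and only regular files that are executable, so the “list everything” trick would need the directory part stripped before counting duplicates.

```shell
#!/bin/sh
# Sketch of an alternative multiwhich; prints every executable in $PATH
# whose name matches the (possibly wildcarded) pattern given as $1.
multiwhich() (
    pattern=$1
    # Split $PATH on ":" with globbing off, so odd directory names survive.
    IFS=:
    set -f
    set -- $PATH
    # Re-enable globbing and disable field splitting for the pattern itself.
    set +f
    IFS=
    for dir in "$@"; do
        [ -n "$dir" ] || dir=.   # an empty PATH entry means the current directory
        for candidate in "$dir"/$pattern; do
            if [ -f "$candidate" ] && [ -x "$candidate" ]; then
                printf '%s\n' "$candidate"
            fi
        done
    done
)
```

Quote wildcards when calling it, as in `multiwhich 'x*'`, so the pattern is expanded inside each PATH directory rather than in the current one.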

OpenWRT on Linksys WRT54GL

As suggested by my friend and [SCEC](/projects/scecvdo) coworker [Kevin Milner](http://www.kevinmilner.net), I finally installed the GNU/Linux based OpenWRT replacement firmware on my house’s Linksys WRT54GL wireless router tonight. It gives you a minimal Linux distribution with most of the features of the WRT54GL’s original firmware built in, plus the ability to add a whole lot more.

It’s extremely easy to install. See page 10 of the [slides](http://www.isi.edu/~faber/pubs/lug-2007-03-22.pdf) [pdf] from Ted Faber’s [USC LUG](http://linux.usc.edu/) talk.

There are a couple of minor things to watch out for:

* If you use WPA (which you should, since WEP can now be broken in [less than a minute](http://www.schneier.com/blog/archives/2007/04/breaking_wep_in.html)) you need to install an additional package, nas. OpenWRT has a simple package management system, ipkg, to do this for you.

* No, the web interface does *not* have port forwarding configuration. Instead you need to use the slightly more complicated, but much more powerful, iptables. This involves editing a configuration file.

* The only account set up by default is the root account, so when trying to log in you must use “root” as the user, not “admin” as with the original firmware. Be sure to set your password immediately after installation.
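To give a feel for the iptables approach to port forwarding, here is a rough sketch using the standard `PREROUTING` and `FORWARD` chains. The helper name `forward_port`, the `IPTABLES` override (handy for a dry run), the interface name and the addresses are all placeholders for illustration. OpenWRT's own firewall configuration file may use different chain names depending on the release, so check the configuration documentation before copying anything.

```shell
#!/bin/sh
# Sketch: forward_port WAN_IF EXT_PORT LAN_IP LAN_PORT
# Hypothetical helper; set IPTABLES=echo to print the rules instead of applying them.
forward_port() {
    ipt=${IPTABLES:-iptables}
    # Rewrite the destination of incoming TCP connections on the WAN side...
    "$ipt" -t nat -A PREROUTING -i "$1" -p tcp --dport "$2" \
        -j DNAT --to-destination "$3:$4" &&
    # ...and let the rewritten traffic through to the LAN host.
    "$ipt" -A FORWARD -i "$1" -p tcp -d "$3" --dport "$4" -j ACCEPT
}
```

For example, `IPTABLES=echo forward_port vlan1 8080 192.168.1.2 80` prints the two rules that would send port 8080 on the WAN interface (assumed here to be `vlan1`) to a web server at 192.168.1.2.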

See the [configuration documentation](http://wiki.openwrt.org/OpenWrtDocs/Configuration) for details on these things and a lot more.

If you want to dabble with Linux and learn a little about networking (and you have a [compatible router](http://wiki.openwrt.org/TableOfHardware)) I highly recommend trying out OpenWRT.