Saturday, September 4, 2010

Idioms - Using Colorful Language

Pie in the sky. In a New York minute. On the other hand. Costs an arm and a leg. In the black. Mad skills. All Greek to you? To a non-native English speaker, common idioms like these are often challenging, since their meaning is only loosely tied to the words that are used. In the same way, most programming languages have idioms that can look confusing to someone first learning the language, but which are used to perform common tasks.
Perl has many idioms. Here is a common one for "slurping" a file, that is, reading the entire file into a single variable (rather than the default of reading one line per "readline" call):

{ local $/; $contents = <$filehandle> }

This code introduces a new scope block with curly braces and declares the Input Record Separator, $/, local to that block, which sets its value to undef instead of a newline. With no record separator defined, the <> operator reads a single "line" of text from the filehandle, which turns out to be the entire contents of the file (from the current position in the file, of course). The diamond operator itself is rather like an idiom, being a slightly magical way of saying readline $filehandle. The closing curly brace ends the scope block and returns $/ to its previous value.

Here's an idiom from Python that is often used in modules:

if __name__ == "__main__":
    main()

The idea here is to define a special behavior for when the module is used as a script, rather than being imported. When the file is imported, __name__ will be set to the name of the module, and this block will not run. When the file is used like python filename.py, however, the condition will be true, and the main function will be called. This is a convenient way to make a dual-purpose program that can be included as a module or run on its own. It could also be used as a place to put module tests.
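
To make the dual-purpose layout concrete, here is a minimal sketch. The file name wordcount.py and the functions word_count and main are hypothetical names chosen for illustration, not part of any real library.

```python
import sys


def word_count(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())


def main(argv):
    """Script entry point: count the words given on the command line."""
    words = " ".join(argv[1:])
    print(word_count(words))
    return 0


if __name__ == "__main__":
    # True only when the file is run directly, e.g. `python wordcount.py a b c`.
    # On `import wordcount`, __name__ is "wordcount" and this block is skipped.
    main(sys.argv)
```

Run as python wordcount.py pie in the sky, this prints 4. After import wordcount, nothing is printed, and the importing code can call wordcount.word_count itself.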

C has lots of idioms to choose from, but here's the most recent one I came across:

struct myStruct {
    int num;
    char array[1];
};

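struct myStruct *item;
size_t length = 10;
size_t i;
/* One array element is already part of the struct, so add room for length - 1 more. */
item = (struct myStruct *)
    malloc( sizeof(struct myStruct) + sizeof(char) * (length - 1) );
item->num = 3;
for (i = 0; i < length; ++i) {
    item->array[i] = (char)(item->num + i);
}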

This code isn't strictly portable: the C standard does not guarantee that indexing past the declared one-element array works, although it works in practice with many compilers, and it definitely works with Microsoft Visual Studio 2005. Essentially, declaring a struct with a one-element array at the end lets you allocate a struct with a variable-length array element. The trick is to never declare an instance of the struct, but instead use pointers and allocate dynamic memory. Since this idiom is fairly common, the C99 standard defined a way to declare flexible array members by leaving out the array length, like so:

struct c99struct {
    int num;
    char array[];
};

This form is guaranteed to work in C99-compliant compilers (a set that does not include Visual Studio 2005). Because sizeof(struct c99struct) does not count the flexible array, the allocation adds room for all length elements rather than length - 1.

Just a few idioms to get you started. I find the best way of learning new idioms in a programming language is reading other people's code and looking for parts I don't understand.