Friday, September 13, 2013

Supporting multiple languages

This is much more technical than is needed for a discussion about Open Source Software & Usability, but since having a program operate in your preferred language affects software usability, I thought I'd briefly explain how software supports multiple languages. Adding support for different languages is easy. If you are a programmer working on an open source software project, I encourage you to become familiar with these or similar methods. By including different languages, you increase your program's audience, and you find new ways for people to contribute to your project (by translating your program into other languages).

Most programming languages provide some kind of support for multiple languages. One of the most common is the catgets method used in the C programming language. The "cat" in catgets refers to a "message catalog," and the "gets" refers to "get a string from that message catalog." You can think of a "message catalog" as a list of all the messages that a program will need to print out, and a "string" is just a message.

I've previously mentioned that I do a lot of work in open source software, much of it in the FreeDOS Project. I wrote a version of "catgets" for FreeDOS, so I think I can explain how catgets works:

Programs first need to open a "message catalog" before they can use it. This is done using the catopen function. The catopen function figures out what language you use and opens the right message catalog for that language. So programs have a different message catalog for each language they support.

The catopen function returns a special code (a "catalog descriptor") that the program can use to refer to just that "message catalog." A large program might use several message catalogs. But to keep things simple, I'll show you just one message catalog.
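To make the "open" step concrete, here is a minimal sketch in C. It assumes a POSIX system that provides <nl_types.h> and the NL_CAT_LOCALE flag; other implementations may expect a different flag value. The helper names open_catalog and catalog_is_open are my own, made up for illustration.

```c
#include <assert.h>
#include <locale.h>
#include <nl_types.h>
#include <stddef.h>

/* Open the catalog for the user's language.
   Returns (nl_catd) -1 if no catalog could be opened. */
nl_catd open_catalog(const char *name)
{
    assert(name != NULL);

    /* Adopt the language settings from the environment (LANG, LC_ALL, ...). */
    setlocale(LC_ALL, "");

    /* NL_CAT_LOCALE asks catopen to choose the catalog using the
       LC_MESSAGES locale category. */
    return catopen(name, NL_CAT_LOCALE);
}

/* catopen signals failure with (nl_catd) -1, so compare against that. */
int catalog_is_open(nl_catd catalog)
{
    return catalog != (nl_catd) -1;
}
```

A program would call open_catalog("Foo") once at startup and keep the returned value for later lookups. If catalog_is_open reports failure, the program can still run using its built-in messages.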

Once the program has opened the message catalog, it can get messages from that catalog using the catgets function. For example, if the program wants to say "Hello" it gets that message (called a "string") from the message catalog. Each message (or "string") has a specific number assigned to it. Actually, it has two numbers assigned to it: one number for the "message set" and one for the "message" itself. A program might put all error messages into one set ("1") and all information messages into another set ("2"). Let's assume the message "Hello" is message 1 in the "information" message set ("2").

Since catopen already figured out what language you want to use, catgets will always pick messages from the right language.
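Bare numbers like "set 2, message 1" are easy to mix up, so one option is to give them names. The sketch below is only an illustration: the constants SET_ERROR, SET_INFO, and MSG_HELLO and the helper get_message are hypothetical names I chose, and the code assumes the POSIX <nl_types.h> interface.

```c
#include <assert.h>
#include <nl_types.h>
#include <stddef.h>

/* Hypothetical names for the set and message numbers used in the text. */
enum { SET_ERROR = 1, SET_INFO = 2 };
enum { MSG_HELLO = 1 };

/* Look up one message. If the catalog is not open, or it does not
   contain the message, the built-in fallback text is returned. */
const char *get_message(nl_catd catalog, int set, int msg,
                        const char *fallback)
{
    assert(fallback != NULL);

    if (catalog == (nl_catd) -1)
        return fallback;    /* no catalog: use the built-in text */

    return catgets(catalog, set, msg, fallback);
}
```

With this in place, get_message(catalog, SET_INFO, MSG_HELLO, "Hello") reads a little more clearly than a call with the numbers 2 and 1 written out.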

When the program is done, it needs to close the "message catalog" using a function called catclose.
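One detail worth keeping in mind when closing: on some implementations, the string that catgets returns points into memory owned by the catalog, so it may no longer be usable after catclose. A cautious approach is to copy the text first. This is a sketch under that assumption, again using the POSIX <nl_types.h> interface; fetch_and_close is a hypothetical helper name.

```c
#include <assert.h>
#include <nl_types.h>
#include <stdio.h>

/* Copy message `msg` of set `set` into buf, then close the catalog.
   Returns 1 if the text came from the catalog, 0 if the fallback was used. */
int fetch_and_close(nl_catd catalog, int set, int msg,
                    const char *fallback, char *buf, size_t size)
{
    const char *text = fallback;
    int found;

    assert(fallback != NULL && buf != NULL && size > 0);

    if (catalog != (nl_catd) -1)
        text = catgets(catalog, set, msg, fallback);

    /* catgets hands back the fallback pointer itself when lookup fails. */
    found = (text != fallback);

    /* Copy before closing, in case the pointer refers to catalog memory. */
    snprintf(buf, size, "%s", text);

    if (catalog != (nl_catd) -1)
        catclose(catalog);

    return found;
}
```

For a program that prints its messages right away and only closes the catalog at exit, the copy is not needed; it matters when a message has to outlive the catalog.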

To a programmer writing a simple program called "Foo," it all comes together like this:
catalog_number = catopen("Foo", MCLoadAll); /* You don't need to know what "MCLoadAll" means for this example */

string = catgets(catalog_number, 2, 1, "Hello"); /* In case "catgets" can't find message 1 in set 2, it just gives you "Hello" */
puts(string);

catclose(catalog_number);
And somewhere, the program has separate "message catalogs" for English, Spanish, German, … and so on.

Creating the message catalogs is fairly straightforward. Under catgets, this is actually a two-step process: write a file containing all the messages that your program will use, then use gencat to convert this into a message catalog that can be easily opened by catopen.

For example, a message file (foo.msg) might look like this:
$set 1
1 Error reading file.
2 Abort!

$set 2
1 Hello world!
2 I like pie.
In that file, error messages are in one set ("1") and all information messages are in another set ("2"). So message 1 in set 2 (information) is "Hello world!"

At the command line, the programmer can convert a message file ("msgfile", such as foo.msg above) into a message catalog ("catfile") using the gencat command:
gencat catfile msgfile
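Once the catalog file exists, catopen still has to find it. On POSIX systems, the NLSPATH environment variable tells catopen where to look, and %N in that path is replaced by the name passed to catopen. The sketch below assumes the catalog was saved as "<name>.cat" in the current directory; the ".cat" suffix, the "./" location, and the helper name open_local_catalog are my assumptions for illustration, not something the catgets method requires.

```c
#define _XOPEN_SOURCE 700   /* for setenv and the catalog functions */

#include <assert.h>
#include <nl_types.h>
#include <stdlib.h>

/* Open "<name>.cat" from the current directory.
   Returns (nl_catd) -1 if the file cannot be opened as a catalog. */
nl_catd open_local_catalog(const char *name)
{
    assert(name != NULL);

    /* In NLSPATH, %N is replaced by the name passed to catopen. */
    setenv("NLSPATH", "./%N.cat", 1);

    return catopen(name, 0);
}
```

For example, after running "gencat foo.cat foo.msg", a call to open_local_catalog("foo") would look for ./foo.cat. A real program would more likely leave NLSPATH to the user or installer, often with %L in the path so each language gets its own catalog file.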
