infundibulum

Translation watch…

August 25th, 2005

I’m subscribed to some news feeds that send me updates with articles about translation, and here’s the latest one I came across:

Speaking the same language

Police are reaching out to migrant communities, with information in seven languages now available on the force’s national website.

Chinese, Arabic, Hindi, Japanese, Korean, Somali and Vietnamese speakers can now access police information online.

The site which was translated by NZ Translation Centre explains how to contact local police and liaison officers as well as giving tips on crime prevention and safety tips.

So I took a look at the site itself:

New Zealand Police Official Website

All said, it’s a pretty nice site — they’ve done a good job localizing it. One interesting bit, however: the character encodings aren’t consistent.

Arabic UTF-8
English ISO-8859-1 (Latin-1)
Hindi UTF-8
Japanese Shift_JIS
Korean EUC-KR
(Simplified) Chinese UTF-8
Somali UTF-8
Vietnamese UTF-8

I suppose there are compelling reasons to use those legacy encodings for Korean and Japanese — but it really doesn’t make sense to encode English as Latin-1, when the same site is using UTF-8 for a language like Somali, whose alphabet is strictly “roman” characters.

It seems to me that they’ll be looking at more headaches down the road as a result of not just going ahead and serving the whole site in a single encoding.
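To make the point about Somali and English concrete, here is a minimal Python sketch that encodes a few sample strings under the charsets listed above. The sample strings and the helper name `encoded_sizes` are illustrative assumptions, not anything taken from the police site. It shows that plain roman-alphabet text comes out as the same bytes in every one of these encodings, while accented, Japanese, and Korean text does not.

```python
# Sketch: compare byte lengths of sample text under the encodings in the table.
# The sample strings and the function name are hypothetical, for illustration only.

def encoded_sizes(text, encodings=("utf-8", "latin-1", "shift_jis", "euc_kr")):
    """Return {encoding: byte length}, or None where the text cannot be encoded."""
    sizes = {}
    for enc in encodings:
        try:
            sizes[enc] = len(text.encode(enc))
        except UnicodeEncodeError:
            sizes[enc] = None  # this charset has no way to represent the text
    return sizes

samples = {
    "Somali": "Soomaali",             # plain ASCII letters
    "English": "caf\u00e9",           # one non-ASCII Latin-1 character
    "Japanese": "\u65e5\u672c\u8a9e",  # "Japanese language" in kanji
    "Korean": "\ud55c\uad6d\uc5b4",    # "Korean language" in hangul
}

for label, text in samples.items():
    print(label, encoded_sizes(text))

# ASCII-only text is byte-for-byte identical in UTF-8 and Latin-1.
assert "Soomaali".encode("utf-8") == "Soomaali".encode("latin-1")
```

For ASCII-only pages the choice between Latin-1 and UTF-8 changes nothing on the wire, which is why serving English as Latin-1 buys nothing once the rest of the site is already UTF-8.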

I Bet You Didn’t Make Any Money…

August 25th, 2005

Here’s an update to a random idea I had a while back: Want to Know How to Make Some Money?, where I babbled:

Want to Know How to Make Some Money? Here, I’ll tell you.

News Sentinel | 06/24/2005 | Funding cut for translator service

Asterisk
+ Wireless network + Laptops + Webcams + Subscriptions + Nationwide
(Worldwide?) network of on-call interpreters for lots of languages.

Well, go on.

The idea being that one could start a business capitalizing on the relatively cheap availability of video conferencing tools to sell distributed interpretation services.

Well, I talked to my sister about this idea. She’s a nurse.

The concept is D.O.A., and here’s why: there are strict rules about how the interaction between doctors, patients, and interpreters is to take place. Specifically, the interpreter is not allowed to be a “participant” in the conversation: the interpreter must not speak directly to the patient. The patient looks only at the doctor, never at the interpreter.

That’s a rule.

Which defeats the whole point of the webcam idea. Perhaps the VoIP aspect would still be doable, however.

Javascript Mailing List?

August 25th, 2005

There’s a Rails mailing list and a bunch of Python mailing lists and a small industry of Perl mailing lists and so on and so forth.

So where’s the Javascript mailing list? Am I just missing it? Because it seems like something that would be useful, what with all the webappishness going around and unobtrusivity and all that.

Bueller?

☞ update

Steve Clay made a recommendation on the jQuery list about a general Javascript mailing list:

Javascript Info Page

I just signed up, we’ll see!

The Tiny Hole Theory of Application Development

August 24th, 2005

Remember in Star Wars where they’re going into that trench to blow up the Death Star? “Red Five, goin in!” and all that.

And they keep saying all these great lines like “Stay on target! STAY ON TARGET!”

The whole point, of course, being that you have to get the missile into that one little “duct” at the end of the trench. Into the tiny hole.

I mention all this, friends and neighbors, by way of introducing my new and improved theory of application development:

Doesn’t work.
“I’m going to finish this interface today if it kills me! Kowabunga!”
Works.
I’m going to think about nothing but writing this little function that changes the background color of that span over there in the corner of the interface.

At least for me, it’s all about tiny holes. TINY HOLES PEOPLE, TINY HOLES.

I hope I don't get sued for this Star Wars screenshot.

GTD is Creepy

August 15th, 2005

Is it just me, or is there something a little creepy about the whole “Getting Things Done” trend?

Actually, I bought the book, and I read it, and I even tried THE SYSTEM for a bit. It kind of worked. But I didn’t feel any huge weight lifted from my shoulders. It was more like “okay, well, I paid my bills on time instead of two days late.”

But there’s this creepiness about the whole message of GTD, which basically says “stop thinking, start doing.” I can sympathize with that, sort of.

It’s still creepy.

The author’s site has a page with a list of lists that you should keep handy, and includes this:

Affirmations — personal self-talk scripts for positive internal programming

It’s well known that geeks have taken to GTD like fish to water. And if you Google for the word “programming” on davidco.com you’ll find a lot of Agile programming converts introducing non-programming people to the tenets of Agile programming.

But check out this other, creepier sense of the word as used on the discussion boards:

You should invest in the finest leather that you can afford. So that everytime you open it, you feel a sense of pride and prosperity…you are programming you mind for prosperity. Then invest in a very expensive pen..like a Montblanc. Again, you are programming your mind that time management is highly valuable and that you are properous.

The guy claims not to work for Montblanc. Whatever. Sounds to me more like he works for Scientology.

Blech.

Now, if you’re prone to pessimistic extrapolations, you might be reminded of the connection between busybodyism and fascism described in Quitting the Paint Factory. But if I pointed that out, you’d just say such a connection was utterly ridiculous, right?

In which I ask You

August 13th, 2005

Why don’t word processors keep the cursor in the middle of the page?

Good grief.

Down arrow, down arrow, down arrow, up arrow, up arrow, up arrow, edit. Cursor goes down to the bottom of the page. Repeat. Get mad. Repeat.

English-Only and Ridiculous English as a Second Language Tests

August 7th, 2005

Here’s a story I found pretty frustrating:

Democrat & Chronicle: Students speak out on ‘too easy’ — English test State exam for foreign-born called ‘weak’

In Austin, Texas, students were complaining about how moronic their ESL (English as a Second Language) exams were. I actually went and found the sampler for the test that the article refers to: NYSESLAT Test Samplers-2005. And it is pretty shocking that they’re giving high-school age students “questions” like this one:

So the deal is, if the students pass the test, they pass out of the ESL program. And unsurprisingly, the number of students in the program determines how much funding there will be:

Districts routinely go out of their way to help students who have completed the ESL program, even though they no longer get state money to do so. But if the proficiency test proves so easy that many students exit the program, “then it’s going to drastically reduce the state aid we get in the coming year,” said Kim Ganley, who coordinates ESL programs for the Webster district, where students speak about 20 languages.

This is the kind of thing that makes me doubt the motivations of the “English Only” movement. Their argument goes that the only way to really help students adapt is to move them into an all-English environment as quickly as possible. But tests like this are almost pointless. They prepare students for precisely nothing, let alone for moving into high-school level classes. The students are complaining because they know this perfectly well.

If one were really cynical, one might guess that tests like these are actually designed to submarine students’ honest attempts to integrate into the English-speaking world. I’m pretty cynical, myself.

Dear World

August 3rd, 2005

Yahoo Messenger smilies are infinitely superior to AIM smilies.

That is all.

Font Problems with Hindi in Firefox

August 1st, 2005

Debugging font issues is a pain, in my experience. If something isn’t rendering correctly, my first reaction is usually “I have absolutely no idea why that’s happening.” Gentle reader, feel my pain.

I find myself working with an awful lot of languages (you’ll see why when Jonas and I launch our project), and I often have to learn just enough characters to determine that a particular script seems to be rendering correctly. We have to know if rendering problems are caused by some kind of configuration problem that we can fix, or if it’s something out of our control: “Sorry, no hieroglyphics in Unicode, not our problem!”

Debugging such stuff is not the same thing as actually being able to read in all these languages: in most cases it’s enough to learn just a bit about how the script is put together and how characters combine, and perhaps a few words for testing purposes.
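As a rough illustration of “learning how characters combine,” here is a small Python sketch that breaks a test word into its code points using the standard `unicodedata` module. The test word is the usual Devanagari spelling of “Hindi” (an assumption chosen for this example, written with escapes so the source stays ASCII), and the helper name `describe` is made up for illustration.

```python
import unicodedata

# The word "Hindi" in Devanagari, spelled out as escapes (illustrative test word).
HINDI = "\u0939\u093f\u0928\u094d\u0926\u0940"

def describe(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch))
        for ch in text
    ]

for code_point, name, category in describe(HINDI):
    # Categories starting with "M" are combining marks that attach to a base letter.
    print(code_point, category, name)
```

The listing shows three base letters (category `Lo`) plus two vowel signs and a virama (categories `Mc` and `Mn`). Those marks are the pieces a renderer has to reorder or fuse with their neighbors, so they are a good place to look first when two browsers draw the same word differently.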

So here’s an example of a typical problem that I face. Compare the two screenshot clips I took this morning. I added the red-bordered boxes to point out the differences:

Even if you don’t know Devanāgarī from a salad fork, it doesn’t take much to guess that something is askew in my Firefox’s rendering of that page. (Never mind the fact that the word “Hindi” is actually spelled incorrectly… Doh!) Opera seems to get it right.

Now I’m not going to get into the details of how Devanagari works in Hindi at the moment (primarily because I don’t know much, heheh). The main problem for me is that there are so many possible causes for any problem in text rendering. Is this a configuration problem on my end, or is it some pernicious software problem buried in a library underneath the text?

  1. The font could be bad.
  2. The browser?
  3. Is it the case that my operating system is missing some library? (Linux, in my case.) If so, what library? Can I upgrade something to fix it? Who ya gonna call?
  4. Or maybe it’s part of my desktop environment? I wonder if it works in that other desktop environment… blech, switching desktops is a pain…
  5. Could it be an encoding problem? Maybe the HTML page is encoded incorrectly in the first place.
  6. Or, maybe their server is futzing up the encoding somehow?
  7. Is it part of that “font shaping” thing, Pango? Am I even using Pango?
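Of the suspects above, the two encoding ones (5 and 6) are the easiest to rule in or out with a few lines of code. The following is a minimal Python sketch, not a full diagnostic: the function name `looks_like_utf8` and the sample bytes are assumptions made up for illustration. It checks whether raw page bytes are valid UTF-8, and shows what the same bytes look like when they get decoded under the wrong charset.

```python
# Sketch: rule out (or confirm) an encoding mismatch for a page's raw bytes.
# Function name and sample data are hypothetical, for illustration only.

def looks_like_utf8(raw: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

# The Devanagari word "Hindi", as a server would send it in UTF-8.
hindi = "\u0939\u093f\u0928\u094d\u0926\u0940"
raw = hindi.encode("utf-8")

print(looks_like_utf8(raw))                         # valid UTF-8
print(looks_like_utf8("caf\u00e9".encode("latin-1")))  # Latin-1 bytes are not

# Decoding UTF-8 bytes as Latin-1 never raises an error; it silently
# produces one junk character per byte (classic mojibake).
mojibake = raw.decode("latin-1")
print(len(hindi), len(mojibake))
```

If the bytes pass the UTF-8 check and the page still renders badly, the encoding is probably not the culprit, and the remaining suspects (font, browser, shaping library) move up the list. Note that the reverse test is useless for Latin-1: every byte sequence decodes under it, so a clean Latin-1 decode proves nothing.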

nd, but dag.

update… Σμς suggests an eighth potential culprit in this situation: there could be a problem with CSS. He also found a relevant bug in the bug database for Firefox. (See the comments. Thanks, Simos!)

In this particular case, the comparison above leads me to suspect #2, of course. But you get the picture here: these kinds of problems are a mess. Particularly in the open source world, it’s hard to know what to do in this situation. And I’m moderately techie. Imagine what a run-of-the-mill user faces.

I was chatting with Chad Fowler and he made an interesting observation: for the development of any given application, in order to be sure, really sure, that everything is okay for every particular writing system, each development group would have to have someone who can read each language. Which, er, ain’t gonna happen.

And it shouldn’t really have to: the operating system is supposed to abstract the basic rendering of text away from coding.

OS X is pretty darn good at this. But then, it’s also a very closed system: it’s all tested, Apple owns and delivers a wide variety of high-quality (proprietary) fonts with its machines, and there are far fewer points of variation than you’ll see in your average Linux distribution.

Matters in Windows are less variable than in Linux, but more complex than in OS X, as Michael Kaplan can attest in great detail at his excellent blog.

I think these complexities are what make many programmers wary of Unicode: they’ve been burned in the past by encoding matters, gotten a glimpse of the gruesome entrails underlying text rendering on their platform, and decided, “I just don’t have time to really learn how all these text rendering variables fit together.”

And quite frankly, despite being something of a Unicode zealot myself, I can sympathize.

Most developers accept that they need to know the absolute minimum about Unicode. They already know that Unicode is good. The thing is, as a previous commenter pointed out, and as this tiny example demonstrates, the “Unicode” part of handling text is only the tip of the iceberg.

And it’s a big iceberg.