There are a million people in the world who want to tell you how to act. What the principles of effective life are, and crap like that.
Case in point.
The real work is happening in your brain and practically every other place that’s not an inbox. Stop allowing yourself to be brow-beaten by the latest, loudest, or most dramatic item that’s landed in your world.
The problem is, this is patently not true.
Randomly wandering around the internet, nay, pointlessly, obsessively, addictively wandering around the internet is productive. People who think that they will make themselves more efficient by not wandering around pointlessly on the internet are kidding themselves. People have an amazing ability to sort signal from noise.
But the thing is, the more noise there is, the more signal you get.
This is what the “efficiency” crowd doesn’t want to admit, because it means that their systems aren’t more productive than obsessive wandering and clicking through toplists.
A few hours ago (I really don’t even know what time it is), I was screwing around with some CSS to render parallel text. Basically, I was looking for good ways to mark up a source text and its translation with HTML and CSS.
In the process, I started randomly sticking in sample texts from the first article of the Universal Declaration of Human Rights in a bazillion languages.
One of those languages was Thai, and I saw that the Thai text wasn’t wrapping correctly.
Ah, that old problem. Thai doesn’t use spaces (well, it does, but… erm… I don’t exactly understand when and why), so browsers don’t know where to break long strings of text. (They usually rely on spaces.)
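To make that concrete, here is a minimal Python sketch of one way to hand the browser its missing break points: split the text with a greedy longest-match against a word list, then join the pieces with U+200B ZERO WIDTH SPACE, which is invisible but counts as a place where a line may break. The function name `add_break_hints` and the four-word lexicon are made up for illustration; a real word list would be far larger, and this is not a claim about how any browser or extension actually does it.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE: renders as nothing, but allows a line break

# Hypothetical toy lexicon: "I", "love", "language", "Thai".
LEXICON = {"ฉัน", "รัก", "ภาษา", "ไทย"}


def add_break_hints(text, lexicon):
    """Split text by greedy longest-match and join the pieces with ZWSP.

    Characters that start no known word are emitted one at a time.
    """
    max_len = max(len(word) for word in lexicon)
    pieces = []
    i = 0
    while i < len(text):
        longest = min(max_len, len(text) - i)
        # Try the longest candidate first; fall back to a single character.
        size = next(
            (n for n in range(longest, 0, -1) if text[i:i + n] in lexicon),
            1,
        )
        pieces.append(text[i:i + size])
        i += size
    return ZWSP.join(pieces)


if __name__ == "__main__":
    hinted = add_break_hints("ฉันรักภาษาไทย", LEXICON)
    # The string looks unchanged on screen, but now has break opportunities.
    print(hinted.split(ZWSP))
```

One caveat with this sketch: the single-character fallback can separate a Thai consonant from its vowel or tone mark, so anything beyond a toy would need to keep those clusters together.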
This was not too far from my mind, because a few nights ago, while randomly looking through the web pages of NLP courses, I found this one at Stanford, which had a really interesting paper on Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences (pdf). The problem is similar in Japanese, of course, so my experience with miswrapped Thai immediately made me wonder whether the (very successful) technique from that paper could be ported to Thai.
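For a feel of what count-based segmentation can look like, here is a toy Python sketch. It is not the paper's algorithm, only a simplified illustration of the general idea: count character n-grams in unsegmented text, then guess a word boundary wherever the n-grams sitting on either side of a position are more frequent than the n-grams that straddle it. The function names, the single n-gram length, the 0.5 threshold, and the ASCII stand-in corpus are all assumptions chosen to keep the example small.

```python
from collections import Counter


def ngram_counts(corpus, n):
    """Count every character n-gram in a list of unsegmented strings."""
    return Counter(
        line[i:i + n] for line in corpus for i in range(len(line) - n + 1)
    )


def boundary_score(text, k, counts, n):
    """Fraction of comparisons in which an n-gram beside position k
    is more frequent than an n-gram straddling position k."""
    beside = []
    if k - n >= 0:
        beside.append(text[k - n:k])   # n-gram ending at k
    if k + n <= len(text):
        beside.append(text[k:k + n])   # n-gram starting at k
    straddling = [
        text[j:j + n]
        for j in range(max(0, k - n + 1), k)
        if j + n <= len(text)
    ]
    pairs = [(b, s) for b in beside for s in straddling]
    if not pairs:
        return 0.0
    return sum(counts[b] > counts[s] for b, s in pairs) / len(pairs)


def segment(text, corpus, n=2, threshold=0.5):
    """Cut text wherever the boundary score exceeds the threshold."""
    counts = ngram_counts(corpus, n)
    cuts = [
        k for k in range(1, len(text))
        if boundary_score(text, k, counts, n) > threshold
    ]
    return [text[a:b] for a, b in zip([0] + cuts, cuts + [len(text)])]


if __name__ == "__main__":
    # Stand-in corpus: the "words" ab, cd, ef glued together without spaces.
    corpus = ["abcd", "abef", "cdab", "efcd", "efab", "cdef"]
    print(segment("abcdef", corpus))
```

No dictionary is involved here, only counts from raw text, which is what makes this family of approaches attractive for languages where word lists are scarce.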
But before I started looking into that (probably by trying to implement the paper’s algorithm for Japanese, to start with), I figured I would… wander around aimlessly a bit more googling for anything related to text wrapping and Thai.
So I started thinking of terms to look up. One thing that popped into my mind was the name of a guy who goes by “bact” on Wikipedia. So, totally randomly, I googled: thai bact.
Look at the first hit:
Thai Words Separator :: Mozilla Add-ons :: Add Features to Mozilla …
Thai Words Separator is an extension to fit thai words in webpage layout without … This implementation developed from bact’ (http://bact.blogspot.com/) …
And a few clicks away from that, Bact’s public domain ThaiWrap bookmarklet.
That little piece of code has a very original and useful approach to solving my CSS text-wrapping problem. But that’s not all: it’s another piece of the puzzle that could play a role in the much more critical problem of probabilistically splitting Thai (and Japanese, and Khmer, and…) text into words.
That’s a serious problem for Blogamundo, a real problem for which we have to find a solution, or at least, an approach.
And I got closer to a solution by just wandering around aimlessly.
Yes, I stayed up all night. Yes, I ate half a box of Triscuits.
And you know what? It was pretty damn productive.