Archive for Javascript

Javascript Mailing List?

There’s a Rails mailing list and a bunch of Python mailing lists and a small industry of Perl mailing lists and so on and so forth.

So where’s the Javascript mailing list? Am I just missing it? Because it seems like something that would be useful, what with all the webappishness going around and unobtrusivity and all that.

Bueller?

Sitepoint’s CSS and DHTML Books

I’ve recently become a fan of Sitepoint’s books on programming. They’re very cleanly put together, and generally speaking seem to be quite up to date. Here are a couple of titles I went ahead and took the plunge on:

HTML Utopia: Designing Without Tables Using CSS
I like this book quite a bit. The CSS reference in the back is almost worth the price of admission… there are references online (duh) but I guess I’m just still a sucker for paper. There’s a lot of useful info on styling text, which turns out to have more tricks available than I’d ever heard of. One thing about this book that annoyed me intensely was in chapter 6, “Putting Things in Their Place,” when he gives a Javascript solution to the problem of getting columns to flow to equal heights. Admittedly, he gives an alternative, but there are a lot of pure CSS solutions to this problem out there, and one would think that if there’s a reliable one, this would be the book to find it in. So yeah, that bit rubbed me the wrong way.
DHTML Utopia: Modern Web Design Using JavaScript & DOM
I’ve been looking forward to this one for quite a while. At that link you can get the first four chapters for free. To be honest, I debated whether to buy the book, because judging from the table of contents, it seems that most of the stuff that I had doubts about was in the free sample chapters. But I’m a big fan of the author and editor: Stuart Langridge through the ridiculously awesome LugRadio (or listen on Odeo) and Javascript/Python guru Simon Willison. So in the end I felt pretty good about picking up a copy. Haven’t started digging in yet. One nit to pick: forty smackers is a lot to ask for a book that’s just 300 pages. Not saying it won’t turn out to be worth it in the end, but dag.
update… The sample chapters are available as HTML now: DHTML Utopia: Modern Web Design Using JavaScript & DOM. I can’t seem to get the example from this chapter to work, though, can you?

All this DHTML stuff is surprisingly fun. And I’ve mentioned before that Javascript has the right policy on Unicode, which makes me pretty happy.

Like this ☞ ☺

Especially considering the headache that is dealing with multibyte stuff in just about every other scripting language. Which makes me kind of sad.

☹ ☜ Like that.

On-the-fly ASCII to Unicode Transliteration with Javascript?

Here’s an interesting little script I found on the Reta Vortaro (that is, the Esperanto web dictionary).


anstataŭigu cx, gx, …, ux

Try typing the string jxauxdo in that box. And press “Trovu” if you like; that will search Google for ĵaŭdo (Esperanto for “Thursday”). Notice that jx becomes ĵ and ux becomes ŭ “on the fly,” as you type. (Come to think of it, maybe “transliteration” isn’t the right word for this process…)

So, backing up a bit, Esperanto has a few odd characters in its orthography:

Letter Pronunciation (IPA) Unicode x-system
ĉ [ʧ] U+0109 cx
ĝ [ʤ] U+011D gx
ĥ [x] U+0125 hx
ĵ [ʒ] U+0135 jx
ŝ [ʃ] U+015D sx
ŭ (as in aŭ, eŭ) [u̯] U+016D ux

Even today those characters are relatively rare in fonts–if you can’t see them I imagine this post may not make too terribly much sense. 8^)

The good doktoro even got a little flak back in the day, for choosing to include such unusual characters in a supposedly universal language. Nowadays, however, they’re all in Unicode–here’s the full info for ŝ, for example:

U+015D LATIN SMALL LETTER S WITH CIRCUMFLEX
ŝ

But pragmatically speaking, there’s still a problem with input. Suppose you are a gold-star-wearing green-flag-waving Esperanto aficionado, and you want to post something on the internet. How do you actually type these characters? The “right” answer is that you install a keyboard layout for the language in question, and you memorize its layout.

This is a pain, of course.

And it’s nothing new: in the (typographical) bad old days of all-ASCII USENET, Unicode wasn’t widely available, and what people would generally do (for many languages, not just Esperanto) was come up with all-ASCII transliteration systems. The “x-system” added to the table above was probably the most popular. It so happens that there is no letter x in Esperanto, so it didn’t cause any massive problems with ambiguity.

So let’s look at the script in question; it’s quite simple:

function xAlUtf8(t) {
  if (document.getElementById("x").checked) {
    t = t.replace(/c[xX]/g, "\u0109");
    t = t.replace(/g[xX]/g, "\u011d");
    t = t.replace(/h[xX]/g, "\u0125");
    t = t.replace(/j[xX]/g, "\u0135");
    t = t.replace(/s[xX]/g, "\u015d");
    t = t.replace(/u[xX]/g, "\u016d");
    t = t.replace(/C[xX]/g, "\u0108");
    t = t.replace(/G[xX]/g, "\u011c");
    t = t.replace(/H[xX]/g, "\u0124");
    t = t.replace(/J[xX]/g, "\u0134");
    t = t.replace(/S[xX]/g, "\u015c");
    t = t.replace(/U[xX]/g, "\u016c");
    document.getElementById("q").value=t;
  }
}

Include it with something like:

<script type="text/javascript" src="http://example.com/translit.js"></script>

And the function gets called with an onkeyup="xAlUtf8(this.value)" inside the input tag.

(Using onkeyup is actually sort of verboten these days; it should be done unobtrusively, etc.)
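For what “unobtrusively” might look like here, this is a minimal sketch rather than the Reta Vortaro code: the handler is attached from script instead of from an onkeyup attribute. The names X_SYSTEM, xToUnicode and attachTransliterator are made up for illustration, and the ids "q" and "x" are only assumed from the script above.

```javascript
// Hypothetical table for the x-system: lowercase base letter -> accented letter.
const X_SYSTEM = {
  c: "\u0109", g: "\u011d", h: "\u0125",
  j: "\u0135", s: "\u015d", u: "\u016d",
};

// Pure conversion step: a base letter followed by x or X becomes the accented letter.
function xToUnicode(text) {
  return text.replace(/([cghjsu])x/gi, (match, base) => {
    const lower = X_SYSTEM[base.toLowerCase()];
    // Keep the case of the base letter: "Cx" gives the capital, "cx" the small letter.
    return base === base.toLowerCase() ? lower : lower.toUpperCase();
  });
}

// Attach the handler from script. `input` and `toggle` are assumed to behave
// like a text input and a checkbox (addEventListener, value, checked).
function attachTransliterator(input, toggle) {
  input.addEventListener("keyup", () => {
    if (toggle.checked) {
      input.value = xToUnicode(input.value);
    }
  });
}

// In a browser, assuming the page has elements with ids "q" and "x":
if (typeof document !== "undefined") {
  const q = document.getElementById("q");
  const x = document.getElementById("x");
  if (q && x) attachTransliterator(q, x);
}
```

Keeping the conversion in its own function means it can be tried out without a page at all, and the markup no longer needs to know the function’s name.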

So anyway, that’s a pretty interesting way to enter some unusual characters. It’s interesting to muse on just how far one could take this approach. Would it be possible to create a script that would handle an entire writing system? Say, a script that would convert an entire textarea from an ASCII-based transliteration to Unicode characters, on the fly? Japanese and Chinese are definitely excluded from this approach (every Chinese character in RAM? Er, no.) but people who use those languages generally already have keyboard input taken care of.

That would be neat: you could, for instance, have textareas where users could input something in Amharic or Persian or whatever without having the keyboard layout actually installed.

But as it stands, it’s just simple substitution, and no string which is to be substituted can be a substring of another such string. In order to handle a more generalized set of substitutions, you’d probably need to use a trie structure. (There’s a nice trie implementation in Python by James Tauber.)
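To make the trie idea concrete, here is a rough sketch of longest-match substitution in Javascript. It is an illustration, not a port of James Tauber’s code, and the table is a made-up toy romanization of a few Cyrillic letters, chosen only because some keys are prefixes of others (s, sh, shch).

```javascript
// Hypothetical toy table: note that "s" is a prefix of "sh", and "sh" of "shch".
const TABLE = {
  s: "\u0441", h: "\u0445", sh: "\u0448", shch: "\u0449",
  c: "\u0446", ch: "\u0447", a: "\u0430", i: "\u0438",
};

// Build a trie: one node per character, with a value where a key ends.
function buildTrie(table) {
  const root = { children: new Map(), value: null };
  for (const [key, value] of Object.entries(table)) {
    let node = root;
    for (const ch of key) {
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), value: null });
      }
      node = node.children.get(ch);
    }
    node.value = value;
  }
  return root;
}

// At each position, walk the trie as far as possible and use the longest key found.
function transliterate(text, trie) {
  const chars = Array.from(text);
  let out = "";
  let i = 0;
  while (i < chars.length) {
    let node = trie;
    let best = null;  // replacement for the longest key matched so far
    let bestEnd = i;  // index just past that key
    for (let j = i; j < chars.length; j++) {
      node = node.children.get(chars[j]);
      if (!node) break;
      if (node.value !== null) {
        best = node.value;
        bestEnd = j + 1;
      }
    }
    if (best === null) {
      out += chars[i]; // no key starts here, so copy the character through
      i += 1;
    } else {
      out += best;
      i = bestEnd;
    }
  }
  return out;
}

const trie = buildTrie(TABLE);
console.log(transliterate("shchi", trie)); // "shch" wins over "s" and "sh"
```

One wrinkle for the on-the-fly case: once keys can be prefixes of each other, the conversion has to be rerun over the original ASCII text on every keystroke, because an already-converted “s” can no longer turn into “sh”.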

I’m sure there are complications that would arise from what’s called “font shaping” — that is, how operating systems combine adjacent characters. In Arabic or Thai, for instance, characters vary depending on which characters they’re adjacent to. How does this process affect text in textareas, for instance, or text which is mushed around with Javascript?

I’ll be playing around with this.

Getting a head(ing) with XPath

I was looking around the Mozilla XPath Documentation because I wanted to write a bookmarklet to list headings in pages (I’m lazy like that). And lo and behold, the first example I found did (almost) just that. Apparently you do something like:


var headings = document.evaluate("//h2", document, null, XPathResult.ANY_TYPE,null);

But that doesn’t really answer my question about how to use XPath to get all the headings into that headings variable.
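One detail about that call: document.evaluate hands back an XPathResult, not an array, so the nodes still have to be pulled out of it. A minimal sketch, assuming an iterator-style result (which is what ANY_TYPE gives for a node set); collectNodes is a made-up helper name:

```javascript
// Drain an iterator-style XPathResult into a plain array.
// iterateNext() returns null once there are no more nodes.
function collectNodes(result) {
  const nodes = [];
  let node = result.iterateNext();
  while (node) {
    nodes.push(node);
    node = result.iterateNext();
  }
  return nodes;
}

// In a browser, using the same expression as above:
if (typeof document !== "undefined") {
  const result = document.evaluate("//h2", document, null, XPathResult.ANY_TYPE, null);
  const headings = collectNodes(result);
  console.log(headings.map((h) => h.textContent));
}
```

The iterator becomes invalid if the document changes while it is being walked, so a bookmarklet that edits the page as it goes would probably want one of the snapshot result types instead.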

Digging around on the same site, I discovered a Firefox Extension which does everything I had in mind and then some: Document Map. It produces nifty outlines like this.

But that still doesn’t scratch the itch, of course: how do I get the XPath to do what I want? So I asked my homie Jonas (er, can you have a homie in another country?), and he found this (in Java documentation, of all places):

Finding Elements by Absolute Location in a DOM Document Using XPath (Java Developers Almanac Example)

Which has some nice examples. Anyway, here’s the answer, apparently:

XPath 1.0 does not support regular expressions to match element names.
However, it is possible to perform some very simple matches on element names.

    // Get all elements whose name starts with el
    xpath = "//*[starts-with(name(), 'el')]";  // 2 3 5 7 8 9
    
    // Get all elements whose name contains with lem1
    xpath = "//*[contains(name(), 'lem1')]";   // 2 8

So I guess this is the answer:

xpath = "//*[starts-with(name(), 'h')]";  

Assuming that there aren’t any other tags that start with h. Which is a dumb assumption. Er… are there any? XPath syntax is a little nutty-looking, if you ask me, but I guess it just takes some getting used to.

But whatever, for now problem solved.

UPDATE
Claus Wahlers suggested two better alternatives:

//h1 | //h2 | //h3 | //h4 | //h5 | //h6

Which simply “or’s” together possible heading tags, and the rather more wizardly:

//*[contains('h1h2h3h4h5h6',name())]

What that says is “Return any element (*) which returns true for the condition that the tag’s name can be found within the string h1h2h3h4h5h6.”

That rules out the silly errors my first statement had, like including html tags or hr or head tags. Duh.
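The difference between the two predicates is easy to see without any XPath at all. This sketch just emulates them with plain string methods over a hand-picked list of tag names; it is an analogy, not an XPath engine.

```javascript
// A hand-picked sample of element names, including the troublemakers.
const tagNames = ["h1", "h2", "h6", "head", "hr", "html", "p"];

// Emulates starts-with(name(), 'h'): is the name prefixed by "h"?
const startsWithH = tagNames.filter((name) => name.startsWith("h"));

// Emulates contains('h1h2h3h4h5h6', name()): is the name found inside the string?
const inHeadingString = tagNames.filter((name) => "h1h2h3h4h5h6".includes(name));

console.log(startsWithH);     // h1, h2, h6, but also head, hr and html
console.log(inHeadingString); // only h1, h2 and h6
```

Strictly speaking, the contains trick would also accept an element named just “h”, since that is a substring too, but no such element exists in HTML.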

But wait there’s more:

To really learn XPath:

Mark Pilgrim’s Dive into Greasemonkey also has a list of further reading links, one of which points to this XPath Tutorial by Example. It’s already been translated into seven languages, so I guess it must not suck. = )

UPDATE
Also good (even though it’s rather infested with ads): XPath Tutorial

Bookmarklets are Fun

There has been much ado about Greasemonkey lately, and rightly so. Mark Pilgrim even wrote a book about it. I’ve found that Greasemonkey is a good way to learn a little Javascript and a bit about how to monkey around with the DOM.

That sounded kinky.

Anyway.

It seems to me that the way to learn Javascript is just to continually learn little tricks, bit by bit. (I’ve not been able to find any readable books that try to take a more high-level approach, although this essay is pretty informative.)

So here’s one I learned tonight, so I’m writing it down to remember. It’s very simple.

In css, if you want to give the body of your document margins, you can use a rule like this:

body { margin-left: 25%; margin-right: 25%; }

(Yes, I know, there’s some kind of shorthand “clock” notation for css margins, but I can never remember how it works. You get the point.)

It turns out that assigning a css rule in Javascript is quite easy:


document.body.setAttribute('style','margin-left:25%; margin-right: 25%');

I believe that document.body is unusual in that it’s sort of a “built-in” reference: if you wanted to apply a style rule to any other node (say, a div), you’d have to get a reference to it first and then call .setAttribute on the reference.

But that’s all there is to it: the second argument is a string with whatever css you want to apply. And here it is as a bookmarklet:


<a href="javascript:document.body.setAttribute('style','margin-left:25%; margin-right: 25%');">make skinny</a>

try it → make skinny

(I should add, by the way, that I usually use this thing on “print” versions of news articles, like this one.)

update…
Jesse Ruderman gave me some corrections for this code. The correct way to adjust the style of an element isn’t to use
document.body.setAttribute('style','margin-left:25%;margin-right:25%');
Rather, one sets the style directly, like this:

 document.body.style.marginRight = "25%";
 document.body.style.marginLeft = "25%";

Note that the positioning is notated in camel case, from margin-left to marginLeft.
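The camel-case rule is mechanical enough to write down. A small sketch, with made-up helper names (toCamelCase, applyStyles), that takes CSS-style property names and sets them through the style object:

```javascript
// "margin-left" -> "marginLeft": drop each hyphen and capitalize the letter after it.
function toCamelCase(property) {
  return property.replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

// Set each property on node.style. `node` is assumed to be an element,
// or anything else that has a `style` object.
function applyStyles(node, styles) {
  for (const [property, value] of Object.entries(styles)) {
    node.style[toCamelCase(property)] = value;
  }
}

// In a browser:
if (typeof document !== "undefined" && document.body) {
  applyStyles(document.body, { "margin-left": "25%", "margin-right": "25%" });
}
```

This covers the ordinary cases only. float is the well-known exception (it is cssFloat on the style object), and the sketch does not handle it.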

So here’s a better version:

<a href="#" onclick='document.body.style.marginLeft="25%";document.body.style.marginRight="25%";return false'>make skinny</a>
 

try it again → make skinny

argh… I managed to break this, for the moment… fix upcoming.
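A general pitfall with javascript: links is worth writing down here, though whether it is what went wrong above isn’t stated. If the code in the link evaluates to a string, the browser replaces the page with that string, and an assignment evaluates to the value assigned. setAttribute returns nothing, so the first version was safe; a bare style assignment evaluates to "25%". Wrapping the code in void(...) discards the value. A sketch that builds such a link, with a made-up function name and assuming simple values that contain no quotes:

```javascript
// Build a javascript: URL that sets style properties on document.body.
// `styles` maps camel-case property names to simple values without quotes.
function bookmarkletHref(styles) {
  const assignments = Object.entries(styles)
    .map(([property, value]) => `document.body.style.${property}='${value}'`)
    .join(",");
  // void(...) makes the whole expression evaluate to undefined,
  // so the browser has nothing to navigate to.
  return `javascript:void(${assignments})`;
}

console.log(bookmarkletHref({ marginLeft: "25%", marginRight: "25%" }));
```

The resulting string uses only single quotes, so it can go straight into a double-quoted href attribute and be dragged to the bookmarks bar.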

Why does Javascript have the Reputation of Sucking? A Lot?

You know, after having read a fair amount about Javascript as a result of all this Ajax stuff, I have to say I feel sort of conned. Realization du jour: “Oh, Javascript is a language, not a scriptkiddie popup machine.”

I think one reason has to do with its association with web developers instead of “real” developers — people who write desktop applications, you know, in C or something. I freely admit that I have little interest in learning C or C++ or, frankly, any other compiled language. I’ve actually tried a bit of C++, and I hated it. Honestly, what good is learning a compiled language to me, when I can build good enough networked applications with XHTML, CSS, a little server side PHP or Python, and Javascript?

You tell me…

Programming in the browser…

Getting Unicode straight across platforms has been a huge hangup for me in trying to get together some tutorials on doing language processing with Python. And then, there’s another barrier to cross: how to deal with markup?

Generally speaking, what I’m interested in dealing with is text, but most multilingual text on the web is HTML.

One weird observation that keeps occurring to me is that you could teach text processing without teaching people to deal with setting up a programming environment at all: use Javascript.

This seems a little weird, but I think the reason that it seems weird is because people who work with text processing have never thought of Javascript as a real language. But it is a real language. And the barriers to programming in Javascript are incredibly low. (Go type javascript:alert('hello world!') on your address bar to see what I mean.)

And then, I was reading through some stuff on Crockford.com, and I came across this:

String is a sequence of zero or more Unicode characters. There is no separate character type.

Good grief! Music to my ears!

And as for dealing with HTML, well, Javascript has that abstraction built in. Try explaining to a newbie how to extract the text from an HTML page in Python. “Well, you start by subclassing a parser and…” Javascript is designed for a browser; and browsers are where all that markup stuff comes from in the first place: to turn a css rule into “put this text in a blue box in the corner,” the “text” bit is a given.

Of course, it still looks like C, or at least it’s certainly not as friendly as Python, but I have to say, combining these characteristics with Greasemonkey opens up some very interesting possibilities… Input/output becomes “go to this URL.” Processing the text becomes “Paste this Greasemonkey script into the editor and run it,” and the result could be investigating character distributions/statistical language ID/sentence splitting/keyword extraction/blah blah blah….

Is it crazy to think that such things can be done in a learnable way with Javascript? I don’t think it is…
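As a taste of what that might look like, here is a minimal sketch of the first item on that list, a character distribution. The function name is made up, and the text is passed in as an argument so that the same code could be fed document.body.textContent from a Greasemonkey script or bookmarklet.

```javascript
// Count how often each character occurs, ignoring whitespace,
// and return [character, count] pairs sorted by descending count.
function characterCounts(text) {
  const counts = new Map();
  // for...of walks the string by code point rather than by UTF-16 code unit,
  // so characters outside the Basic Multilingual Plane stay in one piece.
  for (const ch of text) {
    if (/\s/.test(ch)) continue;
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// In a browser, something like:
//   characterCounts(document.body.textContent).slice(0, 20)
console.log(characterCounts("\u0135a\u016ddo kaj sabato"));
```

This counts code points, which is not quite the same thing as what a reader would call a character: a letter typed as a base letter plus a combining accent will show up as two entries.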

I’m just thinking out loud. But lately I’ve been thinking about all that Ajax stuff (and rolling it into my present project), and it’s gotten me thinking about the browser as a place to do programming. Kind of blue sky, yes, but certainly a fun angle on the topic of processing natural language.
