Python is my favorite language.
But I have an axe to grind, here.
from xml.sax import ContentHandler
class MyHandler(ContentHandler):
    def __init__(self):
This is the typical beginning to using Python’s standard xml.sax module, which implements a SAX parser.
But that’s sort of irrelevant. What I think is confusing is the class syntax:
class MyHandler(ContentHandler):
Call me crazy, but in that line ContentHandler looks like a parameter. Everywhere else in Python, and indeed in most languages on the planet, parens mean “this is a parameter.” Here they don’t mean that at all. They mean “MyHandler is a kind of ContentHandler.” So if you want to compose a variation on the theme of ContentHandler, you write MyHandler(ContentHandler).
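If you want to see the “is a kind of” relationship for yourself, here is a minimal sketch. The empty class body is just for illustration; nothing here is specific to SAX.

```python
from xml.sax import ContentHandler


class MyHandler(ContentHandler):
    pass  # empty body: everything is inherited from ContentHandler


# The name in the parentheses is recorded as a base class, not passed as an argument.
print(MyHandler.__bases__)

# Both the class and its instances count as "a kind of" ContentHandler.
print(issubclass(MyHandler, ContentHandler))    # True
print(isinstance(MyHandler(), ContentHandler))  # True
```

So the parentheses on the class line describe ancestry, and nothing gets “called” until you write MyHandler() later on.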
As far as how the thing is actually called, it’s SIMPLE: you just look in the __init__ method. Because you see, the __init__ method is sort of the constructor. Not really, but sort of. (There is also a __new__ method, and I really haven’t figured out what the hell that does.)
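For what it’s worth, here is a rough sketch of how the two methods relate, as far as the Python 3 data model goes: __new__ creates the object, and then __init__ gets handed that object to set it up. The calls list is a made-up helper that only exists to record the order.

```python
from xml.sax import ContentHandler

calls = []  # hypothetical log, just to record which method runs when


class MyHandler(ContentHandler):
    def __new__(cls):
        # Runs first: actually creates the (still empty) instance.
        calls.append("__new__")
        return super().__new__(cls)

    def __init__(self):
        # Runs second: receives the new instance as self and initializes it.
        calls.append("__init__")
        super().__init__()


handler = MyHandler()  # neither method is named in the call
print(calls)  # ['__new__', '__init__']
```

In everyday code you can almost always ignore __new__ and write only __init__.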
So when you read as far as:
class MyHandler(ContentHandler):
    def __init__
You have to think “to see the way that MyHandler will be called, I have to look at what comes after the __init__.” Which is back in the realm of normal, right? Our old friends, the parentheses, who have changed their wayward ways and now really do mean something like “here comes a list of parameters.”
Except, not really.
Well okay yeah really, but the thing is, the first parameter isn’t something that gets passed in when you do the __init__ dance… it refers to “the thing itself.”
It’s “self.” (You can call it anything you want, actually, but you shouldn’t, because to call it anything else would be, you know, confusing.) So the list of parameters there is really only parameter-y from the second argument on. Assuming you have more than one parameter. If there is just self, well… it’s like the self isn’t really there. But you have to have it.
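Here is a small sketch of the self business. The name parameter and the describe method are invented for the example; they are not part of xml.sax.

```python
from xml.sax import ContentHandler


class MyHandler(ContentHandler):
    def __init__(self, name="untitled"):  # `name` is a made-up parameter
        super().__init__()
        self.name = name

    def describe(self):  # hypothetical method, only here to show self
        return "handler " + self.name


# Two parameters in the definition, but only one argument in the call:
# Python fills in self with the freshly created object.
h = MyHandler("feed")
print(h.describe())            # handler feed

# The same method call with self written out by hand.
print(MyHandler.describe(h))   # handler feed
```

So h.describe() is more or less shorthand for MyHandler.describe(h), which is where that first parameter comes from.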
Plain as day.
class MyHandler(ContentHandler):
    def __init__(self):
Right, so, let’s review.
MyHandler is a kind of ContentHandler, even though the line looks just as if MyHandler were a function and ContentHandler were a parameter. You must banish that intuition because it is wrong.
And then to instantiate the class, you need to have an __init__ (actually, come to think of it, I think I heard somewhere that __init__ is optional, did I mention that? Although people don’t actually seem to leave them out, well… ever). And when you actually instantiate the thing, the name __init__ isn’t part of the call: you just write MyHandler().
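On the “optional” point, a minimal sketch, assuming Python 3: a subclass that never defines __init__ still instantiates fine, because it inherits the one from ContentHandler. The tiny XML document is made up for the demonstration.

```python
import xml.sax
from xml.sax import ContentHandler


class MyHandler(ContentHandler):
    # No __init__ here at all; ContentHandler's is inherited.
    def startElement(self, name, attrs):
        print("start:", name)


handler = MyHandler()  # still works, and __init__ is still not in the call

print("__init__" in MyHandler.__dict__)               # False
print(MyHandler.__init__ is ContentHandler.__init__)  # True

# Feed the handler a made-up document to show it really is usable.
xml.sax.parseString(b"<feed><entry/></feed>", handler)
```

You only need to write your own __init__ when the handler has state of its own to set up.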
And please, people, two underscores, people, on both sides. Kthx.
And you have to unbanish that bit about parameters, because now parentheses mean parameters again.
Except that the first parameter is not a parameter, it is self.
Behold, the self.
Behold, me kicking my SAX parser in the genitals.
Yes, I know that any programmer who is halfway decent will quickly get used to such details and move on with their lives. But I defy anyone to claim that they guessed that this is how all this ever-so-deceptively-simple-looking syntax worked when they first encountered it.
And besides, what good’s a weblog if you can’t whine in it?