You are doing an English-Welsh translation. You want to know how to say "youth hostel" in Welsh. You type "youth hostel" into a search box, and get back pairs of English and Welsh sentences where the source sentence contains "youth hostel." That would rock, right? From there you can figure it out. You know Welsh, after all; you just don't happen to know that phrase, and the pairs of sentences get you far enough.

The premise is that many linked Wikipedia articles contain partial translations: often from English into some other language, but sometimes from another language into English, or between some other pair of languages. So it would be nice to have pairs of aligned, translated sentences. The paper describes a method of using the links in the articles to detect which pairs of sentences, out of all the sentences in two connected articles, are likely to have been translated. So for instance, presumably there are some translated sentences between http://am.wikipedia.org/wiki/ኢትዮጵያ and http://en.wikipedia.org/wiki/Ethiopia .

The way the paper recommends finding those pairs is to look at the links in all the sentences, because it's already possible to build a huge list of linked terms (I've already done that, as a matter of fact). It's just a matter of parsing the Wikipedia dumps, extracting corresponding articles, extracting sentences, creating a data structure consisting of a sentence and its links, and then using the link lexicon to match those sentences with similar link lists.

Steps (note that the terms "source" and "target" here are actually somewhat arbitrary):

1. Given source and target language codes, and an article name in the source language, spider the article about that topic in that language.
2. Extract the page content (as Wikitext) of that article.
3. Look for the interwiki link to the corresponding article in the target language.
4. If it exists, repeat 1 and 2 for that article.
5. Extract sentences from both pages.
6. For each sentence on both sides, extract all links.
7. Using a link lexicon, annotate each list of links with the corresponding links in the other language.
8. Rank sentence pairs by the similarity of their link lists.
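
Steps 6 through 8 can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It assumes the sentences have already been split out of the Wikitext, that the link lexicon is a plain dictionary from source-language titles to target-language titles, and that Jaccard overlap is a good enough similarity measure (that choice is mine; the paper may score pairs differently). The function names and the toy English and Welsh sentences are made up for the example.

```python
import re
from typing import Dict, List, Set, Tuple

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# group 1 is the link target without any section or label.
LINK_RE = re.compile(r"\[\[([^\[\]|#]+)[^\[\]]*\]\]")


def extract_links(sentence: str) -> Set[str]:
    """Step 6: collect the link targets found in one Wikitext sentence."""
    return {
        match.group(1).strip().replace("_", " ")
        for match in LINK_RE.finditer(sentence)
    }


def translate_links(links: Set[str], lexicon: Dict[str, str]) -> Set[str]:
    """Step 7: map source titles to target titles, dropping unknown ones."""
    return {lexicon[title] for title in links if title in lexicon}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap of two link sets (an assumed measure, not the paper's)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_pairs(
    source_sentences: List[str],
    target_sentences: List[str],
    lexicon: Dict[str, str],
) -> List[Tuple[float, str, str]]:
    """Step 8: score every sentence pair and sort best-first."""
    scored = []
    for src in source_sentences:
        src_links = translate_links(extract_links(src), lexicon)
        for tgt in target_sentences:
            score = jaccard(src_links, extract_links(tgt))
            if score > 0:
                scored.append((score, src, tgt))
    # sorted() is stable, so equally scored pairs keep their input order.
    return sorted(scored, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    # Toy data for illustration only.
    lexicon = {"Cardiff": "Caerdydd", "Wales": "Cymru", "Snowdon": "Yr Wyddfa"}
    english = [
        "[[Cardiff]] is the capital of [[Wales]].",
        "[[Snowdon]] is the highest mountain in [[Wales]].",
    ]
    welsh = [
        "[[Caerdydd]] yw prifddinas [[Cymru]].",
        "[[Yr Wyddfa]] yw mynydd uchaf [[Cymru]].",
    ]
    for score, src, tgt in rank_pairs(english, welsh, lexicon):
        print(f"{score:.2f}  {src}  <->  {tgt}")
```

Comparing every source sentence with every target sentence is quadratic, which is probably fine for a single pair of articles. Sentences with no links, or whose links are missing from the lexicon, never score above zero, so they drop out of the ranking.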