Saturday, October 8, 2011

Customizing dict, or Offline dictionary from command line in Ubuntu

This weekend I spent quite some time setting up the dictionary in Ubuntu. My goal was: being able to easily get a translation between the languages I want, and if possible, offline.

I already knew that there is a Dictionary program, which is available in Ubuntu by default or can be easily installed, plus a gnome-dictionary plugin to easily invoke it from the top panel. I had two main problems with this:
  1. My native language, Russian, was not available by default as a target language.
  2. I wanted to be able to use the dictionary also when I am offline (which can happen when one has a laptop).
In addition, I would really love to use the dictionary without a client, just from command line.

How I went solving this problem:
  1. It appears there are better clients for Ubuntu, supporting both dict and other formats. I used StarDict before (plugging in extra dictionaries found elsewhere on the internet), but I find its interface rather messy (while I was on Windows, I used Babylon, which was practically just what I needed). On Ubuntu, I recently discovered Fantasdic, which seems much neater than StarDict; it can also import dictionaries in other formats (StarDict among them), so it was already an improvement.
  2. Then I found out that it's possible to install local dictd server and let it provide the dictionaries.
After some experiments and poking around the net, I have got dictd up and running and could import some extra dictionaries to it. Here are the steps, meant more as an inspiration than as a cookbook :)
  • You can get an idea what is available from the repos you use by typing something like:
    apt-cache search dict-
    As a result, you will see something like:
    dict-jargon - dict package for The Jargon Lexicon
    dict-freedict-afr-deu - Dict package for Afrikaans-German Freedict dictionary
    dict-freedict-iri-eng - Dict package for Irish-English Freedict dictionary
    dict-freedict-tur-eng - Dict package for Turkish-English Freedict dictionary
    dict-freedict-wel-eng - Dict package for Welsh-English Freedict dictionary
    stardict-common - International dictionary - data files
    stardict-czech - Stardict package for Czech dictionary of foreign words
    stardict-english-czech - Stardict package for English-Czech dictionary
    Most packages starting with "dict-" will be the dictionaries in the dictd format. The naming scheme, though, is not strict (for example, English-Russian Mueller dictionary is called mueller7-dict and Moby Thesaurus is called dict-moby-thesaurus) but you get the idea. Otherwise, you can just look them up in Synaptic package manager.
  • To install dictd and the additional packages, you can either add them via Synaptic package manager or just apt-get them:
    apt-get install dictd dict-gcide dict-wn dict-moby-thesaurus [whatever else dictionaries you want]
    Additional dict packages can always be added later.
  • If the installation succeeded, you will have your dictd service up and running locally on port 2628! You can check it by typing:
    /etc/init.d/dictd status
    You should get:
    * dictd is running
    Also, now you can type something like:
    dict athwart
    and get results:
    6 definitions found
    From The Collaborative International Dictionary of English v.0.48 [gcide]:
    Athwart \A*thwart"\, prep. [Pref. a- + thwart.]
    1. Across; from side to side of.
    [1913 Webster]
    Athwart the thicket lone.             --Tennyson.
    [1913 Webster]
    2. (Naut.) Across the direction or course of; as, a fleet
    standing athwart our course.
    From Mueller English-Russian Dictionary [mueller7]:
    1. _adv.
    1) косо; поперёк; перпендикулярно
    2) против; наперекор
    2. _prep.
    1) поперёк; через; to run athwart a ship врезаться в борт другого судна;
    to throw a bridge athwart a river перебросить мост через реку
    2) против; вопреки; athwart his plans вопреки его планам
    If you want to use the client (like Dictionary or Fantasdic) you can set up your local dictd server as the source there: in preferences, add new source of type "DICT dictionary server", specify "" as the server address and leave the port number unchanged (2628).
  • In the previous example, I have cheated a bit: you will get fewer results, because I had already put in a couple of additional dictionaries, converted into dictd format. The reason for that was that not all the dictionaries I needed were available in dictd format, but they could be found in other formats (stardict, sdict, dsl): for example, look here or here (I suspect that the first list is just the combination of all entries from the second one, not sure).
  • The second link also points to the home of XDXF project, where you can get a program called makedict to convert the dictionaries between different formats. This program is not available in the binary form to install, so you can clone the source and build it yourself with standard steps:
    cd [someplace]
    mkdir xdxf
    svn co xdxf
    mkdir makedict-out
    cd makedict-out
    cmake ../xdxf/trunk
    make install 
    After this, you can convert the dictionaries (at least in sdict, stardict and xdxf formats - haven't tried the others) to dictd format using
    makedict -o dictd file-name
  • Finally, a couple of import examples.
    1. For example, suppose you have downloaded English-German dictionary.
    2. You will get a file comn_sdict_axm05_English_German.tar.bz2 in bzip format, and can proceed as follows:
      tar -xvjf comn_sdict_axm05_English_German.tar.bz2
      So, this is an xdxf format. We don't have to specify it explicitly, specifying output format is enough:
      makedict -o dictd English_German/dict.xdxf 
      Write index to English_German/English_German/English_German.index
      Write data to English_German/English_German/English_German.dict
      The resulting two files have to be put together with other dictd files (on my machine they dwell in /usr/share/dictd folder by default), the dictd config should be updated and the dictd service should be restarted:
      mv English_German/English_German/*.* /usr/share/dictd
      /usr/sbin/dictdconfig --write
      /etc/init.d/dictd restart
      Now you should be able to see new dictionary in your client or just check its availability from the terminal:
      dict --dbs
      It will provide a list which should contain the new source (usually named after the file name).
      Databases available:
       gcide           The Collaborative International Dictionary of English v.0.48
       wn              WordNet (r) 3.0 (2006)
       English_German  English_German
       fd-eng-fra      English-French Freedict dictionary
       rus_eng_full    rus_eng_full
      And it should just work:
      dict athwart
      From English_German [English_German]:
      (Yes, there might be some specific tags which don't look pretty in the terminal; they can be removed if needed, since the file is just plain text, but that's outside the current topic.) According to the Wiki article about Dict, there is another program for formatting text files into .dict and .index files, called dictfmt. I tried using it to format a file in text format generated from page, but the format of these text files does not seem to be what dictfmt expects. I haven't spent much time on it yet.
    3. The procedure has one extra step for files in stardict format, for example the Dutch-English one.
    4. After unpacking the file, we get the following structure:
      dutch-english.idx  dutch-english.ifo
      The converter will complain, because it expects an uncompressed dict file. The additional step is uncompressing:
      dictzip -d
      Which will give us:
      ls stardict-dutch-english-2.4.2
      dutch-english.dict  dutch-english.idx  dutch-english.ifo
      makedict -o dictd stardict-dutch-english-2.4.2/dutch-english.ifo
      Write index to stardict-dutch-english-2.4.2/dutch-english/dutch-english.index
      Write data to stardict-dutch-english-2.4.2/dutch-english/dutch-english.dict
    5. Another caveat is the index. If the index entries contain anything other than words (lexical definitions, hyphens, etc.), then these entries won't be matched by a default search, but can be matched using a different search strategy:
      dict -d English_German ceiling
      1 definition found
      From English_German [English_German]:
        Höchstbetrag {m}, Obergrenze {f}, Zimmerdecke {f}
      dict -d English_German -s suffix ceiling
      From English_German [English_German]:
        <k>(absolute) ceiling<k>
        Gipfelhöhe {f} (Luftfahrt)
      From English_German [English_German]:
        <k>asset ceiling<k>
        Höchstgrenze {f}
      From English_German [English_German]:
        Höchstbetrag {m}, Obergrenze {f}, Zimmerdecke {f}
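The fallback from the default search to another strategy can also be scripted. Below is a minimal Python sketch, not a definitive tool: it only wraps the dict client with the -d and -s options used above. The function names (build_dict_command, run_dict, lookup) are made up for illustration, and it assumes that dict exits with a zero status only when something was found.

```python
import subprocess


def build_dict_command(word, database=None, strategy=None):
    """Build the argument list for the dict command-line client."""
    cmd = ["dict"]
    if database:
        cmd += ["-d", database]
    if strategy:
        cmd += ["-s", strategy]
    cmd.append(word)
    return cmd


def run_dict(cmd):
    """Run the dict client and return (exit_code, stdout)."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL,
                            universal_newlines=True)
    return result.returncode, result.stdout


def lookup(word, database=None, strategies=(None, "suffix"), runner=run_dict):
    """Try each strategy in turn and return (strategy, output) for the first hit.

    None stands for the default search; runner can be swapped out for testing.
    """
    for strategy in strategies:
        code, output = runner(build_dict_command(word, database, strategy))
        # Assumption: a zero exit status means at least one result came back.
        if code == 0 and output.strip():
            return strategy, output
    return None, ""
```

With a local dictd running, lookup("ceiling", database="English_German") would first try the default search and only then the suffix strategy.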
That's it for the start! Might not look extremely fancy, but... it's a free horse after all :)

UPDATE: if you have Babylon dictionaries (.BGL), you can convert them into dictd format using a program called dictconv, which is available from the Ubuntu repositories.

Saturday, July 2, 2011

Ubuntu, renaming files recursively

While researching how to quickly rename (in a one-liner) all .JPG files in a directory tree to .jpg on Linux (= Ubuntu in my case), I found the following solution (borrowed from a discussion here after fixing the typos :) ):
 find . -name '*.JPG' -exec rename 's/\.JPG$/\.jpg/i' {} +
It won't work on Mac though (a Mac user offered this one):
for i in `find . -name '*.JPG'`; do mv $i ${i/%JPG/jpg}; done
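
If neither one-liner fits, the same renaming can be sketched in Python, which should behave the same on Linux and Mac and does not trip over spaces in file names. This is only an illustration; the function name lowercase_jpg_extensions is made up here.

```python
import os


def lowercase_jpg_extensions(root):
    """Rename every *.JPG file under root to *.jpg and return the new paths."""
    renamed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".JPG"):
                old_path = os.path.join(dirpath, name)
                # Cut off ".JPG" and append the lowercase extension.
                new_path = os.path.join(dirpath, name[:-len(".JPG")] + ".jpg")
                os.rename(old_path, new_path)
                renamed.append(new_path)
    return renamed
```

Calling lowercase_jpg_extensions(".") would process the current directory tree.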

Thought it might be worth remembering :)

Tuesday, May 3, 2011

Ubuntu 11.04, VMWare Player and coolah scrollbars

After feeling very miserable because VMWare Player was "crashing" whenever I tried to play any virtual machine on Ubuntu 11.04 ("crashing" in the sense that it was still running in the background, but the screen was gone), I accidentally found some helpful info: somebody with a similar problem mentioned that he had to downgrade the overlay-scrollbar package from version 0.1.9 to 0.1.7.

This solution didn't work for me (because I had a fresh Ubuntu 11.04 install), but after starting Synaptic Package Manager, I could see that a newer version of the overlay-scrollbar package was available (0.1.12) and decided to give it a try. A miracle happened: VMWare Player started to work normally after I performed the upgrade. (It took several hours of trying to find out what was happening, and I strongly dislike those new fancy scrollbars - now even more so!.. :) )

PS I have also found another mention of a VMWare problem with the latest Ubuntu here, but their solution didn't seem to help.

Wednesday, March 9, 2011

PHP: self versus this

If you haven't thought about it yet: self in PHP refers to the class in which the code is written, so self::method() always resolves against that class, whereas $this refers to the instantiated object, so $this->method() picks up overrides from the object's actual class.

Simple example:
class a {
    function first() {
        echo 'a:first ';
    }

    function second() {
        echo 'a:second ';
        self::first();
    }
}

class b extends a {
    function first() {
        echo 'b:first ';
    }

    function third() {
        echo 'b:third ';
        $this->second();
    }
}

$objB = new b();
$objB->third();

This will print:
b:third a:second a:first
because self does not care about the object instance.

If you replace, in a::second, the line
    self::first();
with the line
    $this->first();
the output will change to what you might have expected in the first place:
b:third a:second b:first

(As observed for PHP 5.3.3)
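
For comparison, here is a rough Python analogue of the same distinction. It is only a sketch with made-up class and method names: naming the class explicitly ignores overrides, much like self:: in PHP, while calling through the instance respects them, much like $this->.

```python
class A:
    def first(self):
        return "a:first"

    def second_static(self):
        # Names class A explicitly, so an override in a subclass is ignored.
        return "a:second " + A.first(self)

    def second_dynamic(self):
        # Dispatches on the instance, so an override in a subclass is used.
        return "a:second " + self.first()


class B(A):
    def first(self):
        return "b:first"


obj = B()
print(obj.second_static())   # a:second a:first
print(obj.second_dynamic())  # a:second b:first
```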

You have been warned :)

Sunday, February 27, 2011

Python quirks

Saw for myself a funny quirk in Python (2.6).

Suppose you want to create a list of dictionaries, which will be populated later.

My first idea was to do this:
>>> a = [{}]*3

At first glance, it works. But just look at this:
>>> a[0]["bbb"]=1
>>> a
[{'bbb': 1}, {'bbb': 1}, {'bbb': 1}]

Not exactly what you would expect, heh. All three elements refer to one and the same dictionary.

The way to do it right seems to be wordier:
>>> a=[]
>>> for i in range(3):
...     a.append({})
...
>>> a
[{}, {}, {}]
>>> a[1]["bbb"]=1
>>> a
[{}, {'bbb': 1}, {}]
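
A shorter way to get independent dictionaries is a list comprehension, which evaluates {} anew for every element. A small sketch (the variable names are arbitrary) that also shows the difference with an identity check:

```python
# Each iteration evaluates {} again, so every element is a separate dictionary.
a = [{} for _ in range(3)]
a[1]["bbb"] = 1
print(a)  # [{}, {'bbb': 1}, {}]

# With multiplication, all elements point to the same dictionary object.
shared = [{}] * 3
print(shared[0] is shared[1])  # True
print(a[0] is a[1])            # False
```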

By the way, initializing a list with None is not a problem:
>>> c = [None]*5

Nevertheless, this will not work, because the loop only rebinds the name e and never touches the list:
>>> for e in c:
...     e = {}
...
>>> c
[None, None, None, None, None]

This will:
>>> c[2]={}
>>> c[1]={}
>>> c[2]["aaa"]=3
>>> c[1]["bbb"]=4
>>> c
[None, {'bbb': 4}, {'aaa': 3}, None, None]
>>> c[1]["aaa"]=5
>>> c
[None, {'aaa': 5, 'bbb': 4}, {'aaa': 3}, None, None]
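
To replace every element in a loop, the assignment has to go through the index rather than through the loop variable. A minimal sketch using enumerate:

```python
c = [None] * 5

# Assigning to c[i] changes the list itself; assigning to the loop
# variable would only rebind a local name.
for i, _ in enumerate(c):
    c[i] = {}

c[2]["aaa"] = 3
print(c)  # [{}, {}, {'aaa': 3}, {}, {}]
```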