Tag Archives: python

Please Help to Port python-debianbts to Python3

Dear Lazyweb,

I’m currently trying to find a way to port python-debianbts to Python3. Debian’s standard bug reporting tool reportbug depends on python-debianbts and thus cannot move to Python3 unless python-debianbts does as well. Unfortunately python-debianbts depends on SoapPy for parsing the Debian bugtracker’s responses, and that library has not been ported to Python3 yet and probably never will be.

I’m planning to replace SoapPy with pysimplesoap, which is available for Python2 and Python3. Unfortunately debbugs does not support WSDL, which makes parsing of the replies extremely painful and error-prone. I wonder if there is a SOAP/Python expert out there who’d be willing to give some assistance in porting python-reportbug to pysimplesoap? python-reportbug’s repository is on GitHub and patches are very welcome.

Since SOAP is quite a beast and debbugs uses it for read-only purposes only, another attractive solution would be to replace or augment debbugs’ API with something much simpler, like JSON. That would make parsing extremely easy, as many programming languages including Python support JSON without any external libraries. In theory this could be quite easy, but it requires some serious Perl skills.
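
To illustrate how little client code a JSON reply would need, here is a minimal sketch. The payload and its field names (bug_num, severity, done) are made up for this example; debbugs does not offer such an interface, so this only shows that Python's standard json module would be enough on the client side.

```python
import json

# Hypothetical JSON reply; debbugs does not provide this format.
reply = '''
[
  {"bug_num": 123456, "severity": "normal", "done": false},
  {"bug_num": 123457, "severity": "wishlist", "done": true}
]
'''

def parse_bug_reports(text):
    """Return a dict mapping bug number to the decoded report."""
    return {report["bug_num"]: report for report in json.loads(text)}

bugs = parse_bug_reports(reply)
# Collect the numbers of all bugs that are not marked as done.
open_bugs = sorted(num for num, report in bugs.items() if not report["done"])
print(open_bugs)
```

No schema, no WSDL and no third-party dependency are involved; the decoded reply is just a list of dictionaries.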

What are the most popular .vimrc options?

I always wondered what the most popular options in .vimrc files are. So I downloaded 155 .vimrc files from the net (mostly from dotfiles.org and github.com) and wrote a little script which counts the number of times an option has been set. Since most options come in a long and a short form, I mapped the short forms to the long version whenever I recognized them.
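
The counting can be sketched roughly like this. It is not the original script, just a minimal reconstruction of the idea: the SHORTCUTS table is a small illustrative subset, and the parsing only handles plain "set" lines, not every way Vim allows options to be set.

```python
import re
from collections import Counter

# Partial, illustrative mapping of short option names to long ones.
SHORTCUTS = {"sw": "shiftwidth", "ts": "tabstop", "et": "expandtab",
             "nu": "number", "ai": "autoindent", "is": "incsearch"}

SET_LINE = re.compile(r"^\s*:?set?\s+(.+)$")   # matches "set" and "se"
OPTION_NAME = re.compile(r"[A-Za-z_]+")        # name without value or operator

def count_options(vimrc_texts):
    """Count how often each option name appears in 'set' lines."""
    counts = Counter()
    for text in vimrc_texts:
        for line in text.splitlines():
            match = SET_LINE.match(line)
            if not match:
                continue
            # Split on whitespace that is not escaped with a backslash.
            for token in re.split(r"(?<!\\)\s+", match.group(1)):
                if token.startswith('"'):
                    break  # the rest of the line is a comment
                name = OPTION_NAME.match(token)
                if name:
                    counts[SHORTCUTS.get(name.group(0), name.group(0))] += 1
    return counts

vimrcs = [
    'set nocompatible\nset sw=4 ts=4 et  " indentation\n" set ruler\n',
    'set nocompatible\nse shiftwidth=2\nsyntax on\n',
]
counts = count_options(vimrcs)
# Print in ascending order, most popular option last.
for name, number in sorted(counts.items(), key=lambda item: item[1]):
    print(number, name)
```

Negated forms such as nobackup are counted under their own name here, without merging them with backup.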

So without further ado, here are the most popular .vimrc options (without values!). The number specifies how many times the option has been set. The most popular option is at the bottom:

10 tselect
10 dictionary
10 runtimepath
11 mousehide
11 t_vb
11 foldlevel
11 foldopen
12 suffixes
12 matchtime
12 fileencoding
13 modelines
13 wrap
14 sidescrolloff
14 clipboard
14 lines
14 novisualbell
15 linebreak
15 cursorline
15 fileformats
15 columns
15 cindent
16 undolevels
16 shiftround
16 lazyredraw
16 completeopt
18 modeline
18 whichwrap
18 wildignore
19 list
19 autowrite
19 foldcolumn
19 grepprg
19 titlestring
20 autoread
20 title
21 foldenable
21 cmdheight
22 pastetoggle
23 formatoptions
23 fileencodings
24 tags
24 directory
25 ttyfast
26 termencoding
26 complete
27 nohlsearch
27 noerrorbells
27 visualbell
28 shortmess
30 showmode
31 wildmode
32 t_Co
32 listchars
32 backupdir
34 hidden
34 backup
35 smarttab
35 foldmethod
36 viminfo
36 textwidth
37 scrolloff
37 nobackup
41 nowrap
44 encoding
47 guifont
51 guioptions
53 smartcase
54 wildmenu
57 smartindent
60 mouse
63 background
64 softtabstop
66 history
70 showmatch
72 ignorecase
74 showcmd
74 laststatus
79 number
82 hlsearch
91 statusline
94 expandtab
94 ruler
96 autoindent
96 backspace
99 tabstop
109 incsearch
114 shiftwidth
124 nocompatible

Out of 155 .vimrcs

Fun fact: nocompatible is the most popular option, but also the most useless one. The fact that you have a .vimrc automatically implies nocompatible mode in Vim.

How to get the most precise time, comparable between processes in Python?

Let’s consider the following scenario: I have two Python processes receiving the same events and I have to measure the delay between when process A received the event and when process B received it, as precisely as possible (i.e. less than 1ms).

Using Python 2.7 on a Unix system, you can use time.time, which provides the time in seconds since the Epoch and has a typical resolution of a fraction of a millisecond on Unix. You can use it in different processes and still compare the results, since both processes receive the time since the Epoch, a defined and fixed point in the past.

On Windows time.time also provides the time since Epoch, but the resolution is in the range of 10ms, which is not suitable for my application.
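
A rough way to see the step size of a timer on your own machine is to poll it until the returned value changes. This is only a sketch: the helper name smallest_tick is mine, and on systems with a very fine-grained clock the result mostly reflects the overhead of the call itself rather than the clock's resolution.

```python
import time

def smallest_tick(timer, samples=5):
    """Return the smallest observed non-zero step between timer readings."""
    ticks = []
    for _ in range(samples):
        start = timer()
        now = timer()
        while now == start:   # busy-wait until the clock value changes
            now = timer()
        ticks.append(now - start)
    return min(ticks)

tick = smallest_tick(time.time)
print("time.time advances in steps of about {:.9f} s".format(tick))
```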

There is also time.clock, which is super precise on Windows and much less precise on Unix. The major drawback is that it returns the time since the process started or since the first call of time.clock within that process. This means you cannot compare the results of time.clock between two processes, as they are not calibrated to a common t-zero.

I had high hopes for Python 3.3, where the time module was revamped, and I was reading about time.monotonic and time.perf_counter. Especially time.perf_counter looked like it would suit my needs, as the documentation said it provides the highest available resolution for the system and was “system-wide”, in contrast to, for example, the new time.process_time, which was “process-wide”. Unfortunately it turned out that time.perf_counter acts similarly to time.clock on Python 2.7, as it provides you with the time since the process started or since the first time it was called within the process. The results of time.monotonic are comparable between processes, but again not precise enough on Windows.
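
Since Python 3.3 the time module can also report what it knows about each clock via time.get_clock_info. A small sketch to print this for the clocks discussed here; note that the reported resolution is what the platform advertises, and it says nothing about whether the values are comparable between processes.

```python
import time

CLOCKS = ("time", "monotonic", "perf_counter", "process_time")

# Look up the properties Python reports for each clock.
clock_info = {name: time.get_clock_info(name) for name in CLOCKS}

for name in CLOCKS:
    info = clock_info[name]
    print("{:<13} resolution={!r} monotonic={} adjustable={} impl={}".format(
        name, info.resolution, info.monotonic, info.adjustable,
        info.implementation))
```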

Here is a small script which demonstrates how the times provided by time.clock and time.perf_counter are not comparable between processes. It starts two processes and lets both of them print out the output of the timer to stdout. In the output the times should be monotonically increasing. Since I let process 2 sleep for one second before calling the timer method for the first time, the output of this process is usually one second smaller when using time.clock or time.perf_counter.

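#!/usr/bin/env python

from multiprocessing import Process
import time

# time.clock was removed in Python 3.8; skip timers this interpreter lacks.
timers = [t for t in ('clock', 'time', 'monotonic', 'perf_counter')
          if hasattr(time, t)]

def proc(timer):
    timer = getattr(time, timer)
    # Sleep before the first timer call, so per-process timers start late.
    time.sleep(1)
    for i in range(3):
        print('P2 {time}'.format(time=timer()))
        time.sleep(1)

if __name__ == '__main__':
    for t in timers:
        print("Using {timer}".format(timer=t))
        p = Process(target=proc, args=(t,))
        p.start()
        timer = getattr(time, t)
        for i in range(3):
            print('P1 {time}'.format(time=timer()))
            time.sleep(1)
        # Wait for process 2 before testing the next timer.
        p.join()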

The result when running on Windows with Python 3.3:

$ python timertest.py
Using clock
P1 6.146032526480321e-06
P1 0.9926582847820045
P2 2.9612702173041547e-05
P1 1.9941743992602412
P2 1.0008579302676737
P2 2.0022709590185346
Using time
P1 1368614235.509732
P1 1368614236.511172
P2 1368614236.601301
P1 1368614237.512612
P2 1368614237.602741
P2 1368614238.604181
Using monotonic
P1 484.636
P1 485.63800000000003
P2 485.738
P1 486.639
P2 486.73900000000003
P2 487.741
Using perf_counter
P1 12.390910576623565
P1 13.39050745276285
P2 7.542858100680394e-06
P1 14.39190763071843
P2 1.0014012954160376
P2 2.0041399116368144

So as far as I see it, there is no way of getting comparable times between two processes on Windows with more precision than 10ms. Is that correct or am I missing something?

Introducing python-ardrone

Flying AR.Drone

Over the last few weeks I spent quite a lot of time hacking on a Python library for the AR.Drone. The AR.Drone is a nice toy for nerds. You connect to it via WIFI and soon you’ll realize that it has 4 ports open. Reading the specs, you’ll find that on one port it listens for AT commands with which you can remote control the drone, and on the other two ports it waits for an incoming packet which triggers the drone to send the navdata (speed, angles, battery status, etc.) and the video stream. Heck, you can even telnet into the drone…
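
To give an idea of what such an AT command looks like, here is a minimal sketch that only builds the command string. Everything specific in it is an assumption based on my reading of the drone's developer documentation and is not taken from python-ardrone: the address 192.168.1.1, UDP port 5556, the layout "AT*NAME=sequence,arguments" terminated by a carriage return, and floats being transmitted as the integer value of their 32-bit pattern.

```python
import socket
import struct

# Assumptions (not stated in the post): address, port and command layout.
DRONE_IP = "192.168.1.1"
COMMAND_PORT = 5556

def float_to_int(value):
    """Reinterpret the bits of a 32-bit float as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<f", value))[0]

def at_command(name, sequence, *params):
    """Build one AT command; floats are sent as their integer bit pattern."""
    encoded = [str(sequence)]
    for param in params:
        if isinstance(param, float):
            encoded.append(str(float_to_int(param)))
        elif isinstance(param, str):
            encoded.append('"{}"'.format(param))
        else:
            encoded.append(str(param))
    return "AT*{}={}\r".format(name, ",".join(encoded))

def send(command, address=(DRONE_IP, COMMAND_PORT)):
    """Send a command string to the drone as a single UDP datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(command.encode("ascii"), address)
    finally:
        sock.close()

# Only build and show a command here; nothing is sent.
print(repr(at_command("PCMD", 1, 1, 0.0, 0.5, 0.0, 0.0)))
```

The send helper is never called in this sketch, so it can be run without a drone nearby.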

Unfortunately it comes without proper software to control the drone, only an iPhone app (without the iPhone, of course). But given the documentation, it should be easy to write your own. While getting the beast to fly was relatively easy, decoding the “almost”-jpg video stream was not. Almost-jpg, since the images the drone sends are more or less jpg, with a small difference which makes it impossible to decode them using standard multimedia libraries. Anyway, the format is documented and implementing a decoder was not that hard. The tricky part was getting the frame rate from an unacceptable 0.5 FPS to 12-22 FPS, given that the whole decoder is written in Python. I’m cheating a bit by using psyco, but the code in arvideo.py is heavily optimized to minimize calculations and to please psyco.

The code also contains a small demo app which uses Pygame to display the video stream and lets you control the AR.Drone with the keyboard. It should be ready to use as soon as you are connected to the drone via WIFI.

The git repository is here, the license is MIT. Suggestions and patches are welcome.

Here is a video of the drone flying through the office.

Summer School: Advanced Scientific Programming in Python

There will be another round of the Summer School “Advanced Scientific Programming in Python”, this year in St. Andrews, UK.

Quoting from the official announcement:

Scientists spend more and more time writing, maintaining, and debugging software. While techniques for doing this efficiently have evolved, only few scientists actually use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques, incorporating theoretical lectures and practical exercises tailored to the needs of a programming scientist. New skills will be tested in a real programming project: we will team up to develop an entertaining scientific computer game.

We use the Python programming language for the entire course. Python works as a simple programming language for beginners, but more importantly, it also works great in scientific simulations and data analysis. We show how clean language design, ease of extensibility, and the great wealth of open source libraries for scientific computing and data visualization are driving Python to become a standard tool for the programming scientist.

This school is targeted at PhD students and Post-docs from all areas of science. Competence in Python or in another language such as Java, C/C++, MATLAB, or Mathematica is absolutely required. Basic knowledge of Python is assumed. Participants without any prior experience with Python should work through the proposed introductory materials before the course.

You can apply on-line at http://python.g-node.org

Applications must be submitted before May 29, 2011. Notifications of acceptance will be sent by June 19, 2011.

No fee is charged but participants should take care of travel, living, and accommodation expenses. Candidates will be selected on the basis of their profile. Places are limited: acceptance rate in past editions was around 30%. Prerequisites: You are supposed to know the basics of Python to participate in the lectures. Please consult the website for a list of introductory material.

If you’re a scientist and interested in Python, I cannot recommend this summer school highly enough. The Summer School is always fun, and the faculty members are very nice and always willing to help and answer questions. The balance between lectures and exercises is in my opinion very good, and you’ll learn a lot about Python during the week. The highlight of every Python School is always the Pac Man tournament, where groups of students implement their own Pac Man agent (in Python) and compete against the other agents in a tournament till the bitter end!

The students of the previous editions were usually very happy during and after the School. Speaking of happiness, since no fee is charged for the Summer School, you’ll have more money left over to invest into the excellent Scottish pubs!

Query Google Scholar using Python

In desperate need of organizing my collection of scientific papers, I had a look at various tools which could help me with that. Probably one of the best out there is Mendeley, which seems to be a very good tool to keep your massive collection of PDFs under control. Unfortunately a very basic function, namely looking up a newly imported paper in Google Scholar to get attributes like authors, year, etc. right, is tied to a Mendeley account. I guess that’s their way of forcing the user to participate in their community stuff, since without the Google Scholar lookup Mendeley is pretty useless unless you want to fill in all the attributes manually.

So I decided to write my own tool to make the lookup. Unfortunately Google does not really want to give away that precious data: they don’t provide an API and even block certain User-Agents from accessing the page. Then, there is also the problem of scraping the results page to get the right data.

The first problem can be trivially solved by setting a common User-Agent string; the second one can be elegantly circumvented by using the bibtex files provided in the search results. The bibtex entries are however only shown if you enabled them in the settings, which are stored in a cookie. After a few tries, I figured out that the CF attribute (citation format?) controls which bibliography format is offered in the results page, and that CF=4 corresponds to bibtex. Generating a fake cookie is easy, but you have to know what must be included. In this case it looks like a 16-digit hex ID and the CF attribute are sufficient. The ID is probably supposed to be your id, but a randomly generated one also works like a charm.

The resulting cookie looks like this: GSP=ID=762a112b5c765732:CF=4

All you have to do now is to query Google Scholar using the User-Agent string and the cookie:

# Python 2 code: urllib2 was split into urllib.request and urllib.parse in Python 3
import hashlib
import random
import urllib2

# fake google id (looks like it is a 16 elements hex)
google_id = hashlib.md5(str(random.random())).hexdigest()[:16]

GOOGLE_SCHOLAR_URL = "http://scholar.google.com"
HEADERS = {'User-Agent': 'Mozilla/5.0',
           'Cookie': 'GSP=ID=%s:CF=4' % google_id}

def query(searchstr):
    """Return a list of bibtex items."""
    searchstr = '/scholar?q=' + urllib2.quote(searchstr)
    url = GOOGLE_SCHOLAR_URL + searchstr
    request = urllib2.Request(url, headers=HEADERS)
    response = urllib2.urlopen(request)
    html = response.read()
    # grab the bibtex links

And Google Scholar will offer you links to the bibtex files of the results. Getting those links is easy since they all start with "/scholar.bib". Just search for those and download the targets.
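
Searching for those links could look roughly like the sketch below. It is written for Python 3 and is not the code from the repository; the sample markup is made up for illustration, and the real results page may well quote or structure its links differently.

```python
import re
from html import unescape
from urllib.parse import urljoin

GOOGLE_SCHOLAR_URL = "http://scholar.google.com"
# Assumes the links appear as double-quoted href attributes.
BIBTEX_LINK = re.compile(r'href="(/scholar\.bib[^"]*)"')

def bibtex_urls(html):
    """Return absolute URLs of all bibtex links found in a results page."""
    return [urljoin(GOOGLE_SCHOLAR_URL, unescape(link))
            for link in BIBTEX_LINK.findall(html)]

# Made-up fragment of a results page, for illustration only.
sample = ('<a href="/scholar.bib?q=info:abc&amp;output=citation">Import</a>'
          '<a href="/scholar?q=other">Other</a>')
urls = bibtex_urls(sample)
print(urls)
```

Each returned URL can then be fetched with the same headers as the search request to download the bibtex entry.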

The complete code is available on GitHub. It can be used as a Python library or as a standalone application. You just call it like this: gscolar "some author or title" and it will print the first ten results in bibtex to stdout.

The sorry state of Python in Debian

Looking at the sorry state of Python in Debian makes me wonder if we shouldn’t enforce team maintenance of packages above a certain popularity/importance/whatever threshold. People have worked hard in the last months to fix any bugs that would prevent Python2.6 from landing in unstable, and yet nothing happens. Time passes by and we will eventually end up with Squeeze having a horribly outdated Python version.

reportbug-ng has localization support again

After I ported reportbug-ng from PyQt3 to PyQt4 over a year ago, reportbug-ng lost its localization, since the gettext-based translations were incompatible with Qt4’s translation system.

This weekend I finally had the time to have a closer look at this problem. To make a long story short: I have ported the gettext-based system to Qt4’s system. All the old .po files were converted to .ts files, but almost all strings are marked as “obsolete” so that they don’t appear in the translated program. But since they are still available in the .ts file, it is easy to get the translations up to date. So far only English and German are complete, but eventually other translations will be added.

By the way, PyQt4 makes it really hard to get non-Qt strings translated.

Python 2.6 Transition

Today I NMUed over a dozen Python packages with bugs which blocked the Python 2.6 transition.

I really want to thank Kumar Appaiah for his work. He provided patches for all the bugs I NMUed today and lots more. I really did not do much more than apply, test and upload his patches, but Kumar probably invested days of labor to create the patches and test them. Thanks to his effort, the number of 2.6 blockers shrank considerably, so that we now have ~15 open blockers and ~50 closed ones!