The Molecular Graphics Laboratory, The Scripps Research Institute, La Jolla CA.
 



This tutorial will demonstrate the utility of the interpreted programming language Python for the rapid development of component-based applications for structural bioinformatics. We will introduce the language itself, along with some of its most important extension modules. We will also describe bioinformatics-specific extensions and demonstrate how these components have been assembled to create custom applications.

Python is an interpreted, interactive, object-oriented programming language. Because of its platform independence and object-oriented nature, it is often compared to Java. Python combines remarkable power with very clear syntax. It has modules, classes, exceptions, very high-level data types, and dynamic typing. There are interfaces to many system calls and libraries, as well as to various windowing systems (X11, Motif, Tk, Mac, MFC). Legacy code written in C, C++ or Fortran can easily be made available within a Python interpreter. Python is also usable as an extension language for applications that need a programmable interface. The Python implementation is portable: it runs on many brands of UNIX, on Windows, DOS, OS/2, Mac, Amiga and other platforms, and it is distributed under an open-source license. An active community of users and programmers is constantly contributing new modules, extending the language with novel capabilities.
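As an informal illustration of the language features listed above (classes, exceptions, high-level data types and dynamic typing), here is a minimal sketch. The names `Atom`, `UnknownElementError`, `total_mass` and `count_by_element` are hypothetical and chosen only for this example; they do not belong to any of the packages presented in this tutorial.

```python
class Atom:
    """A minimal atom record: a name, an element symbol and 3D coordinates."""

    def __init__(self, name, element, coords):
        self.name = name
        self.element = element
        self.coords = tuple(coords)


class UnknownElementError(Exception):
    """Raised when an element has no entry in the mass table."""


# A dictionary, one of Python's built-in high-level data types.
# Values are approximate standard atomic masses.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def total_mass(atoms):
    """Sum the atomic masses of any iterable of objects with an .element attribute."""
    total = 0.0
    for atom in atoms:
        try:
            total += ATOMIC_MASS[atom.element]
        except KeyError:
            # Translate the low-level lookup failure into a domain-specific error.
            raise UnknownElementError(atom.element)
    return total


def count_by_element(atoms):
    """Return a dictionary mapping each element symbol to its number of atoms."""
    counts = {}
    for atom in atoms:
        counts[atom.element] = counts.get(atom.element, 0) + 1
    return counts


# Illustrative coordinates only; they are not taken from a real structure.
backbone = [
    Atom("N", "N", (0.0, 0.0, 0.0)),
    Atom("CA", "C", (1.5, 0.0, 0.0)),
    Atom("C", "C", (2.0, 1.4, 0.0)),
    Atom("O", "O", (1.3, 2.4, 0.0)),
]

print(count_by_element(backbone))
print(total_mass(backbone))
```

Because of dynamic typing, `total_mass` accepts a list, a tuple or a generator without any type declaration; it only requires that each item has an `element` attribute.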

We have been using Python as a scripting framework to assemble basic components at a high level. This approach enables the rapid development of custom applications for the visualization and manipulation of molecular data and provides an excellent level of code re-use. Python is a very powerful "glue" language, and its object-oriented nature also makes it an appropriate language for the development of components. We have developed a number of Python extensions, also called packages, to deal with different aspects of structural biology. In this course, we will present some of these components and explain how they have been combined to create applications such as PMV, the Python Molecule Viewer, and ADT, the AutoDockTools suite. Time permitting, we will also survey other Python-based applications such as MMTK and Chimera.

We will start by briefly discussing the characteristics of Python that make it a language of choice both for developing components and for combining them to create applications. This part will include a short overview of the language and its syntax as well as an overview of the standard module library that comes with the language. We will briefly address the issues of extending the language and provide useful pointers for getting started with utilities like SWIG, which help make legacy C or C++ code available in Python. In this section we will also give an introduction to the Numeric Python extension, which is used to efficiently store and manipulate large arrays of numerical data.
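To give a rough idea of what typed array storage looks like, the sketch below keeps atomic coordinates in one flat, homogeneous buffer and computes on it. It is only a stand-in: Numeric is a separate extension (its successor is NumPy), so to stay self-contained this example uses the standard library `array` module instead, which offers compact typed storage but none of Numeric's multidimensional or vectorized operations. The helper names `centroid` and `translate` are hypothetical.

```python
from array import array

# Flat, typed storage of C doubles: x0, y0, z0, x1, y1, z1, ...
# Illustrative coordinates for four points.
coords = array("d", [0.0, 0.0, 0.0,
                     2.0, 0.0, 0.0,
                     2.0, 4.0, 0.0,
                     0.0, 4.0, 0.0])


def centroid(flat):
    """Return the mean (x, y, z) of a flat coordinate array."""
    n = len(flat) // 3
    # flat[axis::3] selects every third value, i.e. one coordinate axis.
    return tuple(sum(flat[axis::3]) / n for axis in range(3))


def translate(flat, shift):
    """Return a new flat array with `shift` added to every point."""
    return array("d", (value + shift[i % 3] for i, value in enumerate(flat)))


center = centroid(coords)
# Move the points so that their centroid sits at the origin.
centered = translate(coords, tuple(-c for c in center))

print(center)
print(centroid(centered))
```

With an array extension such as Numeric, the same operations would typically be written as whole-array expressions rather than explicit Python loops, which is where most of the efficiency comes from.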

The next section will be devoted to extensions to the language that have been developed at the Molecular Graphics Laboratory of the Scripps Research Institute.
These include:

In the third section we will present tools we have developed using the basic components presented earlier.

In this section we will also have a short tutorial on extending PMV. We will take the attendees step by step through the process of adding a command.
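To suggest what "adding a command" can mean in a command-based viewer, here is a toy sketch. It is not PMV's API: `ToyViewer`, `Command`, `add_command` and `CountAtoms` are hypothetical names invented for this illustration, and the real procedure will be covered in the tutorial itself.

```python
class Command:
    """Base class for commands; subclasses implement doit()."""

    def __init__(self, viewer):
        self.viewer = viewer

    def __call__(self, *args, **kwargs):
        # Run the command, then record its name in the viewer's log.
        result = self.doit(*args, **kwargs)
        self.viewer.log.append(type(self).__name__)
        return result

    def doit(self, *args, **kwargs):
        raise NotImplementedError


class ToyViewer:
    """A stand-in application that holds molecules and registered commands."""

    def __init__(self):
        self.molecules = {}  # molecule name -> list of atom names
        self.log = []        # names of the commands run so far

    def add_command(self, command_class, name):
        # Instantiate the command and expose it as a callable attribute.
        setattr(self, name, command_class(self))


class CountAtoms(Command):
    """A new command: return the number of atoms in a named molecule."""

    def doit(self, molecule_name):
        return len(self.viewer.molecules[molecule_name])


viewer = ToyViewer()
viewer.molecules["water"] = ["O", "H1", "H2"]
viewer.add_command(CountAtoms, "countAtoms")
```

After registration the command is called like a method, for example `viewer.countAtoms("water")`. New behavior is added by writing a small subclass, without modifying the viewer itself.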

Time permitting, we will give a quick overview of other Python-based tools such as MMTK and Chimera and point out how to make them inter-operate with the ones we have presented.

We anticipate that after this tutorial attendees will be able to use Python-based tools for their own work and will understand the tools' architecture well enough to modify the behavior of commands and even develop new commands.

All the documentation used for this tutorial will be made available on-line before the conference so that potential attendees can familiarize themselves with both Python and the components that will be presented. We will adjust the level of this tutorial to the attendees' familiarity with Python and our Python-based components. Previous experience in developing applications will be helpful, although not required.

Suggested readings:



Michel SANNER:
Dr. Sanner obtained his Ph.D. in Computer Science in 1992 from the University of Haute-Alsace in France for his work on the modeling of molecular surfaces.
This research was carried out while he was working in the molecular modelling group of the pharmaceutical company Sandoz in Basel, Switzerland. In 1993, Dr. Sanner accepted a postdoctoral position in the Molecular Graphics Laboratory of the Scripps Research Institute. He is currently an assistant professor in the Molecular Biology department of the Scripps Research Institute.
He is the author of the program MSMS, which enables the efficient and accurate calculation of molecular surfaces and has been distributed to more than 950 laboratories around the world. He has worked in collaboration with Boris Reva and Alexei Finkelstein on the determination of phenomenological residue pair-wise interaction potentials. More recently, he has led the development of a Python-based scripting environment enabling the rapid development of customized applications to view, manipulate and analyze molecular data. His involvement with the Python community goes back to 1997.

Sophie COON:
Sophie Coon received her BS in Cell Biology from the University of Paris 6, France in 1996. After spending a few more years at the bench, she decided to combine her biological background with her strong interest in computer science. She received her MS in Computer Science Applied to Biology from the University of Paris 6, France in 1999.
She joined Dr. Arthur Olson's group at The Scripps Research Institute in May 1999, first as an intern and then as Research Programmer II. There she started to program in Python under Dr. Michel Sanner's supervision and hasn't stopped since.
Her main responsibility is to implement new components in Python that will be integrated into the Python-based molecular modelling environment developed in the laboratory, and more specifically into the Python Molecule Viewer.
In addition, she has a strong interest in making the software developed in her laboratory available on the internet to potential users and developers. To this end, she participated in the development of a Python-based distribution package that facilitates the propagation of the laboratory's software, for which she is also writing documentation.