Introduction to Biopython

Trainer: Kristian Rother


  • Understand what Biopython is and what it can do.
  • Learn how to get Biopython running.
  • Learn how to retrieve data records from NCBI.
  • Learn how to read and write sequence files.
  • Learn how to run BLAST from Python and read the results.
  • Learn how to read and write phylogenetic tree files.
  • Learn how to read and write 3D structure files.
  • Learn how to use the Biopython documentation and examples, and where to find help.
  • Understand what alternatives to Biopython exist and what they can do.


Biopython is the best-known Python library for processing biological data. This training aims to enable you to use Biopython to make your research more efficient.

The first day of the training gives an overview of Biopython. You will start with your first steps in Biopython on the command line. Afterwards you will take a tour of the most important components: sequences, NCBI queries, BLAST, trees, and 3D structures. You will try each of these modules with practical examples. Please don't hesitate to ask questions about Python basics or particular data formats (e.g. XML or NGS data).

The second day of the training broadens your perspective: What other features does the library have? How can you use the documentation effectively? What is Biopython not capable of? How can you visualize your data? Are there alternatives? If you have your own data that you would like to work on with Biopython in more detail, there is room for that.

For us, the most important thing is to identify concrete Python modules and functions that help you to get your research done.

Participants are encouraged to submit a description of their research topic and/or the questions they would like to answer with Biopython. Additionally, participants can bring their own data that they would like to process in Python to the training.




See the TRAINING AT VIB website for a detailed schedule of this training.

Training material


Other links:

  • Biopython website: download, cookbook and tutorial
  • Exercises on the Rosalind website (registration required): a large repository of bioinformatics problems that can be solved by writing (Bio)python scripts, e.g. counting nucleotides in DNA, transcribing DNA, writing complementary strands, calculating GC content, counting mutations...
  • Matplotlib (Python plotting library)
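
To give a feel for the kind of Rosalind-style exercises mentioned in the links above, here is a minimal sketch in plain Python. The function names are hypothetical and chosen only for illustration, and the code assumes unambiguous DNA made of A, C, G and T. It is not the course material itself. Biopython's Bio.Seq module offers comparable operations (for example transcribe and reverse_complement on Seq objects), which the training covers in its sequence part.

```python
from collections import Counter

# Translation table for complementing unambiguous DNA bases (assumption:
# input contains only A, C, G, T; ambiguity codes are not handled).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_nucleotides(dna):
    """Return a dict with the number of A, C, G and T in the sequence."""
    counts = Counter(dna.upper())
    return {base: counts[base] for base in "ACGT"}


def transcribe(dna):
    """Transcribe a coding-strand DNA sequence into RNA (T becomes U)."""
    return dna.upper().replace("T", "U")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def gc_content(dna):
    """Return the fraction of G and C bases (0.0 for an empty sequence)."""
    dna = dna.upper()
    if not dna:
        return 0.0
    return (dna.count("G") + dna.count("C")) / len(dna)


def count_mutations(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    example = "AAAACCCGGT"  # made-up example sequence
    print(count_nucleotides(example))
    print(transcribe(example))
    print(reverse_complement(example))
    print(gc_content(example))
    print(count_mutations("GATTACA", "GACTATA"))
```

Writing such helpers by hand is a useful warm-up; for real data it is usually preferable to rely on a tested library such as Biopython.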
Scientific topics: Software engineering
Target audience: Life science researchers, PhD students, postdocs, beginner bioinformaticians