Bio3D-web is a new online application, built on top of the Bio3D package, for the user-friendly investigation of protein structure ensembles.

Major functionality allows you to map and explore the structural, conformational and internal dynamic properties of proteins for which there are high resolution structures available.


Why an R package?

Bio3D aims to leverage the extensive graphical and statistical capabilities of the R environment and thus provide a useful integrated framework for the exploratory interactive analysis of biomolecular sequence and structure data.

Do I need to know R?

To get the most out of Bio3D you should be quite familiar with basic R usage. Some newcomers to R find this a steep learning curve. However, once you have mastered basic operations with vectors and matrices in R you should feel confident about getting stuck into using the Bio3D package.
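As a rough idea of what "basic operations with vectors and matrices" means here, the following sketch uses only base R (no Bio3D functions). The variable names are made up for illustration.

```r
# A numeric vector and some element-wise arithmetic
x <- c(1, 2, 3, 4)
y <- x * 2               # multiplies every element by 2
total <- sum(x)          # adds up the elements

# Indexing: R counts from 1, and logical tests can select elements
first_two <- x[1:2]
big <- x[x > 2]

# A 2 x 2 matrix, filled column by column
m <- matrix(c(1, 2, 3, 4), nrow = 2)
m_t <- t(m)              # transpose
m_sq <- m %*% m          # matrix product (not element-wise)
second_col <- m[, 2]     # all rows, second column
```

If lines like these read naturally to you, you probably have enough R to start working through the Bio3D tutorials.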

There are now numerous online resources that can help you get started using R. A number of these can be found from the main R website. We particularly like the following:

  • Try R: an interactive R tutorial in your web browser
  • An introduction to R: The official R manual
  • Learn R: Learn by doing in your web browser (requires free registration)

There are also several books providing valuable skills in R:

What can I do with Bio3D?

Features include the ability to read and write biomolecular structure, sequence and dynamic trajectory data, query and search online sequence and structure databases, perform atom selection, re-orientation, superposition, rigid core identification, clustering, distance matrix analysis, alignment, conservation analysis, normal mode analysis, principal component analysis, and many other common sequence and structural analysis tasks.

How can I install Bio3D?

R and Bio3D can be installed on all commonly used architectures and operating systems such as Linux, Mac OS X, and Windows. The Bio3D Installation guide provides a thorough description of the process, which can be summarized into one line of code from the R console.
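The one-line install command itself is not reproduced on this page. Assuming Bio3D is installed from CRAN under the package name `bio3d`, it would typically look like the following sketch; the `dependencies = TRUE` argument is an assumption about what you want pulled in, not a requirement.

```r
# Assumption: Bio3D is installed from CRAN under the package name "bio3d".
# dependencies = TRUE also installs the packages that bio3d depends on or suggests.
install.packages("bio3d", dependencies = TRUE)
```

After installation, `library(bio3d)` loads the package in an R session. The Installation guide remains the place to check for platform-specific requirements.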


Where can I find more information?

From the main Bio3D website you can find the latest version of the package, full documentation, and extensive vignettes and tutorials.

Where can I get help?

We provide a Q&A and issue tracker system at our Bitbucket site.

How can I contribute my ideas and code?

We are always interested in adding functionality to Bio3D. If you have ideas, suggestions or code that you would like to distribute as part of this package, please contact us (see below). You are also encouraged to contribute your code or issues directly to this repository for incorporation into the development version of the package. For details on how to do this, please see the developer wiki.




  • Installing Bio3D: Detailed instructions for installing Bio3D with required libraries and suggested packages.
  • Getting started: Hands-on introduction to Bio3D for new users of the R environment.
  • Beginning structure analysis: Explore, analyze and manipulate PDB structures.
  • Trajectory analysis: Task-oriented introduction to basic molecular dynamics (MD) trajectory analysis.
  • Principal component analysis: Introduction to principal component analysis (PCA) of multiple PDB structures.
  • Normal modes analysis: Worked examples of single structure normal modes analysis (NMA).
  • Ensemble normal mode analysis I: Performing NMA on heterogeneous multiple structure sets, part 1.
  • Ensemble normal mode analysis II: Performing NMA on heterogeneous multiple structure sets, part 2.
  • Protein structure network analysis: Detailed introduction to correlation network analysis from both MD and NMA.
  • Biomolecular structure visualization: Introduction to interactive 3D structure visualization with Bio3D.



Why should I use R?

If you are new to R and Bio3D then a couple of questions might naturally arise:

  1. What is R?
  2. What are the pros and cons of using R? 
  3. Why use it instead of, say, a spreadsheet application or an application such as Matlab?

R is an environment for data analysis

R is a powerful environment and programming language for the analysis of numerical data. While there are many other common applications that will allow you to manipulate lists of numbers (e.g., spreadsheet programs), R also makes it easy to calculate a wide range of statistical quantities, provides a powerful environment for performing numerical simulations, and has fantastic graphics capabilities. Also, R is free!


What R lacks in apparent user-friendliness, it more than makes up for in power. While there is certainly a learning curve associated with developing the skills you will need to perform analyses in R, this is really true of any advanced software package that you will use. Once you acquire some of the basics, you will find that using R is logical and simple.


The language used by R is a "dialect" of the S statistical programming language. To quote John Chambers (major contributor and developer of the S language), “S is a programming language and environment for all kinds of computing involving data. It has a simple goal: to turn ideas into software, quickly and faithfully.” 

  • R is a free software implementation of the S language
  • R was first developed by R. Gentleman and R. Ihaka (U of Auckland, NZ) during the 1990s
  • R has developed into an advanced statistical computing system, freely available for most computing platforms.
  • Updated versions are available every 3-4 months 


The Pros and Cons of R

Pros include:

  1. Powerful, state-of-the-art
  2. Used by professional statisticians
  3. Lots of documentation
  4. Learn by example
  5. Easy to extend, modify and improve with add-on packages
  6. Freely available for Unix, Windows & Mac
  7. Extendable, with numerous add-on packages available.
  8. Programmable: if R can’t do a particular task, you can program R to do it.
  9. R produces publication quality graphics.

R has a remarkable online presence in the form of help lists, tutorials, etc. which will facilitate solving the problems you inevitably run into in the course of your research. R represents the state-of-the-art in statistical computing. 


Cons include:

  • Not very easy to learn (many details)
  • Easy to forget
  • Sometimes forced to learn by example
  • Documentation sometimes cryptic
  • Not very (easily) interactive in the Excel point and click sense
  • Command-based
  • Still evolving: backward-compatibility has been an issue
  • Slow at times when compared to a dedicated implementation (in C, for example) of a particular task.

If you “just want to do basic statistical analysis” then it's easy to find alternatives.

If you intend to do exploratory data analysis, such as protein structural bioinformatics or other bioinformatics tasks including expression analysis, then it's probably one of the best options.


Why Not a Spreadsheet?

While a spreadsheet is handy for manually entering and viewing small amounts of data along with guiding basic calculations, it is not ideal for more advanced problems on large datasets.

Examples include calculating an eigenvalue or numerically solving ordinary differential equations. These are simple tasks in an environment such as R or Matlab, but do not exist (to the best of my knowledge) in most common spreadsheet applications.
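To illustrate the eigenvalue case, here is a minimal base R sketch using the built-in `eigen()` function. The matrix is chosen purely for illustration and has nothing to do with Bio3D itself.

```r
# A small symmetric matrix, chosen only as an illustration
a <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)

# eigen() returns a list with eigenvalues ($values) and eigenvectors ($vectors)
e <- eigen(a)
e$values    # eigenvalues, largest first

# Check the defining relation A v = lambda v for the first eigenpair
v1 <- e$vectors[, 1]
residual <- a %*% v1 - e$values[1] * v1
max(abs(residual))    # should be numerically close to zero
```

The whole calculation is a single function call, whereas a typical spreadsheet has no built-in equivalent.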


Why Not Matlab?


