mustang.Rd
Create a structure-based multiple sequence alignment from a set of PDB files.
mustang(files, exefile="mustang", outfile="aln.mustang.fa", cleanpdb=FALSE, cleandir="mustangpdbs", verbose=TRUE)
files | a character vector of PDB file names. |
---|---|
exefile | file path to the ‘MUSTANG’ program on your system (i.e. how ‘MUSTANG’ is invoked). |
outfile | name of ‘FASTA’ output file to which alignment should be written. |
cleanpdb | logical, if TRUE iterate over the PDB files and map non-standard residues to standard residues (e.g. SEP->SER) to produce ‘clean’ PDB files. |
cleandir | character string specifying the directory in which the ‘clean’ PDB files should be written. |
verbose | logical, if TRUE ‘MUSTANG’ warning and error messages are printed. |
Structure-based sequence alignment with ‘MUSTANG’ attempts to arrange and align the sequences of proteins based on their 3D structure.
This function calls the ‘MUSTANG’ program to perform a multiple structure alignment. ‘MUSTANG’ MUST BE INSTALLED on your system and be in the search path for executables.
Note that non-standard residues are mapped to “Z” in MUSTANG. As a workaround, the bio3d ‘mustang’ function can map non-standard residues to standard residues (e.g. SEP->SER, etc.) before alignment when ‘cleanpdb=TRUE’. With the default ‘cleanpdb=FALSE’ no such mapping is attempted.
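The residue clean-up described above is essentially a name lookup. The following base-R sketch illustrates the idea only. The helper `clean_resid` and the table `nonstd_map` are hypothetical names introduced here, and the table lists just a few well-known modified residues, not the mapping bio3d uses internally.

```r
# Illustrative lookup table: modified residue -> standard parent residue.
# ASSUMPTION: this is not bio3d's internal table, only a few common cases.
nonstd_map <- c(SEP = "SER",  # phosphoserine
                TPO = "THR",  # phosphothreonine
                PTR = "TYR",  # phosphotyrosine
                MSE = "MET")  # selenomethionine

# Hypothetical helper: replace known non-standard residue names and
# leave every other name untouched.
clean_resid <- function(resid, map = nonstd_map) {
  hit <- resid %in% names(map)
  resid[hit] <- unname(map[resid[hit]])
  resid
}

clean_resid(c("ALA", "SEP", "GLY", "TPO"))
# [1] "ALA" "SER" "GLY" "THR"
```

Residue names absent from the table pass through unchanged, so MUSTANG would still report them as “Z”.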
A list with two components:
an alignment character matrix with a row per sequence and a column per equivalent amino acid.
sequence names as identifiers.
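To make the shape of the returned value concrete, here is a toy stand-in built by hand in base R. The component names `ali` and `id` are assumptions made for illustration (they are not stated above), and the residues are invented rather than taken from a real alignment.

```r
# Toy stand-in for an alignment result.
# ASSUMPTION: component names 'ali' and 'id' are illustrative;
# the residues below are made up, not real MUSTANG output.
aln <- list(
  ali = rbind(c("A", "S", "Y", "-", "K"),
              c("A", "T", "Y", "P", "K"),
              c("-", "T", "F", "P", "K")),
  id  = c("1a70_A", "1czp_A", "1frd_A")
)
rownames(aln$ali) <- aln$id

# One row per sequence, one column per alignment position
dim(aln$ali)

# Fraction of gap characters in each sequence
gap_frac <- rowMeans(aln$ali == "-")

# Alignment columns in which no sequence has a gap
ungapped <- colSums(aln$ali == "-") == 0
```

Indexing the matrix by column, for example `aln$ali[, ungapped]`, keeps only the positions where every structure contributes a residue.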
Grant, B.J. et al. (2006) Bioinformatics 22, 2695-2696.
‘MUSTANG’ is the work of Konagurthu et al.: Konagurthu, A.S. et al. (2006) Proteins 64(3), 559-574.
More details of the ‘MUSTANG’ algorithm, along with download and installation instructions, can be obtained from:
http://www.csse.monash.edu.au/~karun/Site/mustang.html.
Lars Skjaerven
A system call is made to the ‘MUSTANG’ program, which must be installed on your system and in the search path for executables.
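Before calling the function, it can help to confirm that the executable is actually reachable. The sketch below uses only base R (`Sys.which`). The helper name `has_executable` is hypothetical and is not part of bio3d.

```r
# Hypothetical helper (not part of bio3d): TRUE if 'exefile' resolves
# to a program on the search path, or is a valid path to one.
has_executable <- function(exefile = "mustang") {
  path <- Sys.which(exefile)        # "" when nothing is found
  isTRUE(unname(nzchar(path)))
}

if (!has_executable("mustang")) {
  message("MUSTANG not found on the search path; ",
          "install it or pass its full path via 'exefile'.")
}
```

If the program lives outside the search path, its full path can be supplied through the ‘exefile’ argument instead.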
if (FALSE) {
  if (!check.utility('mustang')) {
    message('Need MUSTANG installed to run this example')
  } else {

    ## Fetch PDB files and split to chain A only PDB files
    ids <- c("1a70_A", "1czp_A", "1frd_A")
    files <- get.pdb(ids, split = TRUE, path = tempdir())

    ##-- Or, read a folder/directory of existing PDB files
    #pdb.path <- "my_dir_of_pdbs"
    #files <- list.files(path=pdb.path ,
    #                    pattern=".pdb",
    #                    full.names=TRUE)

    ##-- Align these PDB sequences
    aln <- mustang(files)

    ##-- Read Aligned PDBs storing coordinate data
    pdbs <- read.fasta.pdb(aln)
  }
}