Sequence Alignment of Identical Protein Sequences

Create multiple alignments of amino acid sequences according to the method of Edgar.

seqaln.pair(aln, ...)

Arguments

aln	a sequence character matrix, as obtained from `seqbind`, or an alignment list object as obtained from `read.fasta`.
...	additional arguments for the function `seqaln`.

Details

This function is intended for the alignment of identical sequences only. For standard alignment, see the related function `seqaln`.

This function is useful for determining the equivalences between sequences and structures. For example, when aligning a PDB sequence to an existing multiple sequence alignment, one would first mask the alignment sequences and then run the alignment to determine equivalences.

Value

A list with two components:

ali

an alignment character matrix with a row per sequence and a column per equivalent amino acid/nucleotide.

ids

sequence names as identifiers.
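
As an illustration of how the returned alignment matrix might be used to read off equivalences, the following base R sketch finds the gap-free columns and converts them to residue positions in each sequence. It does not call `seqaln.pair` or MUSCLE: the matrix is a hand-built stand-in copied from the example output further down this page, and the names `is_gap`, `equiv_cols`, `seq_index`, and `equiv` are hypothetical helpers, not part of the package. It also assumes that a column counts as an equivalence whenever neither row holds a gap.

```r
# Stand-in for the 'ali' component (assumption: copied from the example output,
# not produced by running seqaln.pair here).
ali <- rbind(seq1 = strsplit("XCXXA-G--K", "")[[1]],
             seq2 = strsplit("-C--AXGXXK", "")[[1]])

# Columns where neither sequence has a gap are treated as equivalent positions.
is_gap <- ali == "-"
equiv_cols <- which(colSums(is_gap) == 0)

# Running count of non-gap characters gives each column's residue index
# within the corresponding ungapped sequence.
seq_index <- t(apply(!is_gap, 1, cumsum))

equiv <- data.frame(column  = equiv_cols,
                    seq1    = as.integer(seq_index["seq1", equiv_cols]),
                    seq2    = as.integer(seq_index["seq2", equiv_cols]),
                    residue = ali["seq1", equiv_cols])
print(equiv)
```

Each row of `equiv` pairs an alignment column with the matching residue positions in the two ungapped sequences, which is the kind of lookup needed when mapping a PDB sequence onto an existing alignment.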

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695--2696.

‘MUSCLE’ is the work of Edgar: Edgar (2004) Nucleic Acids Res. 32, 1792--1797.

Full details of the ‘MUSCLE’ algorithm, along with download and installation instructions, can be obtained from:
http://www.drive5.com/muscle.

Author

Barry Grant

Note

A system call is made to the ‘MUSCLE’ program, which must be installed on your system and in the search path for executables.

Examples


## NOTE: FOLLOWING EXAMPLE NEEDS MUSCLE INSTALLED
if(check.utility("muscle")) {

##- Aligning a PDB sequence to an existing sequence alignment


##- Simple example
aln <- seqbind(c("X","C","X","X","A","G","K"),
               c("C","-","A","X","G","X","X","K"))

seqaln.pair(aln, outfile = tempfile())

}
#>        1        10 
#> seq1   XCXXA-G--K
#> seq2   -C--AXGXXK
#>        ^*^^*^*^^* 
#>        1        10 
#> 
#> Call:
#>   seqaln.pair(aln = aln, outfile = tempfile())
#> 
#> Class:
#>   fasta
#> 
#> Alignment dimensions:
#>   2 sequence rows; 10 position columns (4 non-gap, 6 gap) 
#> 
#> + attr: id, ali, call