Add a Sequence to an Existing Alignment

Usage

seq2aln(seq2add, aln, id = "seq", file = "aln.fa", ...)

Arguments

seq2add
a sequence character vector, or an alignment list object with id and ali components, similar to that generated by read.fasta and seqaln.
aln
an alignment list object with id and ali components, similar to that generated by read.fasta and seqaln.
id
a vector of sequence names to serve as sequence identifiers.
file
name of ‘FASTA’ output file to which alignment should be written.
...
additional arguments passed to seqaln.

Description

Add one or more sequences to an existing multiple alignment that you wish to keep intact.

Details

This function calls the ‘MUSCLE’ program to perform a profile-profile alignment. ‘MUSCLE’ MUST BE INSTALLED on your system and be in the search path for executables.
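Because the alignment depends on an external program, it can help to confirm that the executable is visible to R before calling the function. The following is a minimal sketch using only base R; the helper name `has_executable` is hypothetical (it is not part of the package), and the executable name "muscle" is an assumption, since some installations use a versioned file name instead.

```r
# Hypothetical helper (not part of the package): report whether an
# executable can be found on the search path. The default name "muscle"
# is an assumption; adjust it to match your installation.
has_executable <- function(exe = "muscle") {
  path <- Sys.which(exe)      # named character vector; "" when not found
  nzchar(unname(path))
}

if (!has_executable("muscle")) {
  message("No 'muscle' executable found on the search path; ",
          "seq2aln() would likely fail.")
}
```

If the check fails, installing ‘MUSCLE’ or adding its directory to the PATH environment variable should resolve it.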

Value

A list with two components:
ali
an alignment character matrix with one row per sequence and one column per equivalent amino acid/nucleotide.

id
sequence names as identifiers.
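To illustrate the shape of such a list, here is a small sketch in base R. The data are invented for illustration only and are not real output from seq2aln; the object name `toy` is hypothetical.

```r
# Toy data, invented for illustration only (not real seq2aln output).
ali <- rbind(
  c("A", "C", "-", "G"),
  c("A", "C", "T", "G"),
  c("-", "C", "T", "G")
)
toy <- list(ali = ali, id = c("seq1", "seq2", "seq"))
rownames(toy$ali) <- toy$id

dim(toy$ali)          # rows = sequences, columns = alignment positions
toy$ali["seq", ]      # the row belonging to the sequence labelled "seq"

# Count gap characters in each row of the alignment matrix.
gaps_per_seq <- rowSums(toy$ali == "-")
```

Indexing the ali matrix by row gives one aligned sequence; indexing by column gives the residues that the alignment treats as equivalent.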

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695-2696.

‘MUSCLE’ is the work of Edgar: Edgar (2004) Nucleic Acids Res. 32, 1792-1797.

Full details of the ‘MUSCLE’ algorithm, along with download and installation instructions, can be obtained from: http://www.drive5.com/muscle.

Note

A system call is made to the ‘MUSCLE’ program, which must be installed on your system and in the search path for executables.

Examples

aa.1 <- pdbseq( read.pdb("1bg2") )
#> Note: Accessing on-line PDB file
aa.2 <- pdbseq( read.pdb("3dc4") )
#> Note: Accessing on-line PDB file
aa.3 <- pdbseq( read.pdb("1mkj") )
#> Note: Accessing on-line PDB file
aln <- seqaln( seqbind(aa.1,aa.2) )
seq2aln(aa.3, aln)
#>      1        .         .         .         .         .         .         70
#> seq1 -DLAECNIKVMCRFRPLNESEVNRGDKYIAKFQGEDTVVIASKPYAFDRVFQSSTSQEQVYNDCAKKIVK
#> seq2 --AKLSAVRIAVREAPYRPSVVQ-----FPPWSDGKSLIVDQNEFHFDHAFPATISQDEMYQALILPLVD
#> seq  ADLAECNIKVMCRFRPLNESEVNRGDKYIAKFQGEDTVVIASKPYAFDRVFQSSTSQEQVYNDCAKKIVK
#>      1        .         .         .         .         .         .         70
#>
#>      71       .         .         .         .         .         .         140
#> seq1 DVLEGYNGTIFAYGQTSSGKTHTM----EGKLHDPEGMGIIPRIVQDIFNYIYSMDENLEFHIKV--SYF
#> seq2 KLLEGFQCTALAYGQTGTGKSYSMGMTPPGEIL-PEHLGILPRALGDIFERVTARQENNKDAIQVYASFI
#> seq  DVLEGYNGTIFAYGQTSSGKTHTM----EGKLHDPEGMGIIPRIVQDIFNYIYSMDENLEFHIKV--SYF
#>      71       .         .         .         .         .         .         140
#>
#>      141      .         .         .         .         .         .         210
#> seq1 EIYLDKIRDLLDVSKTNLSVHEDKNRVPYVKGCTERFVCSPDEVMDTIDEGKSNRHVAVTNMNEHSSRSH
#> seq2 EIYNEKPFDLLGSTP----------HMPMVAARCQRCTCLP------LHSQADLHHILELGTRNRRSRSH
#> seq  EIYLDKIRDLLDVSKTNLSVHEDKNRVPYVKGCTERFVCSPDEVMDTIDEGKSNRHVAVTNMNEHSSRSH
#>      141      .         .         .         .         .         .         210
#>
#>      211      .         .         .         .         .         .         280
#> seq1 SIFLINVKQENTQTEQKLSGKLYLVDLAGSEKVSKTGAEGAVLDEAKNINKSLSALGNVISALAEGSTYV
#> seq2 AIVTIHVKSKTHHS------RMNIVDLAGSEGVV-------------NINLGLLSINKVVMSMAAGHTVI
#> seq  SIFLINVKQENTQTEQKLSGKLYLVDLAGSEKVA------------KNINKSLSALGNVISALAEGSTYV
#>      211      .         .         .         .         .         .         280
#>
#>      281      .         .         .         .         .         .         350
#> seq1 PYRDSKMTRILQDSLGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTI--------------------
#> seq2 PYRDSVLTTVLQASLTAQSYLTFLACISPHQCDLSETLSTLRFGTSAKAAALEH----------------
#> seq  PYRDSKMTRILQDSLGGNCRTTIVICCSPSSYNESETKSTLLFGQRAKTIKNTVCVNVELTAEQWKKKYE
#>      281      .         .         .         .         .         .         350
#>
#>      351 354
#> seq1 ----
#> seq2 ----
#> seq  KEKE
#>      351 354
#>
#> Call:
#>   seq2aln(seq2add = aa.3, aln = aln)
#>
#> Class:
#>   fasta
#>
#> Alignment dimensions:
#>   3 sequence rows; 354 position columns (281 non-gap, 73 gap)
#>
#> + attr: id, ali, call

See also

seqaln, read.fasta, read.fasta.pdb, seqbind

Author

Barry Grant