pdbsplit. bio3d 2.3-0

Usage

pdbsplit(pdb.files, ids = NULL, path = "split_chain", overwrite=TRUE, verbose = FALSE, mk4=FALSE, ncore = 1, ...)

Arguments

pdb.files: a character vector of PDB file names.
ids: a character vector of PDB and chain identifiers (of the form: ‘pdbId_chainId’, e.g. ‘1bg2_A’). Used for filtering chain IDs for output (in the above example only chain A would be produced).
path: output path for chain-split files.
overwrite: logical, if FALSE input PDB files are neither read nor re-written when their split files already exist.
verbose: logical, if TRUE details of the PDB header and chain selections are printed.
mk4: logical, if TRUE output filenames will use only the first four characters of the input filename (see basename.pdb for details).
ncore: number of CPU cores used for the calculation. ncore>1 requires the ‘parallel’ package to be installed.
...: additional arguments to read.pdb. Useful e.g. for parsing multi model PDB files, including ALT records etc. in the output files.

Description

Split a Protein Data Bank (PDB) coordinate file into new separate files with one file for each chain.
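
The idea of chain splitting can be illustrated in base R. The PDB format stores the chain identifier in column 22 of each ATOM/HETATM record, so records can be grouped on that column and written out one file per chain. The following is a minimal sketch only: the coordinates, the `demo` file prefix and the output directory name are invented for illustration, and this is not a description of how pdbsplit itself is implemented.

```r
# Build four toy ATOM records (two chains) using fixed-width PDB columns.
# Coordinates and the "demo" prefix are invented for illustration.
chain <- c("A", "A", "B", "B")
atoms <- sprintf("%-6s%5d %-4s%1s%3s %1s%4d%1s   %8.3f%8.3f%8.3f%6.2f%6.2f",
                 "ATOM", 1:4, " CA ", "", "ALA", chain, c(1L, 2L, 1L, 2L), "",
                 c(0, 3.8, 10, 13.8), 0, 0, 1, 0)

# The chain ID sits in column 22 of ATOM/HETATM records.
by.chain <- split(atoms, substr(atoms, 22, 22))

# Write one file per chain, named <prefix>_<chain>.pdb.
out.dir <- file.path(tempdir(), "split_chain_demo")
dir.create(out.dir, showWarnings = FALSE)
out.files <- file.path(out.dir, paste0("demo_", names(by.chain), ".pdb"))
for (i in seq_along(by.chain)) {
  writeLines(c(by.chain[[i]], "END"), out.files[i])
}
basename(out.files)
```

In practice pdbsplit also handles headers, multi-model records and the ‘ids’ filter, none of which this sketch attempts.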

Details

This function produces single-chain PDB files from multi-chain input files. By default, files are written for every chain and all of their names are returned. To output only selected chains, supply the optional ‘ids’ argument to filter the output (e.g. to write only chain C of a PDB file while ignoring chains A and B). See the Examples section for further details.
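
The filtering behaviour of ‘ids’ can be sketched in base R, using the same identifiers as the Examples section. Here `keep.ids` is a hypothetical helper written for this illustration, and `available` lists only a subset of chains; the matching rules inside pdbsplit (for instance, case handling) may differ.

```r
# Candidate pdbId_chainId labels (a subset, for illustration).
available <- c("1YX5_A", "1YX5_B", "3NOB_A", "3NOB_B", "1F9J_A", "1F9J_B")
ids <- c("1YX5_A", "3NOB_B", "1F9J")

# keep.ids() is a hypothetical helper: an id with a chain part must match
# exactly, while a bare PDB id matches every chain of that entry.
keep.ids <- function(available, ids) {
  pdb.part <- sub("_.*$", "", available)
  available[available %in% ids | pdb.part %in% ids]
}

keep.ids(available, ids)
```

The bare identifier "1F9J" keeps both of its chains, which mirrors the last example on this page.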

Note that multi-model atom records are split into individual PDB files only if multi=TRUE is supplied (it is passed on to read.pdb via ‘...’); otherwise they are omitted. See the Examples section.

Value

Returns a character vector of chain-split file names.
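
The returned names can be mapped back to PDB ID, chain and model number with base R string functions. This is a rough sketch that assumes file names follow the pattern seen in the example output on this page (`pdbId_chainId.pdb`, or `pdbId_chainId.model.pdb` for multi-model input); input file names that themselves contain underscores or extra dots would need different handling.

```r
# File names copied from the example output on this page.
chain.files <- c("1YX5_A.01.pdb", "1YX5_A.02.pdb", "1YX5_B.01.pdb", "3NOB_A.pdb")

# Strip directory and extension, then pull out the three name parts.
stem  <- sub("\\.pdb$", "", basename(chain.files))
parts <- data.frame(
  pdb   = sub("_.*$", "", stem),
  chain = sub("^[^_]*_([^.]*).*$", "\\1", stem),
  # Model number is present only for multi-model splits.
  model = ifelse(grepl("\\.", stem), sub("^.*\\.", "", stem), NA_character_),
  stringsAsFactors = FALSE
)
parts
```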

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695-2696.

For a description of the PDB format (version 3.3) see: http://www.wwpdb.org/documentation/format33/v3.3.html.

Examples

  ## Save separate PDB files for each chain of a local or on-line file
  pdbsplit( get.pdb("2KIN", URLonly=TRUE) )


  |======================================================================| 100%

[1] "split_chain/2KIN_A.pdb" "split_chain/2KIN_B.pdb"



  ## Split several PDBs by chain ID and multi-model records
  raw.files <- get.pdb( c("1YX5", "3NOB") , URLonly=TRUE)
  chain.files <- pdbsplit(raw.files,  path=tempdir(), multi=TRUE)


  |======================================================================| 100%

  basename(chain.files)

 [1] "1YX5_A.01.pdb" "1YX5_A.02.pdb" "1YX5_A.03.pdb" "1YX5_A.04.pdb"
 [5] "1YX5_A.05.pdb" "1YX5_A.06.pdb" "1YX5_A.07.pdb" "1YX5_A.08.pdb"
 [9] "1YX5_A.09.pdb" "1YX5_A.10.pdb" "1YX5_A.11.pdb" "1YX5_A.12.pdb"
[13] "1YX5_A.13.pdb" "1YX5_A.14.pdb" "1YX5_A.15.pdb" "1YX5_A.16.pdb"
[17] "1YX5_A.17.pdb" "1YX5_A.18.pdb" "1YX5_B.01.pdb" "1YX5_B.02.pdb"
[21] "1YX5_B.03.pdb" "1YX5_B.04.pdb" "1YX5_B.05.pdb" "1YX5_B.06.pdb"
[25] "1YX5_B.07.pdb" "1YX5_B.08.pdb" "1YX5_B.09.pdb" "1YX5_B.10.pdb"
[29] "1YX5_B.11.pdb" "1YX5_B.12.pdb" "1YX5_B.13.pdb" "1YX5_B.14.pdb"
[33] "1YX5_B.15.pdb" "1YX5_B.16.pdb" "1YX5_B.17.pdb" "1YX5_B.18.pdb"
[37] "3NOB_A.pdb"    "3NOB_B.pdb"    "3NOB_C.pdb"    "3NOB_D.pdb"   
[41] "3NOB_E.pdb"    "3NOB_F.pdb"    "3NOB_G.pdb"    "3NOB_H.pdb"   



  ## Output only desired pdbID_chainID combinations
  ## for the last entry (1F9J), fetch all chains
  ids <- c("1YX5_A", "3NOB_B", "1F9J")
  raw.files <- get.pdb( ids , URLonly=TRUE)
  chain.files <- pdbsplit(raw.files, ids, path=tempdir())


  |======================================================================| 100%

  basename(chain.files)

[1] "1YX5_A.pdb" "3NOB_B.pdb" "1F9J_A.pdb" "1F9J_B.pdb"

Author

Barry Grant

Split a PDB File Into Separate Files, One For Each Chain.