read.pqr. bio3d 2.3-0

Usage

read.pqr(file, maxlines = -1, multi = FALSE, rm.insert = FALSE, rm.alt = TRUE, verbose = TRUE)

Arguments

file: the name of the PQR file to be read.
maxlines: the maximum number of lines to read before giving up with large files. By default it will read up to the end of input on the connection.
multi: logical, if TRUE multiple ATOM records are read for all models in multi-model files.
rm.insert: logical, if TRUE PDB insert records are ignored.
rm.alt: logical, if TRUE PDB alternate records are ignored.
verbose: print details of the reading process.

Description

Read a PQR coordinate file.

Details

The PQR file format is essentially the same as the PDB format except for the o and b fields. In PDB, these two fields hold ‘Occupancy’ and ‘B-factor’ values, respectively, and each field is 6 columns wide. In PQR, they hold the atomic ‘partial charge’ and ‘radius’ values, respectively, and each field is 8 columns wide.
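
The difference in field widths can be illustrated with a minimal base-R sketch. It assumes (this page does not state it) that both fields start at column 55, directly after the z coordinate, as in the standard PDB layout. The helper parse_ob() is hypothetical and not part of bio3d, and the charge and radius values are made up for illustration.

```r
# First 54 columns of an ATOM record; coordinates are those of the
# first atom shown in the Examples section.
prefix <- paste0(
  sprintf("%-6s%5d", "ATOM", 1L),                       # columns 1-11
  "  N   MET A   1    ",                                # columns 12-30
  sprintf("%8.3f%8.3f%8.3f", 64.080, 50.529, 32.509))   # columns 31-54
stopifnot(nchar(prefix) == 54)

# PDB style: occupancy and B-factor, 6 columns each.
pdb_line <- paste0(prefix, sprintf("%6.2f%6.2f", 1.00, 28.66))
# PQR style: charge and radius (made-up values), 8 columns each.
pqr_line <- paste0(prefix, sprintf("%8.4f%8.4f", -0.3000, 1.8240))

# Hypothetical helper: cut the two fields out by position.
# Assumes the first field starts at column 55.
parse_ob <- function(line, width) {
  start <- 55
  o <- substr(line, start, start + width - 1)
  b <- substr(line, start + width, start + 2 * width - 1)
  c(o = as.numeric(o), b = as.numeric(b))
}

parse_ob(pdb_line, width = 6)
parse_ob(pqr_line, width = 8)
```

Reading a PQR line with the 6-column PDB widths would split the fields in the wrong places, which is why a dedicated reader is needed.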

maxlines may require increasing for some large multi-model files. The preferred means of reading such data is via binary DCD format trajectory files (see the read.dcd function).
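
The default of -1 follows the convention of base R's readLines(), where a negative n means "read to the end of input". The sketch below uses only readLines() on a throwaway file to show the effect of a line cap. That read.pqr hands maxlines to readLines() internally is an assumption here, not something this page states.

```r
# Write a small throwaway file with five lines.
tmp <- tempfile(fileext = ".txt")
writeLines(sprintf("line %d", 1:5), tmp)

# A negative n reads to the end of input; a positive n stops early.
all_lines   <- readLines(tmp, n = -1)
first_lines <- readLines(tmp, n = 2)

length(all_lines)    # 5
length(first_lines)  # 2

unlink(tmp)
```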

Value

atom: a data.frame containing all atomic coordinate ATOM and HETATM data, with a row per ATOM/HETATM and a column per record type. See below for details of the record type naming convention (useful for accessing columns).
helix: ‘start’, ‘end’ and ‘length’ of H type sse, where start and end are residue numbers “resno”.
sheet: ‘start’, ‘end’ and ‘length’ of E type sse, where start and end are residue numbers “resno”.
seqres: sequence from SEQRES field.
xyz: a numeric matrix of class "xyz" containing the ATOM and HETATM coordinate data.
calpha: logical vector with length equal to nrow(atom) with TRUE values indicating a C-alpha “elety”.
call: the matched call.
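
The xyz component stores the coordinates as a single row in the order x1, y1, z1, x2, y2, z2, and so on, which is why the object in the Examples section has 1447 atoms and 4341 xyz values. A minimal base-R sketch of the resulting index relationship follows. The helper name atom_to_xyz() is hypothetical, and the coordinates are those of the first three atoms in the Examples output.

```r
# Coordinates of the first three atoms, one row per atom.
atom <- data.frame(x = c(64.080, 64.044, 63.722),
                   y = c(50.529, 51.615, 52.849),
                   z = c(32.509, 33.423, 32.671))

# Flatten row by row into a 1 x 3N matrix: x1, y1, z1, x2, y2, z2, ...
xyz <- matrix(as.vector(t(as.matrix(atom))), nrow = 1)

# Hypothetical helper: atom i occupies xyz columns 3i-2, 3i-1 and 3i.
atom_to_xyz <- function(i) as.vector(rbind(3 * i - 2, 3 * i - 1, 3 * i))

atom_to_xyz(2)                # 4 5 6
xyz[, atom_to_xyz(c(1, 3))]   # coordinates of atoms 1 and 3
```

This is the same relationship that links the atom and xyz indices returned by atom.select() in the Examples section.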

References

Grant, B.J. et al. (2006) Bioinformatics 22, 2695--2696.

For a description of PDB format (version 3.3) see: http://www.wwpdb.org/documentation/format33/v3.3.html.

Note

For the atom component, the column names can be used as a convenient means of data access, namely: Atom serial number “eleno”, Atom type “elety”, Alternate location indicator “alt”, Residue name “resid”, Chain identifier “chain”, Residue sequence number “resno”, Code for insertion of residues “insert”, Orthogonal coordinates “x”, Orthogonal coordinates “y”, Orthogonal coordinates “z”, Occupancy “o”, and Temperature factor “b”. In a PQR file, the “o” and “b” columns hold the partial charge and radius values instead (see Details). See examples for further details.

Examples


# PDB server connection required - testing excluded

# Read a PDB file and write it as a PQR file
pdb <- read.pdb( "4q21" )

  Note: Accessing on-line PDB file

outfile <- file.path(tempdir(), "eg.pqr")
write.pqr(pdb=pdb, file = outfile)

# Read the PQR file
pqr <- read.pqr(outfile)

## Print a brief composition summary
pqr


 Call:  read.pqr(file = outfile)

   Total Models#: 1
     Total Atoms#: 1447,  XYZs#: 4341  Chains#: 1  (values: A)

     Protein Atoms#: 1340  (residues/Calpha atoms#: 168)
     Nucleic acid Atoms#: 0  (residues/phosphate atoms#: 0)

     Non-protein/nucleic Atoms#: 107  (residues: 80)
     Non-protein/nucleic resid values: [ GDP (1), HOH (78), MG (1) ]

   Protein sequence:
      MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAG
      QEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDL
      AARTVESRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQHKL

+ attr: atom, helix, sheet, seqres, xyz,
        calpha, call


## Examine the storage format (or internal *str*ucture)
str(pqr)

List of 7
 $ atom  :'data.frame':	1447 obs. of  16 variables:
  ..$ type  : chr [1:1447] "ATOM" "ATOM" "ATOM" "ATOM" ...
  ..$ eleno : num [1:1447] 1 2 3 4 5 6 7 8 9 10 ...
  ..$ elety : chr [1:1447] "N" "CA" "C" "O" ...
  ..$ alt   : chr [1:1447] NA NA NA NA ...
  ..$ resid : chr [1:1447] "MET" "MET" "MET" "MET" ...
  ..$ chain : chr [1:1447] "A" "A" "A" "A" ...
  ..$ resno : num [1:1447] 1 1 1 1 1 1 1 1 2 2 ...
  ..$ insert: chr [1:1447] NA NA NA NA ...
  ..$ x     : num [1:1447] 64.1 64 63.7 64.4 65.4 ...
  ..$ y     : num [1:1447] 50.5 51.6 52.8 53.1 51.8 ...
  ..$ z     : num [1:1447] 32.5 33.4 32.7 31.7 34.2 ...
  ..$ o     : num [1:1447] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ b     : num [1:1447] 28.7 29.2 30.3 34.9 28.5 ...
  ..$ segid : chr [1:1447] NA NA NA NA ...
  ..$ elesy : chr [1:1447] NA NA NA NA ...
  ..$ charge: chr [1:1447] NA NA NA NA ...
 $ helix :List of 4
  ..$ start: num(0) 
  ..$ end  : num(0) 
  ..$ chain: chr(0) 
  ..$ type : chr(0) 
 $ sheet :List of 4
  ..$ start: num(0) 
  ..$ end  : num(0) 
  ..$ chain: chr(0) 
  ..$ sense: chr(0) 
 $ seqres: NULL
 $ xyz   : xyz [1, 1:4341] 64.1 50.5 32.5 64 51.6 ...
  ..- attr(*, "class")= chr [1:2] "xyz" "matrix"
 $ calpha: logi [1:1447] FALSE TRUE FALSE FALSE FALSE FALSE ...
 $ call  : language read.pqr(file = outfile)
 - attr(*, "class")= chr [1:2] "pdb" "sse"


## Print data for the first four atoms
pqr$atom[1:4,]

  type eleno elety  alt resid chain resno insert      x      y      z o     b
1 ATOM     1     N <NA>   MET     A     1   <NA> 64.080 50.529 32.509 1 28.66
2 ATOM     2    CA <NA>   MET     A     1   <NA> 64.044 51.615 33.423 1 29.19
3 ATOM     3     C <NA>   MET     A     1   <NA> 63.722 52.849 32.671 1 30.27
4 ATOM     4     O <NA>   MET     A     1   <NA> 64.359 53.119 31.662 1 34.93
  segid elesy charge
1  <NA>  <NA>   <NA>
2  <NA>  <NA>   <NA>
3  <NA>  <NA>   <NA>
4  <NA>  <NA>   <NA>


## Print some coordinate data
head(pqr$atom[, c("x","y","z")])

       x      y      z
1 64.080 50.529 32.509
2 64.044 51.615 33.423
3 63.722 52.849 32.671
4 64.359 53.119 31.662
5 65.373 51.805 34.158
6 65.122 52.780 35.269


## Print C-alpha coordinates (can also use 'atom.select' function)
head(pqr$atom[pqr$calpha, c("resid","elety","x","y","z")])

   resid elety      x      y      z
2    MET    CA 64.044 51.615 33.423
10   THR    CA 62.439 54.794 32.359
17   GLU    CA 63.968 58.232 32.801
26   TYR    CA 61.817 61.333 33.161
38   LYS    CA 63.343 64.814 33.163
47   LEU    CA 61.321 67.068 35.557

inds <- atom.select(pqr, elety="CA")
head( pqr$atom[inds$atom, ] )

   type eleno elety  alt resid chain resno insert      x      y      z o     b
2  ATOM     2    CA <NA>   MET     A     1   <NA> 64.044 51.615 33.423 1 29.19
10 ATOM    10    CA <NA>   THR     A     2   <NA> 62.439 54.794 32.359 1 28.10
17 ATOM    17    CA <NA>   GLU     A     3   <NA> 63.968 58.232 32.801 1 30.95
26 ATOM    26    CA <NA>   TYR     A     4   <NA> 61.817 61.333 33.161 1 23.42
38 ATOM    38    CA <NA>   LYS     A     5   <NA> 63.343 64.814 33.163 1 21.34
47 ATOM    47    CA <NA>   LEU     A     6   <NA> 61.321 67.068 35.557 1 18.99
   segid elesy charge
2   <NA>  <NA>   <NA>
10  <NA>  <NA>   <NA>
17  <NA>  <NA>   <NA>
26  <NA>  <NA>   <NA>
38  <NA>  <NA>   <NA>
47  <NA>  <NA>   <NA>


## The atom.select() function returns 'indices' (row numbers)
## that can be used for accessing subsets of PDB objects, e.g.
inds <- atom.select(pqr,"ligand")
pqr$atom[inds$atom,]

       type eleno elety  alt resid chain resno insert      x      y      z o
1341 HETATM  1342    MG <NA>    MG     A   273   <NA> 65.614 76.977 46.715 1
1342 HETATM  1343    PB <NA>   GDP     A   274   <NA> 62.667 77.781 47.505 1
1343 HETATM  1344   O1B <NA>   GDP     A   274   <NA> 61.587 77.413 46.626 1
1344 HETATM  1345   O2B <NA>   GDP     A   274   <NA> 63.294 79.098 47.336 1
1345 HETATM  1346   O3B <NA>   GDP     A   274   <NA> 63.804 76.731 47.410 1
1346 HETATM  1347   O3A <NA>   GDP     A   274   <NA> 62.281 77.644 49.012 1
1347 HETATM  1348    PA <NA>   GDP     A   274   <NA> 62.781 76.563 50.116 1
1348 HETATM  1349   O1A <NA>   GDP     A   274   <NA> 64.200 76.858 50.463 1
1349 HETATM  1350   O2A <NA>   GDP     A   274   <NA> 62.459 75.187 49.671 1
1350 HETATM  1351   O5' <NA>   GDP     A   274   <NA> 61.927 76.929 51.222 1
1351 HETATM  1352   C5' <NA>   GDP     A   274   <NA> 61.690 78.290 51.572 1
1352 HETATM  1353   C4' <NA>   GDP     A   274   <NA> 61.260 78.393 53.002 1
1353 HETATM  1354   O4' <NA>   GDP     A   274   <NA> 59.989 77.748 53.185 1
1354 HETATM  1355   C3' <NA>   GDP     A   274   <NA> 62.181 77.747 54.015 1
1355 HETATM  1356   O3' <NA>   GDP     A   274   <NA> 62.291 78.499 55.179 1
1356 HETATM  1357   C2' <NA>   GDP     A   274   <NA> 61.548 76.420 54.295 1
1357 HETATM  1358   O2' <NA>   GDP     A   274   <NA> 61.846 76.085 55.643 1
1358 HETATM  1359   C1' <NA>   GDP     A   274   <NA> 60.078 76.792 54.224 1
1359 HETATM  1360    N9 <NA>   GDP     A   274   <NA> 59.258 75.630 53.844 1
1360 HETATM  1361    C8 <NA>   GDP     A   274   <NA> 59.255 75.041 52.612 1
1361 HETATM  1362    N7 <NA>   GDP     A   274   <NA> 58.334 74.158 52.460 1
1362 HETATM  1363    C5 <NA>   GDP     A   274   <NA> 57.550 74.278 53.590 1
1363 HETATM  1364    C6 <NA>   GDP     A   274   <NA> 56.499 73.638 53.877 1
1364 HETATM  1365    O6 <NA>   GDP     A   274   <NA> 56.005 72.734 53.233 1
1365 HETATM  1366    N1 <NA>   GDP     A   274   <NA> 55.907 74.049 55.053 1
1366 HETATM  1367    C2 <NA>   GDP     A   274   <NA> 56.436 74.939 55.901 1
1367 HETATM  1368    N2 <NA>   GDP     A   274   <NA> 55.747 75.129 57.025 1
1368 HETATM  1369    N3 <NA>   GDP     A   274   <NA> 57.574 75.569 55.662 1
1369 HETATM  1370    C4 <NA>   GDP     A   274   <NA> 58.110 75.138 54.493 1
         b segid elesy charge
1341 20.79  <NA>  <NA>   <NA>
1342 31.08  <NA>  <NA>   <NA>
1343 30.69  <NA>  <NA>   <NA>
1344 24.13  <NA>  <NA>   <NA>
1345 29.87  <NA>  <NA>   <NA>
1346 29.39  <NA>  <NA>   <NA>
1347 32.94  <NA>  <NA>   <NA>
1348 38.15  <NA>  <NA>   <NA>
1349 39.73  <NA>  <NA>   <NA>
1350 37.11  <NA>  <NA>   <NA>
1351 37.93  <NA>  <NA>   <NA>
1352 36.58  <NA>  <NA>   <NA>
1353 40.35  <NA>  <NA>   <NA>
1354 36.25  <NA>  <NA>   <NA>
1355 38.09  <NA>  <NA>   <NA>
1356 39.40  <NA>  <NA>   <NA>
1357 43.75  <NA>  <NA>   <NA>
1358 40.06  <NA>  <NA>   <NA>
1359 39.43  <NA>  <NA>   <NA>
1360 38.59  <NA>  <NA>   <NA>
1361 38.35  <NA>  <NA>   <NA>
1362 35.85  <NA>  <NA>   <NA>
1363 37.59  <NA>  <NA>   <NA>
1364 39.03  <NA>  <NA>   <NA>
1365 38.56  <NA>  <NA>   <NA>
1366 36.65  <NA>  <NA>   <NA>
1367 34.76  <NA>  <NA>   <NA>
1368 37.24  <NA>  <NA>   <NA>
1369 37.60  <NA>  <NA>   <NA>

pqr$xyz[inds$xyz]

 [1] 65.614 76.977 46.715 62.667 77.781 47.505 61.587 77.413 46.626 63.294
[11] 79.098 47.336 63.804 76.731 47.410 62.281 77.644 49.012 62.781 76.563
[21] 50.116 64.200 76.858 50.463 62.459 75.187 49.671 61.927 76.929 51.222
[31] 61.690 78.290 51.572 61.260 78.393 53.002 59.989 77.748 53.185 62.181
[41] 77.747 54.015 62.291 78.499 55.179 61.548 76.420 54.295 61.846 76.085
[51] 55.643 60.078 76.792 54.224 59.258 75.630 53.844 59.255 75.041 52.612
[61] 58.334 74.158 52.460 57.550 74.278 53.590 56.499 73.638 53.877 56.005
[71] 72.734 53.233 55.907 74.049 55.053 56.436 74.939 55.901 55.747 75.129
[81] 57.025 57.574 75.569 55.662 58.110 75.138 54.493


## See the help page for atom.select() function for more details.

Author

Barry Grant

Read PQR File