Pdb2group takes a PDB file with CONECT records (mandatory) and generates a "group file" suitable for use with the commands addgrp, swapaa, and swapna. The group file is written to standard output.
Since pdb2group generates group files, it is not likely that the user will need to create one manually. However, some knowledge of the file contents is helpful for understanding the pdb2group command.
A group file consists of a series of text lines, and can be divided into four parts. The first part is the title and is a single-line description of the group. The second part is the list of atoms in the group and is a series of lines of the form
new atom_name x y z
Each atom_name should be in uppercase letters but does not have to be unique. The coordinates (x y z) specified in these lines are ignored in most cases, and may be set to (0, 0, 0) if they are not available. The third part of the group file is a separator and is a single line:
read internal coordinates for new group
The fourth and final part of the group file describes connectivity and internal coordinates, and is a series of lines of the form
mode atom1 atom2 atom3 atom4 bond_length bond_angle dihedral_angle
The mode is either + or = and is generally ignored. Each of the four atom fields may be either a number or n3, n2, or n1. If it is a number, it refers to an atom in the group. The atoms listed in the group are numbered sequentially, with the first atom being 0. If the atom field is n3, n2, or n1, it refers to an atom in the molecule to which the group will be attached. Atom 0 is always attached to atom n3. Bond_length is the distance between atom1 and atom2. Bond_angle is the angle formed by atom1, atom2, and atom3, with atom2 as the vertex. Dihedral_angle is the angle formed by all four atoms, with atom2 and atom3 as the internal vertices.
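The three internal-coordinate quantities defined above can be computed from Cartesian coordinates as in the following Python sketch. It is an illustration only, not the code pdb2group uses: the function names are hypothetical, angles are assumed to be in degrees, and the sign convention of the dihedral is an assumption that may differ from the one pdb2group applies.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def bond_length(p1, p2):
    """Distance between atom1 and atom2."""
    return _norm(_sub(p1, p2))

def bond_angle(p1, p2, p3):
    """Angle in degrees formed by atom1, atom2, atom3, with atom2 as the vertex."""
    u, v = _sub(p1, p2), _sub(p3, p2)
    cos_angle = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def dihedral_angle(p1, p2, p3, p4):
    """Dihedral in degrees about the atom2-atom3 axis (sign convention assumed)."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    b2_len = _norm(b2)
    axis = tuple(x / b2_len for x in b2)
    m1 = _cross(n1, axis)
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))
```

With atom2 and atom3 on the x axis, a planar zigzag (trans) arrangement gives a dihedral of magnitude 180 and a planar U-shaped (cis) arrangement gives 0.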
The following group file, lys.group, was generated as described in the EXAMPLE section:
lysine_sidechain
new CB 0.000 0.000 0.000
new CG 0.000 0.000 0.000
new CD 0.000 0.000 0.000
new CE 0.000 0.000 0.000
new NZ 0.000 0.000 0.000
read internal coordinates for new group
+ 1 0 n3 n2 1.53 111.0 -68.9
+ 2 1 0 n3 1.53 111.0 -178.4
+ 3 2 1 0 1.53 111.0 180.0
+ 4 3 2 1 1.48 110.5 180.0
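To make the four-part layout concrete, the following Python sketch reads a group file into its title, atom list, and internal-coordinate lines. It is a minimal reader written from the format description above, not part of pdb2group; the names parse_group and InternalCoord are hypothetical, and it assumes one record per line with whitespace-separated fields.

```python
from typing import NamedTuple

SEPARATOR = "read internal coordinates for new group"

class InternalCoord(NamedTuple):
    mode: str              # "+" or "="
    atoms: tuple           # four fields: int index into the group, or "n1"/"n2"/"n3"
    bond_length: float
    bond_angle: float
    dihedral_angle: float

def _atom_field(field):
    # n1, n2, n3 name atoms outside the group; anything else is a group index.
    return field if field in ("n1", "n2", "n3") else int(field)

def parse_group(text):
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    title = lines[0]                       # part 1: title
    sep = lines.index(SEPARATOR)           # part 3: separator line
    atoms = []
    for ln in lines[1:sep]:                # part 2: "new atom_name x y z"
        keyword, name, x, y, z = ln.split()
        if keyword != "new":
            raise ValueError(f"unexpected atom line: {ln!r}")
        atoms.append((name, float(x), float(y), float(z)))
    coords = []
    for ln in lines[sep + 1:]:             # part 4: internal coordinates
        f = ln.split()
        coords.append(InternalCoord(
            f[0], tuple(_atom_field(a) for a in f[1:5]),
            float(f[5]), float(f[6]), float(f[7])))
    return title, atoms, coords

LYS_GROUP = """\
lysine_sidechain
new CB 0.000 0.000 0.000
new CG 0.000 0.000 0.000
new CD 0.000 0.000 0.000
new CE 0.000 0.000 0.000
new NZ 0.000 0.000 0.000
read internal coordinates for new group
+ 1 0 n3 n2 1.53 111.0 -68.9
+ 2 1 0 n3 1.53 111.0 -178.4
+ 3 2 1 0 1.53 111.0 180.0
+ 4 3 2 1 1.48 110.5 180.0
"""

title, atoms, coords = parse_group(LYS_GROUP)
```

Here atom index 0 is CB and index 4 is NZ, so the last line places NZ 1.48 from CE.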
The required command-line arguments to pdb2group are the names of anchor_atom, n3_atom, and n2_atom. The anchor_atom is the first atom to be listed in the output group file. N3_atom and n2_atom refer to atoms in PDB_file and are used to compute bond and dihedral angles between the group and the rest of the molecule; they will not appear in the group file. Atoms in PDB_file which are connected to anchor_atom only through n3_atom will not appear in the group file. PDB_file must contain only one residue.
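The rule that atoms reachable from anchor_atom only through n3_atom are left out can be pictured as a graph walk over the CONECT bonds that starts at the anchor and refuses to cross n3_atom. The Python sketch below illustrates that idea; it is an assumption about the selection logic, not the actual pdb2group implementation, and the function name, the bond list, and the breadth-first output order are all hypothetical.

```python
from collections import deque

def group_atoms(bonds, anchor, n3):
    """Return atoms reachable from anchor without passing through n3."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    seen = {anchor, n3}        # marking n3 as seen blocks the walk there
    order = [anchor]           # the anchor is listed first
    queue = deque([anchor])
    while queue:
        current = queue.popleft()
        for nxt in neighbors.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

# Bonds of a single lysine residue, as CONECT records would describe them.
LYS_BONDS = [("N", "CA"), ("CA", "C"), ("C", "O"), ("CA", "CB"),
             ("CB", "CG"), ("CG", "CD"), ("CD", "CE"), ("CE", "NZ")]

sidechain = group_atoms(LYS_BONDS, anchor="CB", n3="CA")
```

With CB as the anchor and CA as n3_atom, the backbone atoms N, CA, C, and O are never reached, so only the sidechain atoms remain.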
If given, group_description is used as the group file title; otherwise, PDB_file is the title unless the PDB file is read from standard input, in which case the title is -.
Although lysine is provided as a standard group, the creation of a group file for a lysine sidechain is described as an example.
Creating the PDB input file by hand is a tedious procedure, but may be necessary if you have no PDB file containing the desired group and no convenient way to generate one. Remember that it is necessary to include CONECT records for all atoms in the file.
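If the file does have to be assembled by hand, a small script can at least take care of the column alignment of the CONECT records. The Python sketch below follows the standard PDB layout (record name in columns 1-6, the atom serial number in columns 7-11, then up to four bonded serial numbers in 5-column fields); the function name and the serial numbers in the usage line are made up for illustration.

```python
def format_conect(serial, bonded):
    """Format CONECT records for one atom, at most four bonded atoms per record."""
    records = []
    for start in range(0, len(bonded), 4):
        chunk = bonded[start:start + 4]
        records.append("CONECT" + f"{serial:5d}" + "".join(f"{b:5d}" for b in chunk))
    return records

# Hypothetical numbering: atom 2 (say CA) bonded to atoms 1, 3, and 5.
ca_records = format_conect(2, [1, 3, 5])
```

An atom with more than four bonds is written as several CONECT records that repeat the same leading serial number.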
A much more pleasant alternative, if you have a PDB file that includes the group of interest, is to use pdbrun to make the PDB input file. In Chimera, limit the display to only one lysine residue, and then run the command:
pdbrun conect nouser cat > lys.pdb
The anchor_atom specified in the command line should be the first added atom of the new group when the group is added to an existing structure. N3_atom and n2_atom are atoms that are not in the new group, but that the new group connects to. N3_atom connects directly to the anchor_atom, and n2_atom connects to n3_atom.
In the lysine example, an appropriate pdb2group command would be:
pdb2group -a CB -3 CA -2 C -d lysine_sidechain lys.pdb > lys.group
In this case, although the entire residue (including the backbone atoms N, CA, C, O) is present in lys.pdb, only the sidechain (from CB on out) is placed in the output group file.