Tutorial

This tutorial will walk you through using ELECTRIC to analyze the electric field in a small protein. This tutorial assumes you have ELECTRIC and MDI-enabled Tinker installed. If you don’t, navigate to the installation instructions.

Note

This tutorial will assume the following:
  • You are able to run a molecular dynamics simulation using the Tinker software, or are familiar enough with molecular dynamics to follow along.

  • You have installed ELECTRIC and MDI-enabled Tinker. If you have not, see the installation instructions.

  • You are able to download or clone a directory from git.

  • You are familiar with bash scripts.

To begin the tutorial, first make sure you have your electric environment activated:

conda activate electric

The pdb code for this protein is 1l2y, and you can see the structure below. We have chosen a small protein for demonstrative purposes. The image below shows only the protein, but our simulation is solvated.

Running an ELECTRIC calculation

Preparing Files

To follow along with this tutorial, clone the tutorial repository. Included in this repository is folder called data. The data directory has all of the data you will need for this tutorial.

ELECTRIC is a post-processing analysis, meaning that you should first run your simulations using the AMOEBA forcefield and save the trajectory. After you have a trajectory from Tinker simulation, use ELECTRIC to perform electric field analysis on that trajectory. We will be analyzing the trajectory included in the tutorial repository, 1l2y_npt.arc. This trajectory is a text file containing coordinates for a simulation that has already been run.

To get started with ELECTRIC, we will need to prepare our input files. We will need:
  • a molecular dynamics trajectory

  • a Tinker input file (usually called a key file) which does not have settings for periodic boundaries or Ewald summation

  • the forcefield parameter file

  • a bash script file

We already have the molecular dynamics trajectory, so let’s look at each of these additional files.

Simulation Parameter File

This file contains the force field parameters for the simulation you have run. If you are using this software for analysis, use the same force field you used to run the molecular dynamics simulation. For this tutorial, the parameter file is amoebabio18.prm. We will need this for our Tinker input file (next section).

Tinker input file

Next, we must prepare an input file which tells Tinker settings for our calculation. This input file should be a modified version of the one which you used to run your initial simulation. Consider the input file, tinker.key used to obtain this trajectory. The parameter file in the previous step is given on line 1.

 1parameters amoebabio18.prm
 2openmp-threads 16
 3
 4a-axis 50.00
 5b-axis 50.00
 6c-axis 50.00
 7
 8polar-eps 0.000010
 9polar-predict
10polarization mutual
11
12
13cutoff 10.0
14ewald
15neighbor-list
16integrator beeman
17
18thermostat nose-hoover
19barostat nose-hoover
20
21maxiter 8000
22printout 1000

The input file used for this simulation uses periodic boundaries and an Ewald summation for electrostatics. During a Tinker simulation using AMOEBA, electric fields are evaluated in order to calculate the induced dipoles at each step. In order to get electric field contributions from specific residues, we must calculate the electric field using the real space interactions only (no periodic boundaries or Ewald).

Remove settings related to cutoffs (cutoff keyword), periodic boundaries (a-axis, b-axis,:code:c-axis) and Ewald summation (ewald). You can also remove settings having to do with neighbor lists (neighbor-list), as they are not needed and can cause an error for this calculation if included.

The modifed input file for ELECTRIC is given below. This file is saved in the data directory with the name noewald.key.

parameters amoebabio18.prm
openmp-threads 16

polar-eps 0.000010
polar-predict
polarization mutual

integrator beeman

thermostat nose-hoover
barostat nose-hoover

maxiter 8000
printout 100

Bash script - run_analysis.sh

When you run analysis uisng ELECTRIC, ELECTRIC parses your given trajectory sends snapshots to Tinker for electric field calculation. The MDI-enabled version of Tinker then calculates the electric field information for that snapshot.

You use ELECTRIC from the command line. Consider the following bash script provided for analysis, run_analysis.sh. We will explain this script in detail.

 1#location of required codes
 2DRIVER_LOC=LOCATION/TO/ELECTRIC/ELECTRIC.py
 3TINKER_LOC=LOCATION/TO/DYNAMIC/dynamic.x
 4
 5#remove old files
 6if [ -d work ]; then
 7rm -r work
 8fi
 9
10#create work directory
11cp -r data work
12cd work
13
14#set the number of threads
15export OMP_NUM_THREADS=2
16
17#launch MDI enabled Tinker
18${TINKER_LOC} 1l2y -k no_ewald.key -mdi "-role ENGINE -name NO_EWALD -method TCP -port 8022 -hostname localhost"  10 1.0 0.002 2 300.00 > no_ewald.log &
19
20#launch driver
21python ${DRIVER_LOC} -snap 1l2y_npt.arc -probes "93 94" -mdi "-role DRIVER -name driver -method TCP -port 8022" --byres 1l2y_solvated.pdb  --equil 120 --stride 2 &
22
23wait

Note

For this tutorial, we use the approach of having all data needed for analysis in a directory called data. During analysis, we copy everything from data into a folder work. This part of the tutorial is stylistic. The authors prefer this method to keep files separated, and original files unaltered.

In lines 2 and 3, you should change the location to your installed ELECTRIC.py file and MDI-enabled dynamic.x. Recall from the installation instructions that you can find these in the ELECTRIC directory in the files ELECTRIC/test/locations/ELECTRIC and ELECTRIC/test/locations/Tinker_ELECTRIC.

The next section removes the folder called work if it exists. This bash script is written to put all analysis files into a folder called work to keep our original files clean.

MDI-enabled Tinker is launched on line 18 with the command

${TINKER_LOC} 1l2y -k no_ewald.key -mdi "-role ENGINE -name NO_EWALD -method TCP -port 8022 -hostname localhost"  10 1.0 0.002 2 300.00 > no_ewald.log &

The first thing on this line, ${TINKER_LOC} fills in the location for dynamic.x which you put in line 2. Next, 1l2y is the file name (without an extension) of the xyz file for this calculation (provided vile 12ly.xyz). You should have this from your original simulation. However, make sure that there is no box information on line two of this xyz file, as this could cause Tinker to use periodic boundaries. Next, we give the input file (key file) we have prepared in the previous step using -k noewald.key. Then, we give our MDI options. The given options should work for most analysis. After the MDI options are some Tinker input options. For our analysis, it will not really matter what we put here since we are running calculations on one snapshot at a time. However, you must have these present for Tinker to run. Very importantly, note the ampersand (&) at the end of this line. This will launch Tinker in the background, where it will be waiting for commands from ELECTRIC.

Warning

Make sure that there is no box information on line two of the xyz file used to launch MDI-enabled Tinker. This could cause Tinker to use periodic boundaries.

In the next command (line 21), we launch ELECTRIC.

python ${DRIVER_LOC} -snap 1l2y_npt.arc -probes "78 93 94"  -mdi "-role DRIVER -name driver -method TCP -port 8022" --byres 1l2y_solvated.pdb  --equil 120 --stride 2 &

Here, we first give the location of our ELECTRIC driver. We indicate our trajectory file using the -snap argument with the filename to analyze, followed by MDI options.

Probe Atoms

To run an ELECTRIC calculation, you must give the indices of your probe atoms. The probe atoms are the atoms which are used as ‘probes’ for the electric field. ELECTRIC reports the projected total electric field at the midpoint between all probe atom pairs. This allows you to calculate electric fields along bonds as reported in literature.

You should obtain the number of the probe atoms from the xyz file you use to launch MDI-enabled Tinker. Note that the index you use here should match the number given in the first column of your xyz file. The projection of the electric field at the midpoint of these two atoms will be reported for each analyzed frame. If you indicate more than two probes, all pairwise fields will be reported (ie, if using “78 93 94”, you will get “78 and 93”, “78 and 94” and “93 and 94”). You can see the atoms we have chosen as probes highlighted below:

The argument –byres gives information to ELECTRIC about how we would like the electric field reported. When we use the --byres argument, it should be followed by a pdb which contains residue information for the system you are studying. When using this argument, electric field contributions from each residue will be reported. Other options are --byatom top report electric field contributions from each atom, and --bymol to report electric field contributions from each molecule.

When using --byres, solvent should be at the end of the pdb and xyz files. Solvent (ions and water) will be grouped together into a single residue.

Warning

When using the byres option, you should verify that the residues in your pdb file match what you expect for your xyz file. You can do this with the utility function residue_report.py. ELECTRIC will check that the xyz and pdb have the same number of atoms. However, all residue information will come from the PDB, so make sure the residue information in your provided PDB is as you expect.

Note

The utility script residue_report.py is provided in the same directory as ELECTRIC.py. To use it,

python residue_report.py PDB_FILENAME

This will output a report which gives the residue number, the atom index on which the residue starts and the residue name. When using --byres, you should first verify that your pdb file has residues defined as you want and matches your xyz file and trajectory. ELECTRIC only checks that the pdb and xyz file have the same number of atoms, it does not check atom identity or order. For this tutorial, our output is

Found 12199 atoms and 21 residues.
Residue Number       Starting atom        Residue Name
        1                    1                   ASN
        2                    17                  LEU
        3                    36                  TYR
        4                    57                  ILE
        5                    76                  GLN
        6                    93                  TRP
        7                   117                  LEU
        8                   136                  LYS
        9                   158                  ASP
        10                  170                  GLY
        11                  177                  GLY
        12                  184                  PRO
        13                  198                  SER
        14                  209                  SER
        15                  220                  GLY
        16                  227                  ARG
        17                  251                  PRO
        18                  265                  PRO
        19                  279                  PRO
        20                  293                  SER
        21                  305                solvent

Finally, we give arguments which gives information about the frame we want to analyze. Using –equil 120 tells ELECTRIC to skip the first 120 frames for analysis, and --stride 2 tells ELECTRIC to analyze every other frame after 120.

Running the calculation

After you have prepared your files, you can run analysis using the command

./run_analysis.sh > analysis.out &

This will launch ELECTRIC. Again, using the ampersand & will run this in the background. Now, you just have to wait for your analysis to finish running.

Analyzing Results from ELECTRIC

ELECTRIC will output a csv file with the electric field information proj_totfield.csv in the work folder. Below, we show results (numbers rounded for clarity) for probes 78 and 93 from proj_totfield.csv. When these numbers are reported, they are the electric field in Mv/cm projected along the vector pointing from atom 1 to atom 2 due to each residue.

78 and 93 - frame 122 78 and 93 - frame 124 78 and 93 - frame 126 78 and 93 - frame 128 78 and 93 - frame 130
residue 1 -2.395 -4.253 -2.878 -3.259 -4.556
residue 2 19.443 14.662 25.377 24.813 23.909
residue 3 9.765 18.434 16.398 19.663 10.279
residue 4 -8.864 0.499 -3.138 -5.830 -4.440
residue 5 9.338 1.112 -0.795 -1.922 -1.382
residue 6 -10.355 -9.994 -11.356 -15.802 -7.026
residue 7 1.301 -1.442 -2.552 -3.883 -4.978
residue 8 42.529 33.426 39.153 34.288 41.881
residue 9 -33.997 -21.626 -16.939 -29.022 -23.161
residue 10 0.040 0.292 -0.031 -0.242 0.084
residue 11 -2.078 -1.582 -2.987 -2.321 -2.687
residue 12 -0.029 -0.007 0.074 -0.167 0.032
residue 13 -0.103 0.085 0.269 0.445 0.053
residue 14 -1.414 -1.112 -2.235 -0.719 -0.887
residue 15 -0.882 -0.799 -0.644 -0.667 -0.819
residue 16 -1.735 -1.047 -3.305 -0.146 0.879
residue 17 2.504 2.084 2.682 2.390 3.073
residue 18 -1.195 -1.988 -1.342 -1.743 -1.556
residue 19 -1.848 -3.031 -1.783 -2.240 -2.890
residue 20 7.817 8.155 7.651 7.944 8.955
residue 21 12.086 4.212 -6.339 1.039 -4.052

You are free to analyze this as you like, but we recommend using pandas to process the csv file. A script to perform averaging of probe pairs across frames is provided in ELECTRIC/sample_analysis/calculate_average.py. For example, you can run this script

python PATH/TO/calculate_average.py -filename work/proj_totfield.csv

This will output a file with the average projected field for each residue pair. In our case, three files should be output: 78 _and_93.csv, 78_and_94.csv, and 93_and_94.csv. The output for the 78_and_93.csv is shown in the table below:

residue 78 and 93 standard deviation
1.000 -3.468 0.914
2.000 21.641 4.547
3.000 14.908 4.614
4.000 -4.354 3.448
5.000 1.270 4.653
6.000 -10.907 3.176
7.000 -2.311 2.421
8.000 38.255 4.221
9.000 -24.949 6.649
10.000 0.029 0.193
11.000 -2.331 0.544
12.000 -0.019 0.091
13.000 0.150 0.212
14.000 -1.273 0.597
15.000 -0.762 0.103
16.000 -1.071 1.588
17.000 2.546 0.366
18.000 -1.565 0.315
19.000 -2.359 0.579
20.000 8.104 0.510
21.000 1.389 7.277