Tag Archives: comet

Helpful CPPTRAJ Commands Part 2: Action Commands

The next three commands that I found to be helpful are known as Action Commands. Unlike the Topology Commands from the previous post, these Action Commands allow for a file to be created that contains the output. These commands can be used both in CPPTRAJ’s interactive mode, and in .ptraj files that can be called as inputs when starting CPPTRAJ.


distance [<name>] mask1 mask2 [out <filename>]

This command outputs a file that gives the distance from the center of mass of mask1 to the center of mass of mask2 at each time frame (mask1 and mask2 can be either residues or atoms). The output file has two columns: one with the time frame number, and the other with the distance between the specified masks in Angstroms. The following visual shows a sample output of the command being called on the CA and N atoms (the alpha carbon and the backbone nitrogen, respectively).

The next visual is a graph of the first 500 distance values in Angstroms in a 5000 time frame production run.
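As a rough sketch of how the two-column output described above could be summarized without plotting it, the following uses awk on a small stand-in file. The file name distance_demo.dat, the header line, and all of the numbers are made up for illustration; they are not results from the Melittin run.

```shell
#!/bin/bash
# Placeholder data in the two-column layout described above
# (frame number, distance in Angstroms). These values are invented
# for illustration and are NOT simulation results.
cat > distance_demo.dat <<'EOF'
#Frame   sample_name1
1   1.40
2   1.50
3   1.60
4   1.50
EOF

# Skip comment lines, then report the frame count, mean, minimum,
# and maximum of the second column.
awk '!/^#/ && NF >= 2 {
         n++; sum += $2
         if (n == 1 || $2 < min) min = $2
         if (n == 1 || $2 > max) max = $2
     }
     END {
         if (n > 0) printf "frames=%d mean=%.2f min=%.2f max=%.2f\n", n, sum / n, min, max
     }' distance_demo.dat > distance_summary.txt

cat distance_summary.txt
```

To look at only the first 500 frames of a longer file, the same awk command could be fed from head -n 501 (one header line plus 500 data lines), assuming the file starts with a single header line.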

This action command can utilize other key modifiers to get slightly different outputs; however, the ones listed above are the main modifiers that are necessary to get decent data. When running CPPTRAJ on my production data on Melittin, the modifiers listed above were the only ones that I used.

Example Usage: distance sample_name1 :19@CA :19@N out sample_data1.dat 
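Since these commands can also be placed in a .ptraj input file, here is a minimal sketch of what that might look like for the distance example. The topology and trajectory names (melittin.prmtop and melittin_prod.mdcrd) are placeholders that I am assuming for illustration, so substitute your own files.

```shell
#!/bin/bash
# Write a small CPPTRAJ input file. The topology and trajectory
# names are hypothetical placeholders; use your own files.
cat > distance.ptraj <<'EOF'
parm melittin.prmtop
trajin melittin_prod.mdcrd
distance sample_name1 :19@CA :19@N out sample_data1.dat
run
EOF

# Run CPPTRAJ only if it is installed and the placeholder inputs exist.
if command -v cpptraj >/dev/null 2>&1 && [ -f melittin.prmtop ] && [ -f melittin_prod.mdcrd ]; then
    cpptraj -i distance.ptraj
else
    echo "cpptraj or input files not found; wrote distance.ptraj only"
fi
```

Keeping the commands in a file like this makes it easy to rerun the same analysis on another trajectory by changing only the parm and trajin lines.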


hbond [<name>] [out <filename>] [<mask>] [donormask <dmask> [donorhmask <dhmask>]] [solventdonor <sdmask>] [solventacceptor <samask>] [solvout <filename>] [series [uvseries <filename>]]

This action command outputs a file that shows where hydrogen bonding occurred in the molecule (or in a specified residue or atom). Some of these modifiers are not intuitively obvious and did not appear in the previous action command, so below is a list of the new modifiers and what they do.

  • [donormask <dmask> [donorhmask <dhmask>]] sets the mask of atoms used as solute donor heavy atoms and, optionally, the mask of atoms used as solute donor hydrogen atoms. The second mask should only be given if the first one is, and the two masks should have a 1 to 1 correspondence. (In my case I used the related solventdonor and solventacceptor keywords instead, as in the example usage below, with :WAT and :WAT@O, which select the water box my Melittin was being simulated in.)
  • [solvout <filename>] names an output file containing the averaged solute-solvent hydrogen bond information. For each acceptor and donor pair (both of which are shown in the output file), it lists the average distance and average angle of the hydrogen bond, along with the number of frames in which the bond was present and the corresponding fraction of all frames.
  • [series [uvseries <filename>]] names an output file containing the solute-solvent hydrogen bond time series: for each frame, a 1 means the hydrogen bond was formed and a 0 means it was not.

Normally, I would include sample data; however, the outputted data yielded very odd results that may or may not be completely useless, so I would much rather not give false data until I’m able to completely figure out how this command works. There are many other modifiers for this action command, but they turned out to be useless in my case.

Example Usage: hbond sample_name2 out sample_data2.dat solventdonor :WAT solventacceptor :WAT@O solvout sample_data3.dat series uvseries sample_data4.dat
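One way to sanity-check a 0/1 time series like the one described for uvseries is to compute, for each hydrogen bond, the fraction of frames in which it was present. The sketch below assumes a layout with the frame number in the first column and one 0/1 column per hydrogen bond; that layout, the column names, and the values are my assumptions for illustration, not output from a real run.

```shell
#!/bin/bash
# Placeholder series file: frame number, then one 0/1 column per
# hydrogen bond. Layout, names, and values are assumed for
# illustration only.
cat > uvseries_demo.dat <<'EOF'
#Frame  hbond_A  hbond_B
1  1  0
2  1  0
3  0  1
4  1  0
EOF

# Remember the column names from the header, sum each 0/1 column,
# and print the fraction of frames in which each bond was formed.
awk '/^#/ { for (i = 2; i <= NF; i++) name[i] = $i; next }
     { n++; cols = NF; for (i = 2; i <= NF; i++) on[i] += $i }
     END {
         if (n > 0) for (i = 2; i <= cols; i++) printf "%s %.2f\n", name[i], on[i] / n
     }' uvseries_demo.dat > hbond_fraction.txt

cat hbond_fraction.txt
```

If the fractions computed this way disagree with the averaged output file, that is a hint that the masks are selecting something other than what was intended.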


rms/rmsd [<name>] <mask> [out <filename>]

This action command outputs a file that contains the RMS Deviation values of a specified <mask> at each time frame. The following visual shows a small part of a sample output of the RMS Deviation values of the backbone of Tryptophan in Melittin.

The next visual is a histogram of the 5000 RMS Deviation values that were outputted from using this command on the backbone of the Tryptophan in Melittin. The y-axis represents the fraction of RMS Deviation values that made up the data, while the x-axis represents the RMS Deviation values themselves.

Example Usage: rms sample_rms :19@C,CA,N out sample_rms.dat
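The histogram described above (fraction of frames on the y-axis, RMS Deviation on the x-axis) can be approximated directly from the output file. This is only a sketch: the file rms_demo.dat, its header, its four values, and the 0.1 Angstrom bin width are all placeholders chosen for illustration.

```shell
#!/bin/bash
# Placeholder RMSD output (frame number, RMSD in Angstroms).
# These values are invented for illustration only.
cat > rms_demo.dat <<'EOF'
#Frame   sample_rms
1   0.12
2   0.18
3   0.25
4   0.31
EOF

# Assign each value to a bin of the given width, then print the lower
# edge of each bin and the fraction of frames that fell into it.
awk -v width=0.1 '!/^#/ && NF >= 2 { count[int($2 / width)]++; n++ }
     END { for (b in count) printf "%.2f %.2f\n", b * width, count[b] / n }' rms_demo.dat |
    sort -n > rms_hist.txt

cat rms_hist.txt
```

Values that sit exactly on a bin edge can land on either side because of floating-point rounding, so treat the bin boundaries as approximate.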


Filed under chemistry, computing

Helpful CPPTRAJ Commands Part 1: Topology Commands

CPPTRAJ has a variety of commands for analyzing MD simulation outputs, but because there are so many, it would be very difficult to describe them all in detail in a single blog post. As such, this first post describes three Topology Commands that I found useful for my research on the Tryptophan residue in Melittin. These Topology Commands print out molecular topology information in CPPTRAJ’s interactive mode. All of the visuals in this post come from Melittin.

Note: [<mask>] indicates a single residue/atom or range of residues/atoms


atominfo [<mask>]

This command prints out general information on each atom specified by the given [<mask>] modifier. The information is outputted into 12 columns. The following visual is an example output of the command being called on the 19th residue of a Melittin .mdcrd file.

  • #Atom refers to the atom’s index as given by Amber16
  • Name refers to the atom’s name identifier as given by Amber16
  • #Res refers to the residue number in which the atom is located
  • 2nd Name refers to the shorthand name of the residue in which the atom is located
  • #Mol refers to the atom’s molecule number
  • Type refers to the force-field atom type assigned to the atom (e.g. CT, N, H)
  • Charge refers to the partial charge of the atom, in units of the electron charge
  • Mass refers to the mass of the atom
  • GBradius refers to the generalized Born radius of the atom
  • El refers to the element symbol
  • rVDW refers to the van der Waals radius of the atom
  • eVDW refers to the van der Waals well depth (epsilon) of the atom

Example Usage: atominfo :19


bondinfo [<mask>]

This command prints out general information in the form of 6 columns about each bond between each atom as specified by the [<mask>] modifier. The following visual is an example output of the command being called on a specific carbon atom in the 19th residue of a Melittin .mdcrd file.

  • Bond refers to the bond index as specified by Amber16
  • Kb refers to the bond force constant
  • Req refers to the bond equilibrium value in Angstroms
  • atom names refers to the names of the bonded atoms as specified by Amber16
  • (numbers) refers to the atom indexes as specified by Amber16 as well as the types of atoms that are bonded together

Example Usage: bondinfo :19@CA


resinfo [<mask>]

This command prints out general information in 7 columns about a single residue or range of residues as specified by the given [<mask>] modifier. The following visual is an example output of this command being called on the range of residues that make up Melittin. I had to specify the residue range because the .mdcrd file that I used had Melittin simulated in a water box. If the command is called without the [<mask>] modifier, the output will include the residue info for all the water molecules in the water box as well. The same goes for the atominfo and bondinfo commands.

  • #Res refers to the residue index as specified by Amber16
  • Name refers to the shorthand name for each residue
  • First refers to the atom index of the first atom in the residue
  • Last refers to the atom index of the last atom in the residue
  • Natom refers to the number of atoms in the residue
  • #Orig refers to the residue number in the original PDB file from which the data come
  • #Mol refers to the molecule number

One can also add the modifier, [short], to display the residues in the FASTA code sequence form. For Melittin, the sequence would look like this: GIGAVLKVLTTGLPALISWIKRKRQQ.

Example Usage: resinfo :1-26
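As a quick consistency check on the masks used in these posts, the one-letter sequence above can be inspected in the shell to confirm that it has 26 residues (matching :1-26) and that residue 19 is W, the one-letter code for Tryptophan (matching :19). The variable names here are only for illustration.

```shell
#!/bin/bash
# Melittin sequence in one-letter code, as printed by resinfo with [short].
sequence="GIGAVLKVLTTGLPALISWIKRKRQQ"

# Number of residues in the sequence.
length=${#sequence}

# One-letter code of the 19th residue.
res19=$(printf '%s' "$sequence" | cut -c19)

echo "length: $length"       # prints "length: 26"
echo "residue 19: $res19"    # prints "residue 19: W"
```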



Filed under chemistry, computing

Using Vi/Vim

Vi and Vim are essentially the same editor: Vim (“Vi IMproved”) is an extended version of Vi. It is the text editor I use inside PuTTY when working on comet.

Vim is useful when running molecular dynamics simulations because it lets you quickly edit input files and check results in output files without leaving the terminal. But Vim isn’t your average text editor. It runs entirely in the terminal and is controlled from the keyboard, which makes editing over a remote connection like PuTTY fast once you know the commands.

Some of the most helpful commands in Vim are :wq, i, u, G, gg, and :help cmd:

  • :wq saves your edits and quits Vim
  • i enters insert mode, which lets you type text into the file (press Esc to leave insert mode)
  • u undoes the previous action
  • G moves to the end of the file
  • gg moves to the beginning of the file
  • :help cmd opens the help page for cmd, where cmd can be any command you need help with

One thing about Vim that is especially helpful is vimtutor. Vimtutor is a multi-lesson tutorial that teaches you how to use Vim and introduces its most important commands. Each lesson has several parts. If you go through the entire tutorial, you will be able to use Vim effectively. There is a learning curve, though. It takes a while to get used to, but after some practice with Vim, you’ll be a pro.


Above is the beginning of vimtutor.




Filed under computing

Accessing and Using Comet

Comet is the supercomputer at the San Diego Supercomputer Center that I’m using to run molecular dynamics simulations. To use it from your local computer, you need an ssh client and an scp client. Since I am working on a Windows computer, the ssh client I used was PuTTY and the scp client I used was WinSCP. There are different programs for other operating systems. All you have to do is sign into your comet account in PuTTY and WinSCP, and you have access to the files in your comet account and to the supercomputer.

The ssh client, PuTTY, is the program that lets you communicate directly with the supercomputer. PuTTY on your local computer connects to the remote computer, comet. When you are typing and navigating in PuTTY, you are typing and navigating on comet. The molecular dynamics simulations are started from PuTTY, but they actually run on comet.

The scp client, WinSCP, is the program that lets you transfer files between comet and your local computer. You can easily move files back and forth between the local and remote computers in a short period of time.

To learn how to use comet, I had to learn how to use PuTTY and WinSCP. I also needed to learn basic Linux commands, because in PuTTY you use Linux commands to control comet. Commands like mkdir, cd, cp, mv, and ls are essential and used a lot. mkdir creates a directory, cd changes the current directory, cp copies files, mv moves (or renames) files, and ls lists the files in the current directory. I think the nohup command, with an & at the end of the line, has been the most useful so far: the & runs the job in the background, and nohup keeps it running even after you log out. This lets you start other jobs or do other things without stopping the first one. WinSCP can be used like any other file manager program, with no Linux commands needed.

An example of the nohup command to run a trajectory:
nohup mpirun -np 4 sander.MPI -O -i mdin -p 3beq.top -c 3beq.crd -r 3beq_md.crd -x md.trj -o md.out &
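The basic commands mentioned above can be tried safely in a scratch directory before using them on real simulation files. The following is a small sketch; the directory name comet_demo and the file contents are placeholders, and a harmless echo stands in for a real sander job in the nohup line.

```shell
#!/bin/bash
# Create a scratch directory and move into it.
mkdir -p comet_demo
cd comet_demo || exit 1

# mkdir creates a directory; cp copies a file; mv renames (moves) it.
mkdir -p run1
echo "placeholder input" > mdin
cp mdin run1/mdin
mv mdin mdin.bak

# ls lists the current directory, or another one if you name it.
ls
ls run1

# nohup ... & starts a job in the background that keeps running after
# logout. A trivial command stands in for a real simulation here.
nohup sh -c 'echo finished' > nohup_demo.out 2>&1 &
wait    # only so this demo does not exit before the background job ends

cat nohup_demo.out
cd ..
```

In real use you would leave out the wait line, log out if you like, and come back later to check the output file.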

Comet gives me access to AMBER, which lets me use tleap, sander, and cpptraj. Those programs are essential to running molecular dynamics.

PuTTY and WinSCP

PuTTY (connected to comet) is the black screen on the left and WinSCP is the screen on the right.


Filed under computing

Writing scripts to submit jobs on comet using the slurm queue manager

Comet is a huge cluster of thousands of computing nodes, and the queue manager software called “slurm” is what handles all the requests, directs each job to a specific node (or nodes), and then lets you know when it’s done. In a prior post I showed the basic slurm commands to submit a job and check the queue.

You also need to write a special Linux bash script that contains a bunch of slurm configurations, and also the Linux commands to actually run your calculation.  This is easiest shown by example, and I’ll show two:  one for a Gaussian job, and another for an AMBER job.

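#!/bin/bash
#SBATCH -t 10:00:00
#SBATCH --job-name="gaussian"
#SBATCH --output="gaussian.%j.%N.out"
#SBATCH --partition=batch
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --export=ALL

jobfile=YOURINPUTFILE    # stem of your input file name; the .gjf extension is added below
inputfile=$jobfile.gjf
outputfile=$jobfile.out  # assumed output name; change it if you prefer another
nprocshared=1            # assumed value; keep it equal to --ntasks-per-node above

. /etc/profile.d/modules.sh   # makes the "module" command available
module load gaussian
exe=$(which g09)
export OMP_NUM_THREADS=$nprocshared
/usr/bin/time "$exe" < "$inputfile" > "$outputfile"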

If you copy this file exactly and then modify just the important parts, you can submit your own Gaussian jobs quickly.  But it’s worth knowing what these commands do.  All the SBATCH lines are slurm settings.  Most important is the line with the “t” flag, which sets the wall time you think the job will require.  (“Wall time” means the same thing as “real time”, as if you were watching the clock on your wall.)  If your job winds up going longer than your set wall time, slurm will automatically cancel it even if it didn’t finish, so don’t underestimate.  The other important slurm flags are “nodes” and “ntasks-per-node”, which set how many nodes you want the job to use and how many processors (cores) per node you want it to use, respectively.  For Gaussian jobs, you always want to use nodes=1.  You can start with ntasks-per-node=1, and then try increasing it up to 12 (the max for comet) to see if your calculation can take advantage of any of Gaussian’s parallel processing algorithms.  (You would also need to add a matching %nprocshared line to the .gjf file, for example %nprocshared=8 for 8 cores.  It doesn’t always work that well, meaning you’re simply wasting processing time that we have to pay for.)

The commands at the bottom are bash linux commands that set up some important variables, including the stem of the filename for your Gaussian job file.  Eventually the script then loads the Gaussian module, and with the last line, submits your job.
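Because the core count has to be set in two places (ntasks-per-node in the slurm script and %nprocshared in the .gjf file), it can help to check that they agree before submitting. This is only a sketch: demo_job.sh and demo_job.gjf are placeholder files written by the script itself, and the .gjf stand-in contains nothing but the one line being checked.

```shell
#!/bin/bash
# Placeholder files standing in for a real slurm script and a real
# Gaussian input. Only the lines being compared are included.
cat > demo_job.sh <<'EOF'
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
EOF

cat > demo_job.gjf <<'EOF'
%nprocshared=8
EOF

# Pull out the number after the "=" in each file.
slurm_cores=$(grep '^#SBATCH --ntasks-per-node=' demo_job.sh | head -n 1 | cut -d= -f2)
gauss_cores=$(grep -i '^%nprocshared=' demo_job.gjf | head -n 1 | cut -d= -f2)

# Report whether the two settings agree.
if [ "$slurm_cores" = "$gauss_cores" ]; then
    core_check="ok: both files request $slurm_cores cores"
else
    core_check="mismatch: slurm asks for $slurm_cores, Gaussian for $gauss_cores"
fi
echo "$core_check"
```

A mismatch means either cores that are paid for but never used, or Gaussian trying to use more cores than slurm gave the job.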

If you’re using AMBER for molecular dynamics simulations, here’s a simple slurm script you can copy:

#!/bin/bash -l
#SBATCH -t 01:10:00
#SBATCH --job-name="amber"
#SBATCH --output="oamber.%j"
#SBATCH --error="eamber.%j"
#SBATCH --partition="compute"
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --no-requeue
#SBATCH --mail-type=END
#SBATCH --mail-user=username@skidmore.edu

module load amber

ibrun sander.MPI -np 1 -O -i equil4.in -p 2mlt.top -c 2mlt_equil3.crd -r 2mlt_equil4.crd -ref 2mlt_equil3.crd -x 2mlt_equil4.trj -o mdout

The SBATCH commands do the same job of configuring slurm, though the “mail-type” and “mail-user” options show how you can have slurm email you when a job finishes. In this example, the last ibrun command is the one that actually runs AMBER (a sander job) and sets up all of its settings, input files, output files, and other necessities. Those must be changed for each new job.

Here is some specific information and examples for comet at the San Diego Supercomputer Center, as well as a complete list of all the slurm options you can use.


Filed under computing

Running a job on Comet – using the queue manager called SLURM

You tell comet to run your calculation by submitting it to a queue. Your calculation waits in line with all the other jobs scientists want to run on comet. The software that controls this queue is called slurm (that’s a really dumb name, but so it is).

The basics are that first you make (or edit) a bash script containing all the slurm settings and Linux commands you need.  Let’s say you call that file calculation.sh.  Then you can submit your job to the queue by typing:

sbatch calculation.sh

If it works, you should see a short message with the job’s ID number.   Your job might not start right away if there are a lot of other jobs ahead of yours in line.

To view all the jobs in the queue, you can just type

squeue

To view only the jobs you’ve submitted, you can just type

squeue -u yourusername

If you realize — whoops! — you made a mistake and want to cancel the job you’ve just submitted, you can type

scancel JobIDNumber

When your job is done, it will silently disappear from the queue and your output files should be in your directory.  You can put a setting in your calculation.sh script to email you when your job finishes.  If something went wrong with the calculation, your output files should contain error messages to help you figure out what went wrong so you can fix it and resubmit the job.
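The submit, check, and cancel steps above all revolve around the job’s ID number, so it can be handy to capture it when you submit. Here is a minimal sketch, assuming sbatch prints its usual “Submitted batch job <ID>” message. The helper name job_id_from_message is my own, and when slurm is not available the script falls back to a sample message with a made-up ID so the parsing can still be tried.

```shell
#!/bin/bash
# Hypothetical helper: extract the job ID from sbatch's confirmation
# message, which normally reads "Submitted batch job <ID>".
job_id_from_message() {
    echo "$1" | awk '/^Submitted batch job/ { print $4 }'
}

if command -v sbatch >/dev/null 2>&1 && [ -f calculation.sh ]; then
    # Real submission, only attempted when slurm and the script exist.
    message=$(sbatch calculation.sh)
else
    # No slurm here: use a sample message with a made-up ID.
    message="Submitted batch job 1234567"
fi

job_id=$(job_id_from_message "$message")

echo "job id: $job_id"
echo "check with:  squeue -u \$USER"
echo "cancel with: scancel $job_id"
```

sbatch also has a --parsable option that prints just the job ID, which avoids the parsing step if you prefer it.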

That’s it for the basics.  There are some more useful slurm commands you can read about on the slurm official documentation page.  In another blog post, I’ll show you a simple slurm script you can copy, paste, and edit to get your own jobs running smoothly.


Filed under computing

Accessing Comet at SDSC

Comet is the supercomputing cluster at the San Diego Supercomputer Center that we have been using to help us do calculations.  We have access to comet with support from XSEDE (thank you!).

Comet is accessed over the internet using a command-line interface on the server comet.sdsc.edu.  The basic program we use to access comet is called “ssh” (which is an acronym for “secure shell”).

On Mac OSX, you can go to Applications -> Utilities -> Terminal to open the command-line Linux interface on your mac.   To login to comet from your mac, in the Terminal program type: ssh wkennerl@comet.sdsc.edu Of course, use your own comet username in place of mine!

On Windows, there is no command-line Linux interface, so you need to use a separate program to connect to comet using ssh.  The classic program is called PuTTY, and it is pre-installed on all Skidmore-owned Windows computers.  You may prefer to get a program with more features, for example, the Bitvise ssh client.  In either case, download and install an ssh program and tell it to access comet.sdsc.edu.

On either type of computer, after a few seconds, you will connect to comet and be prompted for your password.  After you type in your password, you will be looking at a Linux command line (specifically, it is a bash prompt) on a computer 3000 miles away from Saratoga Springs.  Cool, eh?
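If you log in from a Mac or Linux terminal often, an ssh config entry can save some typing. The sketch below writes the entry to a demo file rather than to your real ~/.ssh/config, and only prints the commands instead of connecting. The alias “comet”, the file name demo_ssh_config, and “yourusername” are placeholders.

```shell
#!/bin/bash
# Write a demo config file instead of touching the real ~/.ssh/config.
# "yourusername" is a placeholder for your own comet username.
cat > demo_ssh_config <<'EOF'
Host comet
    HostName comet.sdsc.edu
    User yourusername
EOF

# With these lines in ~/.ssh/config, "ssh comet" would behave like
# "ssh yourusername@comet.sdsc.edu". Here the commands are only printed.
echo "login:     ssh -F demo_ssh_config comet"
echo "copy file: scp -F demo_ssh_config mdin comet:"
```

The -F option tells ssh and scp to read the named config file, which is a convenient way to try an entry out before copying it into ~/.ssh/config.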

Now you can use comet for whatever calculation you need, using Linux commands and your input and output files (for Gaussian or AMBER).


Filed under computing