Running a job on Comet – using the queue manager called SLURM

You tell Comet to run your calculation by submitting it to a queue. Your calculation waits in line with all the other jobs scientists want to run on Comet. The software that controls this queue is called SLURM (that’s a really dumb name, but so it is).

The basics are that first you make (or edit) a bash script containing all the SLURM settings and Linux commands you need. Let’s say you call that file calculation.sh. Then you can submit your job to the queue by typing:

sbatch calculation.sh

If it works, you should see a short message with the job’s ID number. Your job might not start right away if there are a lot of other jobs ahead of yours in line.
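
If you want to hang on to that ID number (say, to cancel the job later), here is a minimal sketch of pulling it out of the message. It assumes the usual success message of the form "Submitted batch job" followed by the ID. The ID 123456 is made up, and the message is hard-coded so the snippet runs anywhere, not just on the cluster.

```shell
#!/bin/bash
# Stand-in for the message sbatch prints on success (the ID here is made up).
# On the cluster you would capture the real one with: message=$(sbatch calculation.sh)
message="Submitted batch job 123456"

# Strip everything up to and including the last space, leaving only the ID
job_id="${message##* }"

echo "Job ID: $job_id"
```

With the ID stored in job_id, you can pass "$job_id" to other commands instead of retyping the number.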

To view all the jobs in the queue, you can just type

squeue

To view only the jobs you’ve submitted, you can just type

squeue -u yourusername

If you realize — whoops! — you made a mistake and want to cancel the job you’ve just submitted, you can type

scancel JobIDNumber

When your job is done, it will silently disappear from the queue and your output files should be in your directory.  You can put a setting in your calculation.sh script to email you when your job finishes.  If something went wrong with the calculation, your output files should contain error messages to help you figure out what went wrong so you can fix it and resubmit the job.
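
As a rough idea of what that email setting looks like, here is a sketch of the top of a calculation.sh script. The --mail-type and --mail-user options are standard sbatch options, but the address is a placeholder and the echo line is just standing in for your real commands.

```shell
#!/bin/bash
# Lines starting with #SBATCH are settings read by the queue manager
#SBATCH --mail-type=END,FAIL          # send an email when the job finishes or fails
#SBATCH --mail-user=you@example.edu   # placeholder address; use your own

# Your actual Linux commands go below this point
echo "calculation commands go here"
```

Adding FAIL alongside END means you also hear about it when the job dies early, which is usually when you most want to know.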

That’s it for the basics.  There are some more useful SLURM commands you can read about on the official SLURM documentation page.  In another blog post, I’ll show you a simple SLURM script you can copy, paste, and edit to get your own jobs running smoothly.
