Estimating job resources
Requesting accurate resources for a job matters in two directions: request too little and the job is killed before it finishes; request too much and the job sits longer in the queue and your allocation is depleted faster. This page describes a practical workflow for arriving at good estimates before committing to a production run.
The workflow
- Run a small test using a reduced input or a short simulation segment with --qos=test
- Measure actual usage with sacct once the test completes
- Scale up your estimates based on the test results
- Add a margin to account for variability, then submit the full job
Estimating walltime
Step 1 — time a small test run
Wrap your program in time to measure how long it takes on a representative small input, for example time ./myprogram small_input.dat.
The time output appears in your job's output file as three lines labelled real, user, and sys.
real is the walltime: the elapsed clock time.
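To reuse the real value in later calculations it helps to have it in seconds. The following is a minimal sketch, assuming the default bash time format (for example 3m42.517s); real_to_seconds is a hypothetical helper name, not part of bash or SLURM.

```shell
# Convert the "real" value printed by bash's time keyword (e.g. 3m42.517s)
# to whole seconds. Hypothetical helper; assumes the default MMmSS.FFFs format.
real_to_seconds() {
    echo "$1" | awk -F'm' '{
        sub(/s$/, "", $2)            # drop the trailing "s" from the seconds part
        total = $1 * 60 + $2         # minutes and seconds combined
        whole = int(total)
        if (total > whole) whole++   # round any fraction up to the next second
        print whole
    }'
}
```

Call it with the value copied from the job's output file, for example real_to_seconds 3m42.517s.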
Step 2 — scale to your full input
Walltime often scales roughly linearly with input size. If your test input is 1/10th the size of the full dataset and took 3m 42s, the full run will take approximately 37 minutes.
For iterative methods (simulations, optimisers), the number of steps is usually the scaling factor rather than input size.
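The scaling step can be written as a small calculation. This is a sketch that assumes runtime grows in direct proportion to the workload, which you should verify for your own code; scale_walltime is a hypothetical helper name.

```shell
# Scale a measured test time by the ratio of full workload to test workload.
# Hypothetical helper; assumes linear scaling.
#   $1 = elapsed time of the test run, in seconds
#   $2 = scaling factor (e.g. 10 if the test input was 1/10th of the full input,
#        or full_steps / test_steps for an iterative method)
scale_walltime() {
    awk -v t="$1" -v f="$2" 'BEGIN {
        s = t * f
        w = int(s)
        if (s > w) w++    # round up to a whole second
        print w
    }'
}
```

With the figures above, scale_walltime 222 10 gives 2220 seconds, which is the 37 minutes quoted for the full run.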
Step 3 — check elapsed time with sacct
After the test job, confirm the actual elapsed time:
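One way to do this is sketched below, using sacct's -j and --format options. The function names are hypothetical, and the job ID is whatever sbatch printed when you submitted the test. The second helper converts the Elapsed column, which sacct prints as [DD-]HH:MM:SS, into seconds so it can be compared with the time output.

```shell
# Show elapsed time and final state for a finished job. $1 = job ID.
show_elapsed() {
    sacct -j "$1" --format=JobID,JobName,Elapsed,State
}

# Convert an Elapsed value ([DD-]HH:MM:SS) to seconds.
elapsed_to_seconds() {
    echo "$1" | awk -F'[-:]' '{
        if (NF == 4) print (($1 * 24 + $2) * 60 + $3) * 60 + $4   # days present
        else         print ($1 * 60 + $2) * 60 + $3               # hours only
    }'
}
```

For example, show_elapsed 12345 lists the job and its steps, and elapsed_to_seconds 00:03:44 converts one of the reported values.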
Step 4 — add a margin
Add 20–30% to your scaled estimate. This absorbs variability from filesystem load, memory pressure, and other cluster conditions. Avoid adding excessive margin — jobs requesting less walltime queue faster because SLURM can fit them into gaps in the schedule.
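The margin and the conversion to the HH:MM:SS form used by #SBATCH -t can be combined in one step. This is a sketch only; walltime_with_margin is a hypothetical helper, and the 25% default is simply the midpoint of the range suggested above.

```shell
# Add a percentage margin to an estimate in seconds and print it as HH:MM:SS.
#   $1 = scaled estimate in seconds
#   $2 = margin in percent (optional, defaults to 25)
walltime_with_margin() {
    awk -v s="$1" -v pct="${2:-25}" 'BEGIN {
        n = s * (100 + pct)
        t = int(n / 100)
        if (t * 100 < n) t++    # round up to a whole second
        printf "%02d:%02d:%02d\n", int(t / 3600), int((t % 3600) / 60), t % 60
    }'
}
```

For the 37-minute (2220 s) estimate, walltime_with_margin 2220 yields a value you can paste after #SBATCH -t; pass a second argument such as 30 to use a different margin.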
Estimating memory
Default allocation
The default memory allocation on COSMOS is 5300 MB per core. If your job does not request more, this is what it gets. For many jobs this is sufficient.
Step 1 — run a test and check peak memory
After any job completes, sacct reports the peak memory used (MaxRSS):
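A sketch of the query, using the same four columns as the example output on this page. The function names are hypothetical. The second helper assumes sacct was run with -n (no header) and -P (pipe-separated fields) and picks out the MaxRSS value of the .batch step.

```shell
# Human-readable report for a finished job. $1 = job ID.
show_peak_memory() {
    sacct -j "$1" --format=JobID,MaxRSS,ReqMem,Elapsed
}

# Read "sacct -n -P --format=JobID,MaxRSS" output on stdin and print the
# MaxRSS field of the .batch step.
batch_maxrss() {
    awk -F'|' '$1 ~ /\.batch$/ { print $2 }'
}

# Usage (needs a real job ID on the cluster):
#   sacct -j 12345 -n -P --format=JobID,MaxRSS | batch_maxrss
```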
Example output:
JobID            MaxRSS     ReqMem    Elapsed
------------ ---------- ---------- ----------
12345                       5300Mc   00:03:44
12345.batch    3276800K     5300Mc   00:03:44
MaxRSS is reported in kilobytes on the .batch step line. Convert to MB by dividing by 1024: in the example above, 3276800 KB / 1024 = 3200 MB.
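The same conversion as a small shell helper, for use in scripts. This is a sketch that assumes the value carries a K suffix as in the example output; maxrss_kb_to_mb is a hypothetical name.

```shell
# Convert a MaxRSS value such as 3276800K to whole megabytes, rounded up.
# Hypothetical helper; only the K (kilobyte) suffix is handled.
maxrss_kb_to_mb() {
    case "$1" in
        *K) ;;
        *)  echo "expected a value ending in K, got: $1" >&2
            return 1 ;;
    esac
    awk -v kb="${1%K}" 'BEGIN {
        mb = kb / 1024
        w = int(mb)
        if (mb > w) w++    # round up so the request is never too small
        print w
    }'
}
```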
Step 2 — calculate per-core memory
Divide total peak memory by the number of cores your job used, then add a 20% margin:
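As a worked sketch, take the 3200 MB peak from the example output and assume, for illustration only, that the job ran on 8 cores. per_core_mb is a hypothetical helper; it uses integer arithmetic so the result is always rounded up to a whole MB.

```shell
# Per-core memory request in MB, including a margin.
#   $1 = total peak memory in MB
#   $2 = number of cores the job used
#   $3 = margin in percent (optional, defaults to 20)
per_core_mb() {
    awk -v total="$1" -v cores="$2" -v pct="${3:-20}" 'BEGIN {
        n = total * (100 + pct)    # numerator, scaled by 100
        d = cores * 100            # denominator, scaled by 100
        w = int(n / d)
        if (w * d < n) w++         # round up
        print w
    }'
}
```

Under that assumption, per_core_mb 3200 8 works out to 3200 / 8 = 400 MB, plus 20%, or 480 MB per core.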
If the result is below 5300 MB/core the default is sufficient and no --mem-per-cpu directive is needed. If it exceeds 5300 MB/core, request it explicitly with an #SBATCH --mem-per-cpu directive, giving the value in MB.
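The decision can be sketched as a helper that prints the directive only when it is needed. mem_directive is a hypothetical name, and the 5300 MB threshold is the COSMOS default quoted on this page.

```shell
# Print an #SBATCH line if the per-core requirement exceeds the default,
# otherwise print nothing. $1 = required MB per core (whole number).
mem_directive() {
    default_mb=5300
    if [ "$1" -gt "$default_mb" ]; then
        echo "#SBATCH --mem-per-cpu=$1"
    fi
}
```

For example, mem_directive 480 prints nothing because the default already covers it, while a requirement of 6400 MB per core produces a line to paste into the job script.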
Memory and billing
Requesting more than 5300 MB per core leaves some cores idle on the node. Your allocation is charged for those idle cores too. Only request what you actually need.
Choosing a core count
Serial jobs
If your program is single-threaded, request 1 core. Requesting more does not speed it up and wastes allocation.
OpenMP (shared memory)
OpenMP jobs scale within a single node. Performance typically improves up to 8–16 cores for most codes, with diminishing returns beyond that. Test with 4, 8, and 16 cores before committing to a full 48-core request:
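One way to set up such a test is sketched below. It only prints the sbatch commands so you can review them before submitting. omp_test.sh is a hypothetical job script that runs your program under time and contains the line export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK. Options given on the sbatch command line override the matching #SBATCH directives in the script.

```shell
# Print one sbatch command per thread count to test. $1 = job script.
# Hypothetical helper; the script name and job names are placeholders.
omp_scaling_commands() {
    for cores in 4 8 16; do
        echo "sbatch -N 1 --ntasks-per-node=1 --cpus-per-task=$cores --qos=test -J omp_$cores $1"
    done
}

omp_scaling_commands omp_test.sh          # review the commands first
# omp_scaling_commands omp_test.sh | sh   # then submit them on the cluster
```

Compare the real times of the three jobs; if 16 cores is barely faster than 8, the larger request is unlikely to pay off.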
MPI (distributed memory)
MPI jobs can span multiple nodes. Before a large run, test scaling efficiency by comparing runtimes at 1, 2, and 4 nodes. If doubling the cores does not roughly halve the runtime, the job does not scale well and requesting more nodes wastes allocation.
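The comparison can be made concrete with parallel efficiency: the baseline runtime times the baseline node count, divided by the runtime times the node count of the larger run. The sketch below prints it as a percentage; parallel_efficiency is a hypothetical helper, and any runtimes you feed it come from your own test jobs.

```shell
# Parallel efficiency in percent relative to a baseline run.
#   $1 = baseline runtime (s)   $2 = baseline node count
#   $3 = runtime of larger run  $4 = node count of larger run
parallel_efficiency() {
    awk -v tb="$1" -v nb="$2" -v tn="$3" -v nn="$4" \
        'BEGIN { printf "%.1f\n", 100 * tb * nb / (tn * nn) }'
}
```

A value near 100 means doubling the nodes roughly halves the runtime. Values well below that indicate the extra nodes are mostly wasted allocation.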
Putting it together
A typical resource estimation workflow:
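#!/bin/bash
# Step 1: small test run
# Test settings: 30-minute limit, test QOS
#SBATCH -t 00:30:00
#SBATCH -J resource_test
#SBATCH --qos=test
# One node, eight tasks
#SBATCH -N 1
#SBATCH --ntasks-per-node=8

# Start from a clean environment, then load what the program needs
module purge
module load <your-modules>

# time writes the elapsed (real) time to the job's output file
time ./myprogram small_input.dat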
After the test completes, check the job's Elapsed and MaxRSS values with sacct.
Use the results to set your production job's walltime and memory, then remove --qos=test before submitting.
Author: (LUNARC)
Last Updated: 2026-03-30