Running Fluent

First, follow the instructions in ‘III. Create a HPC cluster’ to SSH into the cluster:

 pcluster ssh cfd -i cfd_ireland.pem

Move the simulation file from the S3 bucket to the working directory

First of all, let's copy a Fluent case that you have already created. Follow the instructions in ‘Creating a S3 Bucket’ to upload the case to S3, then change into the working directory:

 cd /fsx

Next, let's copy the Fluent setup that you uploaded from S3:

 aws s3 cp s3://yourbucketname/fluentcase.tgz .

Let's untar the case:

 tar -xf fluentcase.tgz

As the head node is small, with just 2 vCPUs and 4 GB of RAM, you want to do the meshing and solving via the compute or mesh queue. Let's explore how to do this.
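
If you want to double-check this, the head node's size can be confirmed with standard Linux tools (an optional sanity check; the exact numbers depend on the instance type you chose for the head node):

 # Show the number of vCPUs and the amount of memory on the head node
 nproc
 free -h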

Create a Slurm submit script

We have created an HPC cluster using AWS ParallelCluster that is based upon the Slurm scheduler. You can read more about Slurm here, but in essence it allows us to submit the case to an arbitrary number of cores, and Slurm/AWS ParallelCluster will do the scheduling for you.
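
Before submitting anything, you can confirm which queues (partitions) your cluster exposes. With the configuration used here you would expect to see the compute and mesh partitions, typically with their nodes shown as powered down (idle~) until a job requests them:

 # List the partitions and the state of their nodes
 sinfo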

The first step is to create a submission script. An example is provided in submit.sh:

 vi submit.sh
#!/bin/bash
#SBATCH --job-name=fluent-72
#SBATCH --ntasks=72
#SBATCH --output=%x_%j.out
#SBATCH --partition=compute
#SBATCH --constraint=c5n.18xlarge

# Make the Slurm libraries available and load Open MPI
export LD_LIBRARY_PATH=/opt/slurm/lib:$LD_LIBRARY_PATH
module load openmpi
export OPENMPI_ROOT=/opt/amazon/openmpi

# Point Fluent at your license server (replace the placeholders with its IP/hostname)
export ANSYSLI_SERVERS=2325@XXX
export SERVER=1055@xxx
export ANSYSLMD_LICENSE_FILE=1055@xxx

# Write the list of hosts allocated to this job to a machine file for Fluent
NODEFILE="$SLURM_SUBMIT_DIR/slurmhosts.$SLURM_JOB_ID.txt"
scontrol show hostname $SLURM_NODELIST >$NODEFILE

# Launch Fluent in 3D batch mode (-g, no GUI) across all allocated cores
/fsx/ansys_inc/v202/fluent/bin/fluent 3d -i journalfile.jou -t${SLURM_NPROCS} -g -cnf=$NODEFILE -ssh -mpi=openmpi

In this script, we specify the number of tasks (i.e. cores) to be 72, which corresponds to two c5n.18xlarge instances. The partition line refers to the queues we created previously, i.e. compute and mesh. The constraint line is only needed if a queue offers multiple instance types; since we have only one per queue here, it could be removed.

The rest of the script loads the MPI module, points to the Fluent installation on /fsx and specifies your Fluent license server. Finally we have a typical Fluent command line, where the number of cores is taken from the SBATCH lines at the top of the script.
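
One thing submit.sh assumes but does not show is the journal file, journalfile.jou, that drives Fluent in batch mode. As a rough, minimal sketch only (the file names and iteration count below are placeholders; adapt the TUI commands to your own workflow), such a journal could look like this:

 vi journalfile.jou
; Read the case, initialise the solution, iterate and write the data
/file/read-case fluentcase.cas
/solve/initialize/initialize-flow
/solve/iterate 500
/file/write-data fluentcase-final.dat
; Exit, answering yes to any confirmation prompt
exit
yes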

Submitting jobs to the scheduler

Use the sbatch command to submit the script:

 sbatch submit.sh

Other useful Slurm commands:

  • squeue – shows the status of all jobs in the queue (pending and running)
  • sinfo – shows partition and node information for the cluster
  • srun – runs an interactive job (see the sketch after this list)
  • scancel jobid – kills a Slurm job
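
For example, if you wanted an interactive shell on one of the compute nodes (say, for a quick test or pre-processing step), something along the following lines should work; the node still needs a few minutes to start, just like a batch job:

 # Request an interactive shell on a single compute node
 srun --partition=compute --ntasks=1 --pty bash -i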

If you type squeue you should initially see the following, which shows that the job is waiting for its nodes to come up (state CF, configuring).

[ec2-user@ip-10-0-0-22 fluentDemo]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON) 
                 7   compute  fluent-72 ec2-user CF       0:03      2 compute-dy-c5n18xlarge-[1-2]

Behind the scenes, this is now triggering the launch of EC2 instances (which you can see via your EC2 console). It should take around 4-5 minutes for these to launch. If the state does not move to ‘R’ (running) within 5 minutes or so, check your EC2 console and make sure you have requested an increase to your Service Quotas as described in the previous sections.
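
While you wait, you can keep an eye on the queue by re-running squeue, or refresh it automatically, for example:

 # Refresh the queue status every 30 seconds (Ctrl-C to stop)
 watch -n 30 squeue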

You should now see:

[ec2-user@ip-10-0-0-22 fluentDemo]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 7   compute  fluent-72 ec2-user R       0:05      2 compute-dy-c5n18xlarge-[1-2]
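
While the job is running, Fluent's console output goes to the file named by the --output line in submit.sh (job name plus job ID). You can follow it to keep an eye on the residuals; the file name below assumes the job ID 7 shown above:

 # Follow the solver output as it is written
 tail -f fluent-72_7.out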

Once the run has finished, you can tar up the results (e.g. the .cas and .dat files) and bring them back to your machine.

Alternatively, you can just bring back the pictures/videos that were generated automatically during the run.

 tar -czvf results.tgz *.cas *.dat

We can then copy it to S3:

 aws s3 cp results.tgz s3://bucketname/
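
If Fluent also wrote images or animations into a subdirectory, you could copy the whole folder in one go with aws s3 sync rather than tarring it up; the directory name here is only an illustration:

 # Copy an entire folder of images to S3 (illustrative directory name)
 aws s3 sync ./images s3://bucketname/images/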

When you are finished, make sure you cancel any jobs that are still running. The JOBID can be found by first typing ‘squeue’; if the output is empty, nothing is running:

 scancel JOBID

This was just a basic Fluent case, but hopefully it gives you an idea of how Fluent can be run on AWS.