Running STAR-CCM+

There are two ways to run STAR-CCM+: batch mode and server mode. In both cases, because we have provisioned a small head node (c5n.large) with only 2 vCPUs, we won't use it to run simulations but rather to submit jobs and, if needed, connect to jobs running on the compute nodes (a better way to connect to running jobs is described in the next section). Let's go through an example.

First, follow the instructions in the previous section to connect to the cluster, e.g.:

 pcluster dcv connect cfd --key-path cfd_ireland.pem --show-url

Move simulation file from S3 bucket to working directory

First of all, let's copy a sim file you may already have created previously. Navigate to the working directory, i.e. /fsx/, on the master node. An example 4M cell case can be found here, together with two STAR-CCM+ Java macros to mesh and run the case. It's unmeshed and so is only 2.3 MB, which means we need to mesh and then run it. Follow the instructions in 'Section 2: Creating a S3 Bucket' to upload the case to S3.
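For reference, the upload itself is a single command. This is a minimal sketch that assumes you run it from wherever you downloaded the case, with 'yourbucketname' as a placeholder for your own bucket:

 aws s3 cp F1WingFine.tgz s3://yourbucketname/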

 cd /fsx

Next, let's copy from S3 the sim file (e.g. F1WingFine.tgz) that you've uploaded:

 aws s3 cp s3://yourbucketname/F1WingFine.tgz .

Then extract it to obtain F1WingFine.sim together with mesh.java and run.java:

 tar -xf F1WingFine.tgz

Using the alias you created in the previous section you can type the following to launch the application:

 starccm
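If you did not set up the alias earlier, a minimal sketch (assuming the installation path used in the submission scripts below) would look something like:

 echo "alias starccm='/fsx/STAR-CCM+/15.06.007/STAR-CCM+15.06.007/star/bin/starccm+'" >> ~/.bashrc
 source ~/.bashrc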

You can then proceed to select the file from /fsx and enter your STAR-CCM+ POD key. Please reach out to your Siemens support or sales team for licensing discussions.


You can open the file, but because you have a small head node of just 2 vCPUs and 4 GB RAM, you want to do the meshing and solving via the compute or mesh queue. Let's explore how to do this. As we'll be copying and pasting commands, this may be easiest via SSH, so you can head to AWS CloudShell (see Section 3 for details) and connect to the cluster like this:

 pcluster ssh cfd -i ~/cfd_ireland.pem

Create a Slurm submit script

We have created an HPC cluster using AWS ParallelCluster that is based upon the Slurm scheduler. You can read more about Slurm here, but in essence it allows us to submit the case to an arbitrary number of cores, and Slurm/AWS ParallelCluster will do the scheduling for you.

The first step is to create a submission script. An example set that works with 15.06.007 for meshing and solving is shown below - you may wish to customize it based upon your particular needs using a text editor of your choice, e.g. vi.

 vi submit-mesh.sh
#!/bin/bash
#SBATCH --job-name=mesh-48
#SBATCH --ntasks=48
#SBATCH --output=%x_%j.out
#SBATCH --partition=mesh

/fsx/STAR-CCM+/15.06.007/STAR-CCM+15.06.007/star/bin/starccm+ \
        -bs slurm \
        -power \
        -batch mesh.java \
        -podkey xxxx \
        -licpath 1999@flex.cd-adapco.com \
        F1WingFine.sim

and for the solve:

 vi submit-solve.sh
#!/bin/bash
#SBATCH --job-name=solve-72
#SBATCH --ntasks=72
#SBATCH --output=%x_%j.out
#SBATCH --partition=compute

/fsx/STAR-CCM+/15.06.007/STAR-CCM+15.06.007/star/bin/starccm+ \
        -bs slurm \
        -power \
        -batch run.java \
        -podkey xxxx \
        -licpath 1999@flex.cd-adapco.com \
        F1WingFine@meshed.sim

In this script, we specify the number of cores, i.e. tasks, to be 72 (2x c5n.18xlarge). The partition line refers to the queues we created previously, i.e. compute and mesh. Make sure the .java files are both in the same directory as the sim file. For such a small case we wouldn't actually need to mesh on a separate queue with an instance with extra RAM, but this is just to show an example.

Submitting jobs to the scheduler

Use the sbatch command to submit the first (mesh) script:

sbatch submit-mesh.sh

Other useful Slurm commands:

  • squeue – shows the status of all running jobs in the queue.
  • sinfo – shows partition and node information for the system.
  • srun – runs an interactive job.
  • scancel <jobid> – cancels a Slurm job.

If you type squeue, you should initially see the following, which shows that the job is queued while its nodes are being provisioned (state CF, i.e. configuring):

[ec2-user@ip-10-0-0-22 f1fine]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 4      mesh  mesh-48 ec2-user  CF       0:05      1 mesh-dy-m524xlarge-1

In the background, AWS ParallelCluster is now requesting the EC2 instances (which you can watch being created in your EC2 console). It should take around 4-5 minutes for these to launch. If you do not see the job move to 'R' (running) within 5 minutes, check your EC2 console and make sure you have requested an increase to your Service Quotas as described in previous sections.
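While you wait, the following commands can be useful. This is just a sketch: the quota code (for 'Running On-Demand Standard instances') and the eu-west-1 region are assumptions you should adjust for your own account.

 watch -n 30 "squeue -u $USER"
 aws service-quotas get-service-quota --service-code ec2 --quota-code L-1216C47A --region eu-west-1

The first refreshes your view of the queue every 30 seconds; the second reports your current On-Demand vCPU quota so you can confirm it is large enough for the instances being requested.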

You should now see:

[ec2-user@ip-10-0-0-22 f1fine]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 4      mesh  mesh-48 ec2-user  R       0:05      1 mesh-dy-m524xlarge-1

If you look at the output file, e.g. mesh-48_4.out, you should see it start to mesh, which takes around 2 minutes and then saves a file (if you're using the example sim file) called F1WingFine@meshed.sim.
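To follow the meshing progress live, you can tail the Slurm output file (the name here assumes job ID 4, as shown in the squeue output above):

 tail -f mesh-48_4.out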

We can then submit the second script (to provide an example of using the different queues set up on the cluster):

 sbatch submit-solve.sh

You should then see something like the following:

[ec2-user@ip-10-0-0-22 f1fine]$ squeue
             JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
                 5   compute solve-72 ec2-user  CF       0:05      2 compute-dy-c518xlarge-[1-2]

The mesh is approximately 4M cells in this example, so running across 2 nodes is probably already at the limit of scalability. You can see more information about the scaling of STAR-CCM+ on AWS in the benchmark section. Once the job starts, it should take less than 5 minutes to run and produce several output .png files of the flowfield, a .csv of the forces, and the results file: F1WingFine@00110.sim.
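As a quick check that everything completed (the file names assume the example case), you can list the outputs in the working directory:

 ls -lh *.png forces.csv F1WingFine@*.sim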

You can tar up these results and bring them back to your machine. This is a typical workflow where end-users only need to bring back the final post-processing results rather than the whole simulation files.

 tar -czvf results.tgz forces.csv *.png

If you do want to do some extra post-processing, please see the next section on 'server mode'.

We can then copy it to S3:

 aws s3 cp results.tgz s3://bucketname/
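Back on your local machine (assuming the AWS CLI is configured there and the bucket name matches the one above), you could then retrieve and unpack the results with something like:

 aws s3 cp s3://bucketname/results.tgz .
 tar -xzvf results.tgz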

Connecting to the job (server mode)

With the current cluster configuration (a public head node and private compute nodes), you start STAR-CCM+ on the head node and then use the 'Connect to Server' mode of STAR-CCM+ to connect to jobs running in either server or batch mode.

It is outside the scope of this workshop to cover other network options, but with either a public compute node (mode 2 in the original pcluster setup) or a VPN, you could connect to a STAR-CCM+ job running on AWS from your local machine.

So let's go through an example where you want to interactively configure a job by using server mode (this is also how you can connect to a job running in batch mode).

Let's first create a slightly modified submission script, where the '-batch run.java' line has been replaced with '-server' and the input is now the solved results file:

 vi submit-server.sh
#!/bin/bash
#SBATCH --job-name=solve-72
#SBATCH --ntasks=72
#SBATCH --output=%x_%j.out
#SBATCH --partition=compute

/fsx/STAR-CCM+/15.06.007/STAR-CCM+15.06.007/star/bin/starccm+ \
        -bs slurm \
        -power \
        -server \
        -podkey xxxx \
        -licpath 1999@flex.cd-adapco.com \
        F1WingFine@00110.sim

Proceed to submit this to the cluster:

 sbatch submit-server.sh

Wait several minutes for it to launch and then head to the output file, e.g. solve-72_6.out, which should look something like the following:

Starting parallel server
License build date: 29 May 2020
This version of the code requires license version 2020.10 or greater.
Checking license file: 1999@flex.cd-adapco.com
MPI Distribution : Open MPI-4.0.3
Host 0 -- compute-dy-c5n18xlarge-1 -- Ranks 0-35
Host 1 -- compute-dy-c5n18xlarge-2 -- Ranks 36-71
Process rank 0 compute-dy-c5n18xlarge-1 17429
Total number of processes : 72

Simcenter STAR-CCM+ 2020.3 Build 15.06.007 (linux-x86_64-2.17/gnu9.2)

1 copy of ccmppower checked out from 1999@flex.cd-adapco.com
Feature ccmppower expires in 111 days
Sun Jan  3 16:37:14 2021

Server::start -host compute-dy-c5n18xlarge-1.cfd.pcluster:47827

Make a note of the hostname of the first compute node of the job (where the STAR-CCM+ server process runs), which for the example here is:

compute-dy-c5n18xlarge-1.cfd.pcluster

The port number here is 47827.
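If you prefer not to scroll through the log, something like the following pulls out that line (the file name assumes job ID 6, as in the example above):

 grep "Server::start" solve-72_6.out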

You can then load up STAR-CCM+ by typing the following (assuming you made an alias for the STAR-CCM+ installation path in the previous STAR-CCM+ installation section).

 starccm

Then head to 'File, Connect to Server' and enter that hostname and port.


Alternatively, you can just launch using:

starccm -host compute-dy-c5n18xlarge-1.cfd.pcluster:47827

You can now do anything you want, e.g. open up some of the post-processing scenes, continue running, etc. When you are finished, you can simply exit and accept the message for it to kill the job. If you wish to let it carry on, you can instead click 'File, Disconnect', but please remember to kill the job via 'scancel' when you are finished, as described next.

When you are finished, make sure you do the following, where the JOBID can be found by first typing 'squeue'. If it's empty, then nothing is running:

 scancel JOBID

This was just a basic example of STAR-CCM+, but hopefully it has shown you how to use several of the STAR-CCM+ run modes on AWS.