Using DustCluster to bring up high performance cluster infrastructure on AWS EC2

Sometimes you want to quickly bring up a high-performance EC2 compute cluster with a low-latency interconnect for prototyping, developing, or benchmarking custom distributed system/cluster software. When you have multiple such clusters, and you want to stop/start each cluster as a unit and perform parallel ssh operations on each one as a unit, the EC2 web console, awscli, and regular ssh can become unwieldy.

For this kind of use case, DustCluster can come in handy:

DustCluster is a command-line shell that lets you perform node operations and fast, stateful ssh on named clusters of EC2 nodes. (Disclaimer: I’m its primary author.)

It now has a plugin command that lets you bring up an EC2 cluster from a minimal spec (node names, instance types, count) and ssh into it with zero configuration. Behind the scenes it generates a fully configured CloudFormation stack from this high-level spec.

== Bringing up a cluster

1. Install dustcluster

sudo pip install dustcluster

(works on Linux and Mac OS X only, no Windows support at this time)

2. Drop into the dustcluster shell

bash$ dust

If you don't have awscli installed and configured, it will ask you for AWS credentials on first use.
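If you prefer to set up credentials beforehand, the standard awscli flow works and dustcluster should pick up the same configuration (the key values below are AWS's documentation placeholders):

bash$ pip install awscli
bash$ aws configure
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRiCYEXAMPLEKEY
Default region name [None]: us-east-1
Default output format [None]: json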

dust$ cluster new
Name this cluster: dev1
Number of nodes: 3
Node type [m4.large]: c4.xlarge
use placement group?: [y] (enter)

dust$ cluster create dev1

Done.

With this, you will have running on EC2:

  • An N-node compute-optimized cluster (here, three c4.xlarge nodes)
  • In the region closest to you, running a recent Amazon Linux image
  • With security groups set up for intra-cluster traffic and ssh from the outside
  • With ssh keys downloaded and configured for cluster ssh
  • With Enhanced Networking enabled, and in a placement group for a low-latency 10 Gbps interconnect
  • In a public subnet in a VPC with an Internet Gateway and routing tables configured (optional)
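For the curious, the generated CloudFormation stack contains resources roughly along these lines. This is an illustrative fragment only, not the exact template dustcluster emits; the resource names, key pair name, and AMI id are placeholders:

    Resources:
      ClusterPlacementGroup:
        Type: AWS::EC2::PlacementGroup
        Properties:
          Strategy: cluster                  # pack nodes together for low latency
      ClusterSecurityGroup:
        Type: AWS::EC2::SecurityGroup
        Properties:
          GroupDescription: intra-cluster traffic plus ssh from the outside
          SecurityGroupIngress:
            - IpProtocol: tcp                # ssh from anywhere
              FromPort: 22
              ToPort: 22
              CidrIp: 0.0.0.0/0
      Worker0:
        Type: AWS::EC2::Instance
        Properties:
          ImageId: ami-xxxxxxxx              # placeholder: a recent Amazon Linux AMI
          InstanceType: c4.xlarge
          KeyName: dev1-key                  # placeholder key pair name
          PlacementGroupName: !Ref ClusterPlacementGroup
          SecurityGroupIds:
            - !GetAtt ClusterSecurityGroup.GroupId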

== Using the cluster

1. Check for cluster create completion

dust$ cluster status dev1

and optionally

dust$ show -v

and

dust$ refresh

2. Optionally use these cluster nodes as your working set

dust$ use dev1

3. Run commands over ssh on the worker nodes with:

dust$ @worker* free -m

This runs the command “free -m” on nodes named worker* and shows the output:

[worker0]              total       used       free     shared    buffers     cached
[worker0] Mem:          3767        498       3268          0         14        409
[worker0] -/+ buffers/cache:         75       3692
[worker0] Swap:            0          0          0
[worker0]

[worker1]              total       used       free     shared    buffers     cached
[worker1] Mem:          3767        498       3268          0         14        409
[worker1] -/+ buffers/cache:         75       3692
[worker1] Swap:            0          0          0
[worker1]

(Note that this is a stateful ssh connection: cd /tmp followed by pwd will return /tmp. Most cluster ssh tools out there run a single command over a single connection and then disconnect. Huge difference!)
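For example, the working directory persists across invocations to the same node, so a session looks something like this:

dust$ @worker0 cd /tmp
dust$ @worker0 pwd
[worker0] /tmp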

Or drop into a remote shell with:

@worker0

A default ssh key is automatically created for all DustCluster clusters. If you want to ssh in with the OpenSSH client, you can find the key in ~/.dustcluster/keys:

ssh -i ~/.dustcluster/keys/useast_dustcluster.pem ec2-user@x.y.z.r
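The same key works with any OpenSSH-based tool. For example, to copy a file up to a node (the file name and address are placeholders):

scp -i ~/.dustcluster/keys/useast_dustcluster.pem ./results.tar.gz ec2-user@x.y.z.r:/tmp/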

== Notes

You can launch this cluster with low-cost t2.nano nodes by removing the use_placement_groups setting in the YAML spec.
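As a rough sketch, the saved spec is a small YAML file along these lines; apart from use_placement_groups, the field names here are illustrative rather than dustcluster's exact schema:

    name: dev1
    nodes: 3
    instance_type: t2.nano        # low-cost nodes, no Enhanced Networking
    # use_placement_groups: yes   # removed: t2.nano instances cannot join placement groups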

== Stopping the cluster

Stop the cluster with:

stop cluster=dev1

Restart the nodes with:

start cluster=dev1

Terminate all resources used by the cluster:

cluster delete dev1

== Cluster SSH operations

  • SSH using node names:
@worker0 free -m
@worker* free -m
  • SSH using filters:
@launch_time=*2016-04-11* free -m

Everything shown by show -vv is a filterable field.

  • SSH over all nodes:
@ free -m
  • Filter nodes:
 show worker*
 show launch_time=*2016-04-11*
 show -v
 show -vv
  • Cluster node operations:
stop worker*
start worker*
  • Switch between clusters and regions:
 use myeucluster
 use myuseastcluster
use eu-central-1
use *

For more details, see the docs on GitHub.

== Check for Enhanced Networking

You can check if enhanced networking is enabled with:

dust$ @ ethtool -i eth0 | grep driver

If Enhanced Networking is enabled, the driver is ixgbevf instead of vif:


[master] driver: ixgbevf
[master]
[worker1] driver: ixgbevf
[worker1]
[worker0] driver: ixgbevf
[worker0]
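You can also confirm this from the AWS side with awscli; the sriovNetSupport instance attribute reports "simple" when Enhanced Networking via the Intel 82599 VF (ixgbevf) interface is enabled (the instance id below is a placeholder):

aws ec2 describe-instance-attribute --instance-id i-0123456789abcdef0 --attribute sriovNetSupport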
