Ark Cluster Setup
We inherited 96 Dell PowerEdge 1850 servers and built a cluster called ark. Each node has 4 x 3.31 GHz CPUs, 3.86 GB of RAM, and a 70 GB hard drive, and runs CentOS 5.2. The ark cluster can be used in addition to the other supercomputing resources available at TSRI. If you are within the scripps.edu domain, click here to see the Ark Cluster Report, powered by the Ganglia scalable distributed monitoring system. Note that node 92 is not powered on because its hard drive is missing.
A user account has been created for every current lab member on all nodes. Use your TSRI password to access any of the nodes; for instance, run ssh ark28 to test whether you can log in. Please contact sargis@scripps.edu to request an additional account. Note that ark is not a Rocks cluster. Each node on ark has two network interfaces, eth0 and eth1. All of the eth0 interfaces are connected to a switch, which in turn is connected to the Scripps network through a single line. We do not have eth1 connected to the public Internet, which is needed for installing a Rocks cluster. In any case, after consulting with William Young and Stefano Forli, I came across the following blog post: Kickstarting CentOS onto a Dell Poweredge. That's not exactly how I did it, but it gave me leads on how to get started.
CentOS 5.2 includes the Red Hat Cluster Suite, which supports high-availability clustering; however, we need a high-performance computing (HPC) cluster instead. To that end, I have installed a Portable Batch System (PBS), similar to the setup used by IT Services. I used the TORQUE Resource Manager, installing the PBS server on the head node (ark28) and PBS clients on all compute nodes. Please read the Job Submission section of the Torque documentation wiki for usage examples.
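As a rough illustration of what a job for the PBS setup above might look like, the Python sketch below builds the text of a simple Torque job script. The function name make_pbs_script, the job name, and the default walltime are hypothetical choices made for this example; the request of 4 processors per node only mirrors the 4 CPUs per node mentioned earlier and is not a site policy. No queue name is set because none is documented here.

```python
import sys


def make_pbs_script(job_name, command, nodes=1, ppn=4, walltime="01:00:00"):
    """Return the text of a simple Torque/PBS job script.

    All defaults are illustrative assumptions, not ark policy.
    """
    lines = [
        "#!/bin/bash",
        "#PBS -N %s" % job_name,                     # job name shown by qstat
        "#PBS -l nodes=%d:ppn=%d" % (nodes, ppn),    # nodes and processors per node
        "#PBS -l walltime=%s" % walltime,            # wall-clock limit (assumed value)
        "#PBS -j oe",                                # merge stderr into stdout
        "cd $PBS_O_WORKDIR",                         # start in the submission directory
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script; redirect it to a file to keep it.
    sys.stdout.write(make_pbs_script("hello_ark", "echo Running on $(hostname)"))
```

Saving the output to a file (for example hello_ark.pbs, a hypothetical name) and running qsub hello_ark.pbs on the head node should submit it, and qstat should then list the job. The Torque documentation remains the authoritative source for the available options.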
I have also installed the Parallel Distributed Shell (pdsh) on node ark28.
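To give an idea of how pdsh can be used from ark28, here is a small Python sketch that assembles a pdsh command line for running one command on many nodes. It assumes the nodes are named ark1 through ark96 and that the unpowered node 92 is called ark92; only ark28 is named explicitly on this page, so adjust the range to the real host names. The helper name pdsh_command is hypothetical. The -w (target hosts, with bracketed ranges) and -x (exclude hosts) options are standard pdsh options.

```python
def pdsh_command(remote_cmd, first=1, last=96, skip=(92,)):
    """Build an argument list for pdsh.

    Host names of the form arkN are an assumption for this example.
    """
    argv = ["pdsh", "-w", "ark[%d-%d]" % (first, last)]
    if skip:
        # Exclude nodes that are known to be down, e.g. node 92.
        argv += ["-x", ",".join("ark%d" % n for n in skip)]
    argv.append(remote_cmd)
    return argv


if __name__ == "__main__":
    # Show the command instead of running it; pass the list to
    # subprocess.call() on ark28 to actually execute it.
    print(" ".join(pdsh_command("uptime")))
```

Typed directly in a shell on ark28, the equivalent would be pdsh -w ark[1-96] -x ark92 uptime, which reports the uptime of every reachable node.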
Click Getting Started to learn more.
