About
The Department of Computer Science maintains a state-of-the-art clustered High Performance Computing Facility (HPCF or, more colloquially, a "cluster") for research in areas of computer science and engineering that require substantial computational effort. The facility consists of 30 computing nodes, each with two 1 GHz processors and 2 GB of memory, communicating with each other over a high-performance gigabit network switch. The nodes are connected to a dual-processor I/O server with two 1.5 GHz processors and 3 GB of memory, which provides access to approximately 700 GB of fast, redundant, fault-tolerant disk storage.
Access
The HPCF is provided to support research computing at Stevens Institute of Technology. Since the HPC facilities are limited in their capacity, access will be provided only for pre-approved, specific research projects that are associated with an identified, faculty-level researcher. Such researchers will apply to the System Administrators to obtain access to the HPC facilities. Access will be granted or denied based on the following selection criteria:
- justification of the need for large computational capacity in the proposed research,
- a clearly articulated argument that the use of specific HPC resources will help to effectively solve the research problem,
- the current level of use of the facilities.
The third criterion means that researchers with well-justified projects may sometimes be denied access temporarily because there is insufficient capacity available. Research users other than faculty-level researchers (e.g. graduate and undergraduate students and postdoctoral researchers) who are affiliated with an approved project may also use the facilities at the discretion of the corresponding faculty-level researcher.
If an application is approved, the applicant will be expected to sign a use agreement that identifies the terms of use (relative to the application made) and which references the rights and responsibilities of users. Access will then be provided to the applicant and to his/her identified research partners (colleagues, grad students, post-docs, etc.), each of whom must also sign a use agreement.
The agreement is available here, and should be signed and returned to the system administrators.
To help analyze the usage of the HPC facilities, demonstrate their effectiveness, and support applications for their expansion, researchers are requested to report the results and major milestones of their research as they become available. Accounts that remain inactive for an extended period of time may be disabled or terminated.
The System Administrators reserve the right to implement and/or install resource sharing software to meet the needs of all users and to reprioritize running jobs accordingly.
Please contact the system administrators in order to apply for an account.
Terms of Service
Each user of the HPC facilities has certain rights and responsibilities. Users who have been granted access to the facilities have the following rights:
- The right to access the facilities and run their jobs, subject to resource-sharing policies in effect at the time.
- Access to technical support to help them get required software and other materials (e.g. large data sets) installed and to assist them in developing their HPC-based research computing applications. Technical support is provided to the best of our capabilities, but may be limited depending on the given task or the needs of other users.
- Access to backups in the case of an emergency. While backups are performed regularly, restoring files from backup requires a significant amount of time, so requests to restore data should only be made in the case of an emergency, and will only be granted if clearly necessary.
In return for these rights, the users of the HPC facilities are responsible for:
- Ensuring that the machine is used as efficiently as possible and only for the agreed-to research purposes.
- Ensuring that unauthorized users do not gain access to the HPC facilities. This includes protecting your accounts from unauthorized access by taking reasonable measures in your day-to-day use of your account. Absolutely NO password sharing will be allowed and if detected it will be considered grounds for revoking access.
- Ensuring that they do not interfere with the work of others in any way. This includes attempts to circumvent scheduling policies and/or manipulation of other users' files (whether or not they may have protected them).
- Properly acknowledging the use of the facility in any published work or public presentation arising from its use.
Hardware/Software
The server as well as the nodes are running the NetBSD operating system. If you are interested in clustering NetBSD, please subscribe to the tech-cluster mailing list on the NetBSD website. All standard software is installed by default; a list of the third-party applications installed is available here. More detailed information regarding the software is available here.
A diagram of the setup is available here.
File Server:
- 2 1.5 GHz AMD Athlon processors
- 3 GB RAM
- Approx. 700 GB of storage via a 3ware Escalade SATA RAID
- 1000Base-T GBIC Ethernet interfaces
- dmesg output
Compute Nodes:
- 2 1 GHz Pentium III processors
- 2 GB RAM
- 1000Base-T Gigabit Ethernet interfaces for internal communication
- dmesg output
Network:
- Extreme Networks Summit7i Switch for communication among the nodes
Using the cluster
In this section, you will find some documentation on how to use the resources made available through the HPCF. This section will grow with time. Commentary, questions, suggestions and corrections should be sent to the hpc-discuss mailing list.
Currently available information:
- Login is only possible using ssh hpcf.cs.stevens-tech.edu (fingerprints). This will log you into one of the nodes. From there, you can use rsh to log into other nodes if necessary.
- A simple example program showing how to use MPI: trapezoid estimates the integral from 0.0 to 1.0 of f(x) = sin(x) using the trapezoidal rule (with 1024 trapezoids).
- A commented Makefile to compile and run the example program.
- If you would like to use CVS to manage your project, please refer to this document for details on how to set up your repository.
- Since Stevens has a site-wide license for Matlab, Matlab 7.0 (R14) is now installed on the cluster. At the moment, we do not have any parallelizing toolboxes installed, but if you're interested, you may want to take a look at Parallel Programming with MatlabMPI. If you find that it would be a good addition, please file a PR and we will install the software system-wide.
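The idea behind the trapezoid example mentioned in the list above can be sketched without an MPI installation. The following is not the example program distributed with the cluster; it is a minimal serial Python sketch that only simulates how the 1024 trapezoids might be divided among processes and how the partial results would be combined (in a real MPI program, each rank would compute one partial sum and a reduction such as MPI_Reduce would add them up). The function names and the choice of 4 simulated processes are assumptions made for illustration.

```python
import math


def local_trapezoid(f, a, b, n):
    """Trapezoidal rule estimate of the integral of f over [a, b] using n trapezoids."""
    h = (b - a) / n
    total = (f(a) + f(b)) / 2.0
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def simulated_parallel_integral(f, a, b, n, num_procs):
    """Split n trapezoids evenly over num_procs simulated ranks and sum the parts.

    Each loop iteration stands in for the work one MPI rank would do;
    the final sum stands in for a reduction onto rank 0.
    """
    if n % num_procs != 0:
        raise ValueError("n must be divisible by num_procs in this sketch")
    h = (b - a) / n
    local_n = n // num_procs
    partial_sums = []
    for rank in range(num_procs):
        # Each simulated rank integrates its own contiguous subinterval.
        local_a = a + rank * local_n * h
        local_b = local_a + local_n * h
        partial_sums.append(local_trapezoid(f, local_a, local_b, local_n))
    return sum(partial_sums)


if __name__ == "__main__":
    estimate = simulated_parallel_integral(math.sin, 0.0, 1.0, 1024, 4)
    exact = 1.0 - math.cos(1.0)  # analytic value of the integral of sin(x) on [0, 1]
    print("estimate:", estimate)
    print("absolute error:", abs(estimate - exact))
```

Because the subintervals are contiguous and share the same step width, the sum of the partial results agrees with a single trapezoidal estimate over the whole interval up to floating-point rounding, which is what makes this decomposition a natural first MPI exercise.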
Support
The HPCF was originally established with grant support. Maintenance and expansion of the HPCF rely on continued grant support acquired by users of the facility.
A mailing list for all discussions regarding High Performance Computing at Stevens in general and the facilities provided by the Department of Computer Science in particular is available. Please see the hpc-discuss mailing list.
In addition, there is a mailing list that receives automated notifications of alerts, such as temperature alerts or detected network outages. Please see the hpc-alert mailing list. Note that when cooling is not functioning in the machine room, a large number of different alert emails are sent to this list. Since each email contains a specific pattern indicating the level of the alert, you could filter these mails using a variation of this procmail recipe if you only want to receive a subset of them.
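The filtering logic described above can also be illustrated outside of procmail. The sketch below is not the linked recipe; it is a small Python illustration of matching an alert level in the Subject header. The marker format `[hpc-alert:LEVEL]` and the level name `critical` are purely hypothetical, since the actual pattern used in the alert mails is not documented here, and would need to be replaced with the real one.

```python
import re
from email import message_from_string

# Hypothetical marker format; the real alert mails may use a different pattern.
LEVEL_PATTERN = re.compile(r"\[hpc-alert:(\w+)\]")

# Hypothetical set of alert levels the reader wants to keep.
WANTED_LEVELS = {"critical"}


def alert_level(raw_message):
    """Return the lowercased alert level found in the Subject header, or None."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "")
    match = LEVEL_PATTERN.search(subject)
    return match.group(1).lower() if match else None


def should_keep(raw_message, wanted=WANTED_LEVELS):
    """Decide whether a message belongs to the subset of alerts to keep."""
    return alert_level(raw_message) in wanted


if __name__ == "__main__":
    sample = "Subject: [hpc-alert:critical] machine room temperature\n\nbody text"
    print(should_keep(sample))
```

A procmail recipe would express the same decision as a condition line matching the header, followed by a delivery action; the Python version only makes the matching step explicit.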
If there are any problems, please submit a Problem Report here.
Please also refer to the general support page.