The Cluster Summaries MonALISA Module

Introduction

The Ganglia Monitoring Tool provides average metrics per cluster. If you have installed version 2.5.4 or higher, you can connect to port 8652 on the host that runs the gmetad daemon and send the string:
            /?filter=summary
           
The reply is an XML document with the averages for all clusters. To request the averages for a single cluster, pass the name of the cluster instead:
           /STAR CAS Linux Cluster
           
If you append the name of a host after the cluster name, you get the metrics for that host:
           /STAR CAS Linux Cluster/rcas6156
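
The three request forms above can be sketched in Python as follows. This is only an illustration, not part of the module: the function names build_request and query_gmetad are hypothetical, and the sketch assumes that gmetad expects the request to end with a newline and closes the connection once the XML reply has been sent.

```python
import socket


def build_request(cluster=None, host=None):
    """Return the request string for the gmetad interactive port.

    No arguments       -> summary of all clusters
    cluster only       -> data for that cluster
    cluster and host   -> metrics for that host
    """
    if cluster is None:
        if host is not None:
            raise ValueError("a host name requires a cluster name")
        return "/?filter=summary"
    if host is None:
        return "/" + cluster
    return "/" + cluster + "/" + host


def query_gmetad(gmetad_host, request, port=8652, timeout=10.0):
    """Send one request line to gmetad and return the raw XML reply as bytes."""
    chunks = []
    with socket.create_connection((gmetad_host, port), timeout=timeout) as sock:
        # Assumption: the request is terminated by a newline.
        sock.sendall((request + "\n").encode("ascii"))
        while True:
            data = sock.recv(65536)
            if not data:  # assumption: gmetad closes the socket after the reply
                break
            chunks.append(data)
    return b"".join(chunks)
```

For example, query_gmetad("gmetad.example.org", build_request("STAR CAS Linux Cluster", "rcas6156")) would request the metrics of the host rcas6156, where gmetad.example.org is a placeholder for the host that runs gmetad.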
           

We developed a MonALISA monitoring module that publishes the average metrics per cluster. The module opens a socket to port 8652 on the host that runs the gmetad daemon and writes the string /?filter=summary to the socket. Currently it publishes only the following values for each cluster: the total number of CPUs (NCPUS), the sum of load1 over all compute nodes (Load1), the sum of load5 over all compute nodes (Load5), the average load1 (Average1), calculated as Load1/NCPUS, and the average load5 (Average5), calculated as Load5/NCPUS. The module can be extended to publish averages of other metrics that are available in Ganglia.
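
The calculation of the published values can be illustrated with a short Python sketch. It is not the module's actual code. It assumes that the summary XML contains one CLUSTER element per cluster, each holding METRICS elements with NAME and SUM attributes, and that the relevant Ganglia metric names are cpu_num, load_one and load_five. The function name summarize_clusters is hypothetical.

```python
import xml.etree.ElementTree as ET


def summarize_clusters(xml_text):
    """Return {cluster name: published values} from a gmetad summary reply."""
    root = ET.fromstring(xml_text)
    results = {}
    for cluster in root.iter("CLUSTER"):
        # Assumption: each METRICS element carries the per-cluster sum in SUM.
        sums = {
            m.get("NAME"): float(m.get("SUM", 0))
            for m in cluster.findall("METRICS")
        }
        ncpus = sums.get("cpu_num", 0.0)
        load1 = sums.get("load_one", 0.0)
        load5 = sums.get("load_five", 0.0)
        results[cluster.get("NAME")] = {
            "NCPUS": ncpus,
            "Load1": load1,
            "Load5": load5,
            # Guard against clusters that report no CPUs.
            "Average1": load1 / ncpus if ncpus else 0.0,
            "Average5": load5 / ncpus if ncpus else 0.0,
        }
    return results
```

With this sketch, a cluster that reports 8 CPUs, a load1 sum of 4.0 and a load5 sum of 2.0 yields Average1 = 0.5 and Average5 = 0.25.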

Installation of the module

Accessing the Cluster Summaries

Since we run the MonALISA service as a web service, you may access the information we publish in MonALISA using the Web Service Clients. You may also access the same information and make plots using the Global MonALISA client.

Stratos Efstathiadis - page was last modified