WallMon: Interactive distributed monitoring of process-level resource usage on display and compute clusters
To achieve low overhead, traditional cluster monitoring systems sample data at low frequencies and with coarse granularity. Interactive monitoring, however, requires frequent (up to 60 Hz) sampling of fine-grained data, together with visualization tools that can explore and display the data in near real time. Traditional cluster monitoring systems are therefore unsuited to interactive monitoring of distributed cluster applications: they fail to capture short-duration events, which makes it difficult to understand the performance relationship between processes on the same or different nodes. To address this issue, we developed WallMon, a tool for interactive visual exploration of performance behavior in distributed systems. For data gathering, WallMon is centered around an abstraction of collectors and handlers: collectors gather data of interest, such as CPU and memory usage, and forward it to handlers in a push-based fashion, while handlers take action upon the data. WallMon captures and visualizes data for every process on every node, as well as overall node statistics. Data is visualized using a technique inspired by the concept of information flocking. WallMon's design is based on the client-server model, and it is extensible through a module system that encapsulates functionality specific to monitoring (collectors) and visualization (handlers). A set of experiments has been carried out on a cluster of 29 nodes with 180 processes per node. Performance results show 7% (out of 100%) CPU usage at a 64 Hz sampling rate when performing process-level monitoring with WallMon. Using WallMon's interactive visualization, we have observed interesting patterns in different parallel and distributed systems, such as an unexpected ratio of user- to kernel-level execution among processes in a particular distributed system.
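The collector/handler abstraction described above can be illustrated with a minimal Python sketch. It is not WallMon's actual interface: the names `Collector`, `RatioHandler`, and `cpu_times_sample` are hypothetical, everything runs in a single process rather than over a client-server connection, and only the calling process is sampled. It is meant to show the push-based flow only, in which a collector samples at a fixed rate and forwards each sample to every registered handler.

```python
import os
import time
from typing import Callable, Dict, List

Sample = Dict[str, float]


class Collector:
    """Hypothetical collector: samples at a fixed rate and pushes to handlers."""

    def __init__(self, sample_fn: Callable[[], Sample], hz: float) -> None:
        self.sample_fn = sample_fn          # assumed data source, injected
        self.interval = 1.0 / hz            # seconds between two samples
        self.handlers: List[Callable[[Sample], None]] = []

    def register(self, handler: Callable[[Sample], None]) -> None:
        self.handlers.append(handler)

    def run(self, n_samples: int) -> None:
        for _ in range(n_samples):
            sample = self.sample_fn()
            # Push-based: the collector drives delivery, handlers never poll.
            for handler in self.handlers:
                handler(sample)
            time.sleep(self.interval)


def cpu_times_sample() -> Sample:
    """Accumulated user and kernel (system) CPU time of the calling process."""
    t = os.times()
    return {"user": t.user, "system": t.system}


class RatioHandler:
    """Hypothetical handler: records the user share of total CPU time."""

    def __init__(self) -> None:
        self.ratios: List[float] = []

    def __call__(self, sample: Sample) -> None:
        total = sample["user"] + sample["system"]
        self.ratios.append(sample["user"] / total if total > 0 else 0.0)


if __name__ == "__main__":
    handler = RatioHandler()
    collector = Collector(cpu_times_sample, hz=64)
    collector.register(handler)
    collector.run(n_samples=8)
    print(handler.ratios)
```

In this sketch the sampling function is injected so that a different data source, such as per-process statistics for every process on a node, could be substituted without changing the collector or the handlers.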
Publisher: Universitetet i Tromsø (University of Tromsø)