WallMon: Interactive distributed monitoring of process-level resource usage on display and compute clusters
To achieve low overhead, traditional cluster monitoring systems sample data at low frequencies and with coarse granularity. Interactive monitoring, however, requires frequent (up to 60 Hz) sampling of fine-grained data, together with visualization tools that can explore and display the data in near real-time. Traditional cluster monitoring systems are therefore unsuited for interactive monitoring of distributed cluster applications: they fail to capture short-duration events, which makes it difficult to understand the performance relationship between processes on the same or different nodes. To address this issue, we developed WallMon, a tool for interactive visual exploration of performance behaviors in distributed systems. For data gathering, WallMon is centered around an abstraction of collectors and handlers: collectors gather data of interest, such as CPU and memory usage, and forward it to handlers in a push-based fashion, while handlers take action upon the data. WallMon captures and visualizes data for every process on every node, as well as overall node statistics. Data is visualized using a technique inspired by the concept of information flocking. WallMon's design is based on the client-server model, and it is extensible through a module system that encapsulates functionality specific to monitoring (collectors) and visualization (handlers). A set of experiments has been carried out on a cluster of 29 nodes with 180 processes per node. Performance results show 7% (of 100) CPU usage at a 64 Hz sampling rate when performing process-level monitoring with WallMon. Using WallMon's interactive visualization, we have observed interesting patterns in different parallel and distributed systems, such as an unexpected ratio of user- and kernel-level execution among processes in a particular distributed system.
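The collector and handler abstraction described in the abstract can be illustrated with a minimal sketch. This is not WallMon's actual interface: the class names, the `Sample` fields, and the `fake_process_table` data source are all assumptions made for illustration, and the push happens in-process here rather than over a network.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Sample:
    """One process-level measurement (field names are illustrative)."""
    node: str
    pid: int
    cpu_percent: float
    memory_kb: int


class Handler:
    """Takes action on pushed data; this one keeps the latest sample per process."""

    def __init__(self) -> None:
        self.latest: Dict[Tuple[str, int], Sample] = {}

    def handle(self, sample: Sample) -> None:
        self.latest[(sample.node, sample.pid)] = sample


class Collector:
    """Gathers data of interest and pushes it to registered handlers."""

    def __init__(self, node: str,
                 read_processes: Callable[[], List[Tuple[int, float, int]]]) -> None:
        self.node = node
        self.read_processes = read_processes  # hypothetical data source
        self.handlers: List[Handler] = []

    def register(self, handler: Handler) -> None:
        self.handlers.append(handler)

    def sample_once(self) -> int:
        """Sample every process once and push each sample to all handlers."""
        count = 0
        for pid, cpu, mem in self.read_processes():
            sample = Sample(self.node, pid, cpu, mem)
            for handler in self.handlers:
                handler.handle(sample)  # push: the collector initiates the transfer
            count += 1
        return count


def fake_process_table() -> List[Tuple[int, float, int]]:
    # Stand-in for reading real per-process statistics (assumed values)
    return [(101, 12.5, 2048), (102, 0.5, 512)]


handler = Handler()
collector = Collector("node-0", fake_process_table)
collector.register(handler)
pushed = collector.sample_once()
```

In a setup like the one the abstract describes, `sample_once` would presumably be driven by a timer at the chosen sampling rate, and the handler side would feed a visualization instead of a dictionary.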
Publisher: Universitetet i Tromsø
University of Tromsø
Showing items related by title, author, creator and subject.
Improving the text compression ratio for ASCII text Using a combination of dictionary coding, ASCII compression, and Huffman coding Haldar-Iversen, Sondre (Mastergradsoppgave; Master thesis, 2020-11-15) Data compression is a field that has been extensively researched. Many compression algorithms in use today have been around for several decades, like Huffman Coding and dictionary coding. These are general-purpose compression algorithms and can be used on anything from text data to images and video. There are, however, much fewer lossless algorithms that are customized for compressing certain types ...
Diabetes Automata For Diabetes-Related Applications: Software Engine For Blood Glucose Level Simulation Agafonov, Aleksandr (Master thesis; Mastergradsoppgave, 2015-07-10) Diabetes Automata is a try of concept in the complex research-field of blood glucose simulation and prediction in experimental medical informatics, an experimental research project in software engineering combined with experimental health science. The project integrates together topics such as software system design and development, object-oriented programming, mobile application development, ...
Analysis of potential critical equipment and technical system on a modern PSV. Recommending a method for Troms Offshore Management AS Løvmo, Signy Anita (Master thesis; Mastergradsoppgave, 2016-06-01) This thesis is a part of a master’s degree in Technology and Safety in the High North at the University of Tromsø- The Arctic University of Norway. The thesis has been written during the spring semester of 2016. Safety is a large part of maritime operations and all tools to improve safety and reliability is considered. Even in these days when economy in the oil related industry is worse than ever. ...