Using a virtual event space to understand parallel application communication behavior
(Research report, 2003)
We have developed EventSpace, a configurable data collecting, management and observation system for monitoring low-level synchronization and communication events with the purpose of understanding the behavior of parallel applications on clusters and multi-clusters. Applications are instrumented by adding data collecting code in the form of event collectors to an application's communication paths. ...
Evaluating the performance of the allreduce collective operation on clusters. Approach and results
(Research report, 2004)
The performance of the collective operations provided by a communication library is important for many applications run on clusters. The communication structure of collective operations can be organized as a tree. Performance can be improved by configuring and mapping the tree to the clusters in use. We describe and demonstrate an approach for evaluating the performance of different configurations ...
Transparent Incremental Updates for Genomics Data Analysis Pipelines
(Chapter, 2014)
The Longcut Wide Area Network Emulator. Design and Evaluation
(Research report, 2005)
Experiments run on a Grid, consisting of clusters administered by multiple organizations connected by shared wide area networks (WANs), may not be reproducible. First, traffic on the WAN cannot be controlled. Second, allocating the same resources for subsequent experiments can be difficult. Longcut solves both problems by splitting a single cluster into several parts, and for each part having one ...
Mr. Clean: A Tool for Tracking and Comparing the Lineage of Scientific Visualization Code
(Conference object, 2014)
IMP: a multi-species functional genomics portal for integration, visualization and prediction of protein functions and networks
(Journal article; Peer reviewed, 2012)
Integrative multi-species prediction (IMP) is an interactive web server that enables molecular biologists to interpret experimental results and to generate hypotheses in the context of a large cross-organism compendium of functional predictions and networks. The system provides a framework for biologists to analyze their candidate gene sets in the context of functional networks, as they expand or ...
Functional Knowledge Transfer for High-accuracy Prediction of Under-studied Biological Processes
(Journal article; Peer reviewed, 2013)
A key challenge in genetics is identifying the functional roles of genes in pathways. Numerous functional genomics techniques (e.g. machine learning) that predict protein function have been developed to address this question. These methods generally build from existing annotations of genes to pathways and thus are often unable to identify additional genes participating in processes that are not ...
Kvik: three-tier data exploration tools for flexible analysis of genomic data in epidemiological studies
(Journal article; Peer reviewed, 2015-03-30)
Kvik is an open-source system that we developed for explorative analysis of functional genomics data from large epidemiological studies. Creating such studies requires a significant amount of time and resources. It is therefore usual to reuse the data from one study for several research projects. Often each project requires implementing new analysis code, integration with specific knowledge bases, ...
Data-intensive computing infrastructure systems for unmodified biological data analysis pipelines
(Journal article; Peer reviewed, 2015-11-18)
Biological data analysis is typically implemented using a deep pipeline that combines a wide array of tools and databases. These pipelines must scale to very large datasets, and consequently require parallel and distributed computing. It is therefore important to choose a hardware platform and underlying data management and processing systems well suited for processing large datasets. There are many ...
The metagenomic data life-cycle: standards and best practices
(Journal article; Peer reviewed, 2017-08-01)
Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonized way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (i) material sampling, (ii) material sequencing, (iii) data analysis, and (iv) data archiving and publishing. Taking examples from marine ...