Data-intensive computing infrastructure systems for unmodified biological data analysis pipelines

Permanent link
https://hdl.handle.net/10037/8816
DOI
https://doi.org/10.1007/978-3-319-24462-4_22
File
article.pdf (413.7 KB, PDF)
Date
2015-11-18
Type
Journal article
Peer reviewed

Authors
Bongo, Lars Ailo; Pedersen, Edvard; Ernstsen, Martin
Abstract
Biological data analysis is typically implemented using a deep pipeline that combines a wide array of tools and databases. These pipelines must scale to very large datasets, and consequently require parallel and distributed computing. It is therefore important to choose a hardware platform and underlying data management and processing systems well suited for processing large datasets. There are many infrastructure systems for such data-intensive computing. However, in our experience, most biological data analysis pipelines do not leverage these systems. We give an overview of data-intensive computing infrastructure systems, and describe how we have leveraged these for: (i) scalable fault-tolerant computing for large-scale biological data; (ii) incremental updates to reduce the resource usage required to update a large-scale compendium; and (iii) interactive data analysis and exploration. We provide lessons learned and describe problems we have encountered during development and deployment. We also provide a literature survey on the use of data-intensive computing systems for biological data processing. Our results show how unmodified biological data analysis tools can benefit from infrastructure systems for data-intensive computing.
Description
Accepted manuscript version. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-24462-4_22.
Publisher
Springer
Citation
Lecture Notes in Computer Science 2015, 8623:259-272
Collections
  • Articles, reports and other (computer science) [484]