Analysis of methylation data with imputation with blocks (MethylBlocks)

Meng, Wei; Ray, Mithlesh Kumar; Fenton, Christopher Graham; Anderssen, Endre; Paulssen, Ruth H

dc.contributor.author	Meng, Wei
dc.contributor.author	Ray, Mithlesh Kumar
dc.contributor.author	Fenton, Christopher Graham
dc.contributor.author	Anderssen, Endre
dc.contributor.author	Paulssen, Ruth H
dc.date.accessioned	2023-10-06T06:13:20Z
dc.date.available	2023-10-06T06:13:20Z
dc.date.issued	2022
dc.description.abstract	<p><i>Background -</i> Analysis of methylation data depends on two inputs: the total number of reads at a given cytosine site (coverage) and the number of those reads in which the cytosine is methylated (methylation). Both the large number of sites and low coverage can cause problems in the analysis of methylation data. Likewise, finding differentially methylated regions (DMRs) is computationally and statistically challenging. Here we present a new method (MethyBlock) that highlights candidate regions by balancing co-variation and chromosomal distance. <p><i>Methods -</i> Raw data were processed with Bismark. Within each sample, sites with coverage below a user-defined threshold were set to zero. Sites with too many zero-coverage values across all samples were then removed. The data were divided by chromosome, and each chromosome was divided into segments based on the distance to the neighboring CpG site. Relative methylation levels were calculated by dividing methylation counts by coverage counts. Missing values were imputed per segment using K-nearest neighbors (KNN). Imputed segments were divided into blocks by balancing covariability and distance. Blocks were further filtered for outliers. The blocks and the imputed relative methylation matrix were kept for further analysis. <p><i>Results -</i> With imputation on a certain percentage of the samples, MethyBlock returned more base pairs, grouped into smaller regions, than the DMRs generated by DMRseq. This may allow more specific blocks to be found within large CpG regions. MethyBlock also returned a similar percentage of CpG islands, regulatory regions, and functional annotations. Comparison with DMRseq shows a similar percentage of overlap with UCSC hg38 regulatory regions. <p><i>Conclusions -</i> MethyBlock simplifies downstream computational and statistical analysis by reducing the data to the region level. This improves the power of statistical tests by reducing the impact of multiple testing.	en_US
dc.description	Poster presented at the Norwegian Bioinformatics Days 2022, Sandvollen, 28-30 September 2022.	en_US
dc.identifier.cristinID	FRIDAID 2082662
dc.identifier.uri	https://hdl.handle.net/10037/31476
dc.language.iso	eng	en_US
dc.rights.accessRights	openAccess	en_US
dc.rights.holder	Copyright 2022 The Author(s)	en_US
dc.title	Analysis of methylation data with imputation with blocks (MethylBlocks)	en_US
dc.type	Conference object	en_US
dc.type	Conference contribution	en_US
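
The preprocessing steps named in the Methods part of the abstract (coverage thresholding, site filtering, relative methylation, segmentation by CpG distance, and KNN imputation per segment) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. All function names, parameter names, and default thresholds (`min_cov`, `max_missing_frac`, `max_gap`, `k`) are hypothetical, and the choice to use neighboring sites within a segment as KNN donors is an assumption, since the abstract does not say how neighbors are defined. The block-building step is omitted because the abstract does not give the criterion used to balance covariability and distance.

```python
from math import sqrt

def relative_methylation(meth, cov, min_cov=5):
    """Return methylation/coverage per site (row) and sample (column).
    Entries whose coverage is below min_cov (hypothetical threshold) become None."""
    return [
        [m / c if (c >= min_cov and c > 0) else None for m, c in zip(m_row, c_row)]
        for m_row, c_row in zip(meth, cov)
    ]

def keep_sites(rel, max_missing_frac=0.5):
    """Indices of sites whose fraction of missing samples is acceptable."""
    return [
        i for i, row in enumerate(rel)
        if sum(v is None for v in row) / len(row) <= max_missing_frac
    ]

def split_segments(positions, max_gap=500):
    """Group sorted CpG positions of one chromosome into segments.
    A new segment starts when the distance to the previous site exceeds max_gap."""
    segments, current = [], []
    for i, pos in enumerate(positions):
        if current and pos - positions[current[-1]] > max_gap:
            segments.append(current)
            current = []
        current.append(i)
    if current:
        segments.append(current)
    return segments

def knn_impute(segment, k=2):
    """Fill None entries in one segment (list of site rows).
    Assumption: donors are the k most similar other sites in the same segment."""
    def dist(a, b):
        # Root mean squared difference over samples observed in both sites.
        shared = [(x - y) ** 2 for x, y in zip(a, b) if x is not None and y is not None]
        return sqrt(sum(shared) / len(shared)) if shared else float("inf")

    filled = [row[:] for row in segment]
    for i, row in enumerate(segment):
        if None not in row:
            continue
        ranked = sorted(
            (j for j in range(len(segment)) if j != i),
            key=lambda j: dist(row, segment[j]),
        )
        for s, value in enumerate(row):
            if value is None:
                donors = [segment[j][s] for j in ranked if segment[j][s] is not None][:k]
                if donors:  # leave the gap if no site has this sample observed
                    filled[i][s] = sum(donors) / len(donors)
    return filled
```

In practice a matrix library and an established imputer would replace the nested lists. The sketch only shows the order of operations: threshold, filter, compute ratios, segment, then impute within each segment.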

Associated file(s)

Name: article.pdf
Size: 319.9 KB
Format: PDF


This item appears in the following collection(s)

Articles, reports and other (clinical medicine) [1975]
