Identifying and correcting epigenetics measurements for systematic sources of variation
Author: Perrier, Flavie; Novoloaca, Alexei; Ambatipudi, Srikant; Baglietto, Laura; Ghantous, Akram; Perduca, Vittorio; Barrdahl, Myrto; Harlid, Sophia; Ong, Ken K; Cardona, Alexia; Polidoro, Silvia; Nøst, Therese Haugdahl; Overvad, Kim; Omichessan, Hanane; Dollé, Martijn; Bamia, Christina; Huerta, José María; Vineis, Paolo; Herceg, Zdenko; Romieu, Isabelle; Ferrari, Pietro
Results: A sizeable proportion of systematic variability due to variables expressing ‘batch’ and ‘sample position’ within ‘chip’ was identified, with partial R² values equal to 9.5% and 11.4% of total variation, respectively. After application of the ComBat or the residuals methods, the contribution was 1.3% and 0.2%, respectively. The SVA technique resulted in reduced variability due to ‘batch’ (1.3%) and ‘sample position’ (0.6%), and in diminished variability attributable to ‘chip’ within a batch (0.9%). After the ComBat or the residuals corrections, a larger number of significant sites (k = 600 and k = 427, respectively) were associated with smoking status than after the SVA correction (k = 96).
Conclusions: The three correction methods removed systematic variation in DNA methylation data, as assessed by the PC-PR2, which proved a useful tool for exploring variability in high-dimensional data. SVA produced more conservative findings than ComBat in the association between smoking and DNA methylation.
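As an illustration of the general idea behind a residuals-type correction, the following minimal sketch regresses each simulated CpG site on batch indicator variables and keeps the residuals, then compares the share of total variation explained by batch before and after. It is not the authors' implementation: the abstract does not specify the model, so the regression on batch dummies alone is an assumption, the data are simulated, the function names (batch_design, residual_correct, r2_batch) are hypothetical, and the statistic computed is a plain pooled R² rather than the PC-PR2.

```python
import numpy as np

def batch_design(batch):
    """Design matrix: intercept plus dummy variables (first batch is the reference)."""
    levels, codes = np.unique(batch, return_inverse=True)
    dummies = np.eye(len(levels))[codes][:, 1:]
    return np.column_stack([np.ones(len(batch)), dummies])

def residual_correct(M, batch):
    """Regress every column (site) of M on batch and return residuals
    re-centred on the original per-site means (assumed form of the correction)."""
    X = batch_design(batch)
    beta, *_ = np.linalg.lstsq(X, M, rcond=None)
    return M - X @ beta + M.mean(axis=0)

def r2_batch(M, batch):
    """Share of total variation across all sites explained by batch (pooled R^2)."""
    X = batch_design(batch)
    beta, *_ = np.linalg.lstsq(X, M, rcond=None)
    ss_res = ((M - X @ beta) ** 2).sum()
    ss_tot = ((M - M.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot

# Simulated data (not from the study): 60 samples, 200 sites, 3 batches.
rng = np.random.default_rng(0)
n_samples, n_sites = 60, 200
batch = np.repeat(np.arange(3), 20)
batch_shift = rng.normal(0.0, 1.0, size=(3, n_sites))  # site-specific batch offsets
M = rng.normal(0.0, 1.0, size=(n_samples, n_sites)) + batch_shift[batch]

M_corrected = residual_correct(M, batch)
r2_before = r2_batch(M, batch)
r2_after = r2_batch(M_corrected, batch)
print(f"R2 explained by batch before: {r2_before:.3f}, after: {r2_after:.3f}")
```

Because the residuals are orthogonal to the batch indicators by construction, the variation attributable to batch after correction is zero up to floating-point error in this sketch; with real data, a biological covariate correlated with batch would lose part of its signal as well.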