DeepH&M: Estimating single-CpG hydroxymethylation and methylation levels from enrichment and restriction enzyme sequencing methods

Abstract:

Increased appreciation of 5-hydroxymethylcytosine (5hmC) as a stable epigenetic mark that defines cell identity and disease progression has created a need for cost-effective, high-resolution 5hmC mapping technology. Current enrichment-based technologies are inexpensive but provide only low-resolution, relative measures of 5hmC enrichment, while single-base-resolution methods can be prohibitively expensive to scale up to large experiments. To address this problem, we developed a deep learning–based method, “DeepH&M,” which integrates enrichment and restriction enzyme sequencing methods to simultaneously estimate absolute hydroxymethylation and methylation levels at single-CpG resolution. Using 7-week-old mouse cerebellum data to train the DeepH&M model, we demonstrated that the 5hmC and 5mC levels predicted by DeepH&M were in high concordance with whole-genome bisulfite–based approaches. The DeepH&M model can also be applied to 7-week-old frontal cortex and 79-week-old cerebellum, demonstrating that the method generalizes robustly to other tissues and biological time points.