Decoding the genetic basis of congenital heart disease


Data type Files Remarks
Literature-curated genetic perturbation data Cardiaccode_evidence_2014_06_10.txt
We reviewed 55 peer-reviewed primary research articles published from 1994 to 2013 and manually extracted 1391 genetic or molecular perturbation data points from mouse tissues associated with embryonic mouse heart development. All data were derived from two general classes of experiments: (1) organ culture data; and (2) transgenic mouse data. Each data point consists of a regulator gene (the gene whose expression was perturbed), a target gene (the gene whose mRNA or protein expression was measured), the tissue studied, and the developmental stage of the tissue used. The regulator gene was recorded as having a positive, negative, or no influence on the target gene, based on the authors' original conclusion after we confirmed it against the primary data in their figures or tables.
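The record structure described above can be sketched as a small data type. This is only an illustration: the class and field names, the tab-separated column order, and the placeholder gene, tissue, and stage values are all assumptions, not the documented layout of the evidence file.

```python
from dataclasses import dataclass
from enum import Enum


class Influence(Enum):
    """Effect of the regulator on the target, as described in the remarks."""
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NONE = "none"


@dataclass(frozen=True)
class Perturbation:
    regulator: str        # gene whose expression was perturbed
    target: str           # gene whose mRNA or protein expression was measured
    influence: Influence  # positive, negative, or no influence
    tissue: str           # tissue studied
    stage: str            # developmental stage of the tissue


def parse_record(line: str) -> Perturbation:
    # ASSUMPTION: five tab-separated fields in this order.
    # The real file may use different columns or a different order.
    regulator, target, influence, tissue, stage = line.rstrip("\n").split("\t")
    return Perturbation(
        regulator, target, Influence(influence.strip().lower()), tissue, stage
    )


# Made-up placeholder values, not taken from the dataset.
example = parse_record("GeneA\tGeneB\tPositive\texample tissue\texample stage\n")
```

Using an enumeration for the influence keeps the three allowed outcomes explicit, so an unexpected value in the file raises an error instead of passing through silently.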
The in vivo dataset contains 710 in vivo Mus musculus genetic or molecular perturbation data points, filtered from the 1391 data points in the full dataset.
We reviewed primary research articles published from 1997 to 2014 and manually extracted 590 genetic or molecular perturbation data points from heart valve tissues (embryonic or postnatal) of various organisms or cell lines. See the description of the literature-curated dataset above for more detail.
Microarray gene expression profiles Cardiaccode_microarray_2014_06_10.txt
We collected 86 microarray gene expression profiles from 9 data series in the Gene Expression Omnibus (GEO) database. All samples are from embryonic mouse heart development and were profiled on the Affymetrix Mouse Genome 430 2.0 platform. We checked the quality of the data, then normalised them using RMA (robust multi-array average). Values are reported as log2 intensities.
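Because the values are on a log2 scale, the difference between two values is a log2 fold change, and raising 2 to that difference recovers the ratio of the underlying intensities. A minimal sketch follows; the function name and the numbers are illustrative only.

```python
def fold_change(log2_a: float, log2_b: float) -> float:
    """Ratio of intensity A to intensity B, given their log2 values."""
    return 2.0 ** (log2_a - log2_b)


# Illustrative values: a difference of 2 on the log2 scale is a 4-fold change.
up = fold_change(10.0, 8.0)    # 4.0
down = fold_change(8.0, 10.0)  # 0.25
```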
Human postnatal cultured valve endothelial and interstitial cells were profiled using Illumina's Human HT-12 v4.0 expression beadchip microarrays. Background-subtracted data were log2-transformed and normalised by robust spline normalization using the lumi package in the R statistical environment. After keeping only genes with a detection p-value < 0.05 in three or more of the 18 samples, the final dataset contains 16,987 genes.
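The detection filter described above can be sketched as follows. This is a plain-Python illustration, not the lumi-based R workflow that was actually used; the function name and the toy p-values are assumptions.

```python
def is_detected(pvalues, alpha=0.05, min_samples=3):
    """Keep a gene if its detection p-value is below alpha in at least
    min_samples samples (strictly less than, as stated in the text)."""
    return sum(p < alpha for p in pvalues) >= min_samples


# Toy detection p-values for 18 samples per gene (made-up numbers).
toy = {
    "gene_1": [0.01] * 3 + [0.5] * 15,  # detected in exactly 3 samples: kept
    "gene_2": [0.01] * 2 + [0.5] * 16,  # detected in only 2 samples: dropped
    "gene_3": [0.001] * 18,             # detected in every sample: kept
    "gene_4": [0.05] * 3 + [0.5] * 15,  # 0.05 is not < 0.05: dropped
}

kept = [gene for gene, pvals in toy.items() if is_detected(pvals)]
```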
From the microarray data, the limma R package was used to identify genes differentially expressed in response to BMP4 and TGFB3 treatment; these treatment-to-gene relationships form the edges of the network. The network was then filtered to include only genes related to endothelial-mesenchymal transition (EndMT) and/or belonging to a signalling pathway.
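A rough sketch of how such a network could be assembled from differential expression results is shown below. The input layout, the adjusted p-value cutoff of 0.05, the up/down edge labels, and the placeholder gene names are all assumptions made for illustration; the source does not state the thresholds or edge attributes used.

```python
def build_edges(de_results, adj_p_cutoff=0.05):
    """Turn per-treatment differential expression rows into directed edges.

    de_results maps a treatment name to rows of (gene, log_fc, adj_p).
    ASSUMPTION: an edge is drawn when adj_p is below the cutoff.
    """
    edges = []
    for treatment, rows in de_results.items():
        for gene, log_fc, adj_p in rows:
            if adj_p < adj_p_cutoff:
                direction = "up" if log_fc > 0 else "down"
                edges.append((treatment, gene, direction))
    return edges


def filter_edges(edges, genes_of_interest):
    """Keep only edges whose target gene is in the curated gene set."""
    return [edge for edge in edges if edge[1] in genes_of_interest]


# Made-up results with placeholder gene names.
de_results = {
    "BMP4": [("GENE_A", 1.8, 0.001), ("GENE_B", -0.9, 0.20)],
    "TGFB3": [("GENE_A", -1.2, 0.010), ("GENE_C", 2.1, 0.004)],
}
# Hypothetical set standing in for EndMT-related or signalling-pathway genes.
genes_of_interest = {"GENE_A"}

edges = build_edges(de_results)
network = filter_edges(edges, genes_of_interest)
```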
Epigenomics datasets Epigenomic_datasets_2014_08_12.pdf
Links to published epigenetic datasets related to heart development and disease, covering histone modifications, chromatin accessibility, transcription factor binding, regulatory regions (enhancers), and DNA methylation, among others. The datasets were generated using ChIP-seq, DNase-seq, FAIRE-seq, Bisulfite-seq, MRE-seq and CAGE-seq experiments.