How do I submit my data?
Data can be entered into the text box or uploaded from a file on your desktop in any of the following formats:
- dbSNP IDs
- 0-based coordinates (as chr#[tab]min_coord[tab]max_coord or in a BED or VCF file format)
- 1-based coordinates (as chr#:min_coord..max_coord or in a GFF3 file format)
- BED format (View file format specifications)
- VCF format (View file format specifications)
- GFF3 format (View file format specifications)
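
The difference between the 0-based and 1-based inputs listed above is easy to get wrong, so here is a minimal Python sketch of how a single query line could be normalized to 1-based inclusive coordinates. It assumes that the tab-separated form follows the BED convention (0-based start, end not included); that assumption, the function name `parse_query_line`, and the returned tuple layout are illustrative and are not taken from RegulomeDB itself.

```python
import re

# Patterns for two of the plain-text inputs described above.
RSID = re.compile(r"^rs\d+$")                           # e.g. rs123
ONE_BASED = re.compile(r"^(chr\w+):(\d+)\.\.(\d+)$")    # e.g. chr1:101..200


def parse_query_line(line):
    """Classify one query line (hypothetical helper, not RegulomeDB code).

    Returns ("rsid", id) for a dbSNP ID, or
    ("region", chrom, start, end) in 1-based inclusive coordinates.
    """
    line = line.strip()
    if RSID.match(line):
        return ("rsid", line)

    match = ONE_BASED.match(line)
    if match:
        return ("region", match.group(1), int(match.group(2)), int(match.group(3)))

    fields = line.split("\t")
    if len(fields) == 3:
        chrom, start0, end0 = fields[0], int(fields[1]), int(fields[2])
        # Assumption: BED-style 0-based, half-open interval.
        # Shifting the start by one gives the 1-based inclusive equivalent.
        return ("region", chrom, start0 + 1, end0)

    raise ValueError(f"Unrecognized query line: {line!r}")
```

Under that assumption, `chr1[tab]100[tab]200` and `chr1:101..200` describe the same 100 nucleotides, which is why both normalize to the same tuple.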
What is displayed on the summary of SNP analysis page?
A summary of the total number of rows analyzed and coordinates searched is displayed, along with any errors found in the file. The rest of the page lists the nucleotides entered in the query and the data associated with them. The table contains the following columns of data:
- dbSNP ids
- dbSNP ID: If available, the dbSNP ID for that coordinate is displayed.
- 1-based coordinates (as chr#:min_coord..max_coord or in a GFF3 file format)
- RegulomeDB Score: This is a computed score based on the integration of multiple high-throughput datasets. Additional details are described in the next question.
- Other Resources: Links to external resources that provide additional information about the genomic region or dbSNP ID.
What does the RegulomeDB score represent?
The scoring scheme refers to the following available datatypes for a single coordinate.
Score | Supporting data |
---|---|
1a | eQTL + TF binding + matched TF motif + matched DNase Footprint + DNase peak |
1b | eQTL + TF binding + any motif + DNase Footprint + DNase peak |
1c | eQTL + TF binding + matched TF motif + DNase peak |
1d | eQTL + TF binding + any motif + DNase peak |
1e | eQTL + TF binding + matched TF motif |
1f | eQTL + TF binding / DNase peak |
2a | TF binding + matched TF motif + matched DNase Footprint + DNase peak |
2b | TF binding + any motif + DNase Footprint + DNase peak |
2c | TF binding + matched TF motif + DNase peak |
3a | TF binding + any motif + DNase peak |
3b | TF binding + matched TF motif |
4 | TF binding + DNase peak |
5 | TF binding or DNase peak |
6 | other |
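
To make the table easier to follow, the sketch below encodes it as an ordered list of rules in Python and returns the first score whose supporting data are all present. This is a literal reading of the table, not RegulomeDB's actual implementation. The evidence labels (`eqtl`, `tf_binding`, `matched_motif`, `any_motif`, `matched_footprint`, `footprint`, `dnase_peak`) and the function name `regulome_score` are hypothetical, and the sketch assumes that a matched motif or footprint also counts as "any motif" or "DNase Footprint".

```python
# Each rule: (score, list of alternative evidence sets). A rule matches when
# at least one of its alternative sets is fully contained in the evidence.
RULES = [
    ("1a", [{"eqtl", "tf_binding", "matched_motif", "matched_footprint", "dnase_peak"}]),
    ("1b", [{"eqtl", "tf_binding", "any_motif", "footprint", "dnase_peak"}]),
    ("1c", [{"eqtl", "tf_binding", "matched_motif", "dnase_peak"}]),
    ("1d", [{"eqtl", "tf_binding", "any_motif", "dnase_peak"}]),
    ("1e", [{"eqtl", "tf_binding", "matched_motif"}]),
    ("1f", [{"eqtl", "tf_binding"}, {"eqtl", "dnase_peak"}]),
    ("2a", [{"tf_binding", "matched_motif", "matched_footprint", "dnase_peak"}]),
    ("2b", [{"tf_binding", "any_motif", "footprint", "dnase_peak"}]),
    ("2c", [{"tf_binding", "matched_motif", "dnase_peak"}]),
    ("3a", [{"tf_binding", "any_motif", "dnase_peak"}]),
    ("3b", [{"tf_binding", "matched_motif"}]),
    ("4", [{"tf_binding", "dnase_peak"}]),
    ("5", [{"tf_binding"}, {"dnase_peak"}]),
]


def regulome_score(evidence):
    """Return the first score in table order supported by the evidence labels."""
    present = set(evidence)
    # Assumption: "matched" evidence also satisfies the weaker "any" category.
    if "matched_motif" in present:
        present.add("any_motif")
    if "matched_footprint" in present:
        present.add("footprint")

    for score, alternatives in RULES:
        if any(required <= present for required in alternatives):
            return score
    return "6"  # "other": none of the rows above apply
```

For example, a coordinate supported only by TF binding and a DNase peak falls through every row that requires an eQTL or a motif and lands on score 4.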
What details are provided for the datatypes supporting a SNP?
This page lists all the DNA features and regulatory regions that contain the input coordinate:
- Transcription factor binding sites
- Position-Weight Matrix for TF binding (PWM)
- DNase Footprinting
- DNase sensitivity
- Chromatin States
- eQTLs
- Differentially methylated regions
- Manually curated regions
- Validated functional SNPs
What data is currently available at RegulomeDB?
RegulomeDB currently queries the following data types:
Transcription factor binding sites
ChIP factors: 740 unique datasets, including the most recent ENCODE data release (2012 Freeze).
Xie et al. (2013) and Boyle et al. (2014)
Position-Weight Matrix for TF binding (PWM)
- JASPAR CORE
- TRANSFAC
- UniPROBE
- Jolma et al.
DNase sensitivity
204 unique datasets, including the most recent ENCODE data release.
ENCODE Project Consortium
Chromatin States
127 standard epigenomes from the Roadmap Epigenome Consortium.
eQTLs / dsQTLs
Tissue types:
- Cerebellum
- Cortex
- Fibroblasts
- Frontal-Cortex
- Liver
- Lymphoblastoid
- Monocytes
- Pons
- T-cells
- Temporal-Cortex
DNase Footprinting
Differentially Methylated regions
Kuleshov et al.
Manually curated regions
Validated functional SNPs
What version of dbSNP is RegulomeDB querying?
RegulomeDB is currently querying build 141 of dbSNP. See NCBI for additional information about dbSNP141.
What version of the human genome sequence are the data mapped to at RegulomeDB?
All data at RegulomeDB is currently mapped to hg19. Additional information about the human reference genome can be found at the Genome Reference Consortium.
Why is there no data for my chromosomal region?
Entering a chromosomal region will identify all common SNPs (with an allele frequency > 1%) in that region. These SNPs are then used to query RegulomeDB. If there are no common SNPs in the uploaded genomic regions, no data will be available. However, the chromosomal region can be split into single-nucleotide values and uploaded in that form in order to query each nucleotide individually.
Alternatively, the region you entered could be in a protein-coding region of the genome. Currently, RegulomeDB only integrates and curates high-throughput data from non-coding and intergenic regions of the human genome.
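
Splitting a region into single-nucleotide queries, as suggested above, can be scripted. The following Python sketch expands a 1-based inclusive region into one line per position, using the chr#:min_coord..max_coord form with identical minimum and maximum. Writing a single nucleotide this way is an assumption made for illustration, and `split_region` is a hypothetical helper name.

```python
def split_region(chrom, start, end):
    """Expand a 1-based inclusive region into single-nucleotide query lines.

    Assumption: one position is written as chrom:pos..pos.
    """
    if start > end:
        raise ValueError("start must not be greater than end")
    return [f"{chrom}:{pos}..{pos}" for pos in range(start, end + 1)]


# Example: write the positions of a small region to a file for upload.
# The file name "single_positions.txt" is arbitrary.
if __name__ == "__main__":
    with open("single_positions.txt", "w") as handle:
        handle.write("\n".join(split_region("chr1", 1000, 1010)) + "\n")
```

Because every position becomes its own query, large regions produce correspondingly large uploads, so this is best kept to short intervals.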