HERVH
DF ID | DF0000183 |
---|---|
TE superfamily | ERV1 |
TE class | LTR |
Species | Catarrhini |
Length | 7713 |
Kimura value | 5.61 |
Tau index | 0.8930 |
Description | Internal region of ERV1 endogenous retrovirus, HERVH subfamily |
Comment | The long terminal repeat associated with HERVH is LTR7. The putative env gene, consisting of about 1800 base pairs, has two open reading frames interrupted by a termination codon. The amino acid sequence of this region shows significant homology to those of other retroviral envelope proteins and contains eight potential glycosylation sites. It is estimated that there are about 100 copies of RTVL-H elements containing the env gene per haploid human genome. Note that this estimate is distinct from simply tallying all hits (long and short) from this model. |
Sequence |
TTTGGTGCCGTGACTCGGATCGGGGGACCTCCCTTGGGAGATCAATCCCCTGTCCTCCTGCTCTTTGCTCCGTGAGAAAGATCCACCTACGACCTCAGGTCCTCAGACCGACCAGCCCAAGGAACATCTCACCAATTTCAAATCCGGTAAGCGGCCTCTTTTTACTCTCTTCTCCAACCTCCCTCACTATCCCTCAACCTCTTTCTCCTTTCAATCTTGGCGCCACACTTCAATCTCTCCCTTCTCTTAATTTCAATTCCTTTCATTTTCTGGTAGAGACAAAGGAGACACGTTTTATCCGTGGACCCAAAACTCCGGCGCCGGTCACGGACTGGGAAGGCAGCCTTCCCTTGGTGTTTAATCATTGCAGGGACGCCTCTCTGATTATTCACCCACGTTTCAGAGGTGTCAGACCACGCAGGGACGCCTGCCTTGGTCCTTCACCCTTAGCGGCAAGTCCCGCTTTTCTGGGGGAGGGGCAAGTACCCCAACCCCTTCTCTCCGTGTCTCTACCCCTTCTCCGCTTTTCTGGGGNAGGGGCAAGNACCCCTCAACCCCTTCTCCTTCACCCTTAGCGGCAAGTCCCGCTTTTCTAGGGGGGCAAGAACCCCCAACCCCTTATTTCCGCGCCCCGACCTCTTATCTCTGCGCCCCAATCCCTTATTTCCGCGCCCCGACCTCTTATCTCTGCGCCCCGATCCCTTATTTCCGCGCCCCGACCTCTTATCTCTGCGCCCCAACCCCTTATTTCCGTGCCCCGACCCCTTTCCCGCTTTTCTGGAGGGCAAGAACCCCCGAACCCCTTCCCTCCGTGTCTCTACTCTCTCTTTTCTCTGGGCTTGCCTCCTTCACTATGGGCAACCTTCCACCCTCCATTCCTCCTTCTCCCTTAGCCTGTGTTCTCAAGAACTTAAAACCTCTTCAACTCACACCTGACCTAAAACCTAAACGCCTTATTTTCTTCTGCAATGCCGCTTGACCCCAATACAAACTCGACAGTAGTTCCAAATAGCCAGAAAACGGCACTTTCAATTTTTCCATCCTACAAGATCTAAATAATTCTTGTCGTAAAATGGGCAAACGGTCTGAGGTGCCTGACGTCCAGGCATTCTTTTACACATCGGTCCCTCCCTAGTCTCTGTNCCCAGTGCAACTCGTCCCAAATCTTCCTTCTTTCCCTCCCGCCTGTCCCCTCAGTCCCAACCCCAAGCGTCGCTGAGTCTTTCTAATCTTCCTTTTCTACAGACCCATCTGACCTCTCCCCTCCTCGCCAGGCCGAGCTAGGTCCCAATTCTTCCTCAGCCTCCGCTCCTCCACCCTATAATCCTTTTATCACCTCCCCTCCTCACACCCGGTCCGGCTTACAGTTTCGTTCCGTGACTAGCCCTCCCCCACCTGCCCAGCAATTTACTCTTAAAAAGGTGGCTGGAGCTAAAGGCATAGTCAAGGTTAATGCTCCTTTTTCTTTATCCGNNNNNTCCCAAATCAGATAGCGTTTAGGCTCTTTTTCATCAAATATAAAAANCCCAGCCCAGTTCATGGCTCGTTTGGCAGCAACCCTGAGACGCTTTACAGCCCTAGACCCTAAAAGGTCAAAAGGCCGTCTTATTCTCAATATACATTTTATTACCCAATCTGCTCCCGACATTAAATAAAACTCCAAAAATTAAATTCCGGCCCTCAAACCCCACAACAGGACTTAATTAACCTCGCCTTCAAGGTGTACAATAATAGAAAAAAGTTGCAATTCCTTGCCTCCACTGTGAGACAAACCCCAGCCACATCTCCAGCACACAAGAACTTCCAAACGCCTGAACCGCAGCGGCCAGGCGTTCCTCCAGAACCTCCTCCCCCAGGAGCTTGCTACAAGTGCCGGAAATCTGGCCACCGGGCCAAGGAATGCCCGCAGCCCGGGATTCCTCCTAAGCCGCGTCCCATCTGTGCGGGACCCCACTGGAAATCGGACTGTTCAACTCACCTGGCAGCCACTCCCAGAGCCC
CTGGAACTCTGGCCCAAGGCTCTCTGACTGACTCCTTCCCAGATCTTCTCGGCTTAGCGGCTGAAGACTGACGCTGCCCGATCGCCTCGGAAGCCCCCTGGACCATCACGGACGCTTCGGGTAACTCTCACAGTGGAGGGTAAGTCCGTCCCCTTCTTAATCAATACGGAGGCTACCCACTCCACATTACCTTCTTTTCAAGGGCCTGTTTCCCTTGCCTCCATAACTGTTGTGGGTATTGACGGCCAGGCTTCTAAACCTCTTAAAACTCCCCAACTCTGGTGCCAACTTAGACAATACTCTTTTAAGCACTCCTTTTTAGTTATCCCCACCTGCCCAGTTCCCTTATTAGGCCGAGACACTTTAACTAAATTATCTGCTTCCCTGACTATTCCTGGACTACAGCCACATCTCATTGCCGCCCTTCTCCCCAATCCAAAGCCTCCTTCGCGTCCTCTTGTATCCCCCCACCTTAACCCACAAGTATAAGATACCTCTACTCCCTCCTTGGCGACCGATCATGCACCCCTTACCATCTCATTAAAACCTAATCACCCTTACCCCGCTCAACGCCAATATCCCATCCCACAGCACGCTTTAAAAGGATTAAAGCCTGTTATCACTCGCCTGCTACAGCATGGCCTTTTAAAGCCTATAAACTCTCCTTACAATTCCCCCATTTTACCTGTCCTAAAACCAGACAAGCCTTACAGGTTCAGGATCTGCGCCTTATCAACCAAATTGTTTTGCCTATCCACCCCGTGGTGCCAAACCCATATACTCTCCTATCCTCAATACCTCCCTCCACAACCCATTATTCTGTTCTGGATCTCAAACATGCTTTCTTTACTATTCCTTTGCACCCTTCATCCCAGCCTCTCTTCGCTTTCACTTGGACTGACCCTGACACCCATCAGGCTCAGCAAATTACCTGGGCTGTACTGCCGCAAGGCTTCACAGACAGCCCCCATTACTTCAGTCAAGCCCTTCCTCATGATTTACTTTCTTTCCACCCCTCCGCTTCTCACCTTATTCAATATATTGATGACCTTCTNCTTTGTAGCCCCTCCTTTGAATCTTCTCAACAAGACACNCTNCTGCTCCTTCANCATTTATTCTCCAAAGGATATCCCCCTCCAAAGCCCAAATTTCTTCCTCATCTGTTACCTATCTCGGCATAATTCTTCATAAAAACACACGTGCTCTCCCTGCCGATCGTGTCCGACTGATCTCTCAAACCCCAACCCCTTCTACAAAACAACAACTCCTTTCCTTCCTAGGCATGGTTGGATACTTTCGCCTTTGGATACCTGGTTTTGCCATCCTAACAAAACCATTATATAAACTCACAAAAGGAAACCTAGCTGACCCCATAGATCCTAAATCCTTTCCCCACTCCTCTTTCCGTTCCTTGAAGACAGCTTTAGAGACTGCCCCCACCCTAGCTCTCCCTGACTCATCCCAACCCTTTTCATTACACACAGCCGAAGTGCAGGGCTGTGCAGTCGGAATTCTTACACAAGGACCGGGACCGCGCCCTGTAGCCTTTTTGTCCAAACAACTTGACCTTACTGTTTTAGGCTGGCCATCATGTCTCCGTGCGGCGGCTGCCGCCGCCCTAATACTTTTAGAGGCCCTCAAAATCACAAACTATGCTCAACTCACTCTCTACAGTTCTCATAACTTCCAAAATCTATTTTCTTCCTCACACCTGACGCATATACTTTCTGCTCCCCGGCTCCTTCAGCTGTACTCACTCTGTTGAGTCTCCCACAATTACCATTGTTCCTGGCCCGGACTTCAATCCGGCCTCCCACATTATTCCTGATACCACACCTGACCCCCATGACTGTATCTCTCTGATCCACCTGACATTCACCCCATTTCCCCATATTTCCTTCTTTCCTGTTCCTCACCCTGATCACACTTGGTTTATTGATGGCAGTTCCACCAGGCCTAATCGCCACACACCAGCAAAGGCAGGCTATGCTATGAACTCG
TTGCCTTAACTCGAGCCCTCACTCTTGCAAAGGGACTACGCGTCAATATTTATACTGACTCTAAATATGCCTTCCATATCCTGCACCACCATGCTGTTATATGGGCTGAAAGAGGTTTCCTCACTACGCAAGGGTCCTCCATCATTAATGCCTCTTTAAAAAAACTCTTCTCAAGGCCGCTTTACTTCCAAAGGAAGCTGGAGTCATTCACTGCAAGGGCCATCAAAAGGCATCAGATCCCATCGCTCAGGGCAACGCTTATGCTGATAAGGTAGCTAAAGAAGCAGCTAGCGTTCCAACTTCTGTCCCTCACGGCCAGTTTTTCTCCTTCTCATCGGTCACTCCCACCTACTCCCCCGCTGAAACTTCCACCTATCAATCTCTTCCCACACAAGGCAAATGGTTCTTGGACCAAGGAAAATATCTCCTTCCAGCCTCACAGGCCCATTCTATTCTGTCGTCATTTCATAACCTCTTCCATGTAGGTTACAAGCCGCTAGCCCGCCTCTTAGAACCTCTCATTTCCTTTCCATCGTGGAAATCTATCCTCAAGGAAATCACTTCTCAGTGTTCCATCTGCTATTCTACTACTCCTCAGGGATTGTTCAGGCCCCCTCCCTTCCCTACACATCAAGCTCGGGGATTTGCCCCCGCCCAGGACTGGCAAATTGACTTTACTCACATGCCCCGAGTCAGGAAACTAAAATACCTCTTGGTCTGGGTAGACACTTTCACTGGATGGGTAGAGGCCTTTCCCACAGGGTCTGAGAAGGCCACCGCGGTCATTTCTTCCCTTCTGTCAGACATAATTCCTCGGTTTGGCCTTCCCACCTCTATACAGTCCGATAACGGACCGGCCTTTATTAGTCAAATCAGCCAAGCAGTTTCTCAGGCTCTTGGTATTCAGTGAAACCTTTATATCCCTTACGGTCCTCAGTCTTCAGGAAAGGTAGAACGGACTAATGGTCTTTTAAAAACACACCTCACCAAGCTCAGCCACCAACTTAAAAAGGACTGGACAATACTTTTACCACTTTCCCTTCTCAGAATTCGGGCCTGTCCTCGGAATGCTACAGGGTACAGCCCATTTGAGCTCCTGTATGGACGCTCCTTTTTATTAGGCCCCAGTCTCATTCCAGACACCAGACCTCTAGGCGACTATCTTCCAGTCCTCCAGCAGGCTAGACAGGAAATTCGCCAGGCTGCTAATCTTCTCTTGCCTACTCCAGATCCCCAGCCATATGAAGACACCCTAGCTGGACGATCAGTTCTTGTTAAGAATCTGACCCCTCAAACTCTACAACCTCGATGGACCGGACCCTACTTAGTCATCTATAGTACCCCGACTGCCGTCCGCCTGCAGGATCCTCCCCACTGGGTTCACCGTTCCAGAATAAAGCTGTGTCCGTCGGACAGCCAGCCTAATCCCTCCTCTTCCTCCTGGAAGTCGCAAGTACTCTCCCCTACTTCCCTTAAACTCACTCGCATTTCTGAAGAACAGTAATAACCCTTATGAGCCTAATACATCCCTTCATTCTATTAGGTCTGTTCGTCCTTACCCTACTTTTTGCAACAGGGCTTTACGNAGTCACCCCCACCACTTGGACCGAGCCCCAAAAAACTTGTCATCCCTACTATCTTCTGTCTAGTCATACTCCTATTCNCCGTTCTCAACTACTCATAAATGCCCTACTCTTGTTTACACTGCCGGTTTACACTGTTTCTCCAAGCCATCACAGCTGATATCTCCTGGTGCTATCCCCAAACCGCCACTCTTAACTCCCTCTTAGAGTGGATAGATGATCTTTGCTGGCAGGGCACCCTCCAATACTTTCACCCTGATGAAGTTCTATTCTTTACTTTTATACTCACTCTTATTCTCATTCCCATTCTTATGCCACCCTCTACCTCTCCCCAGCTATCTCCACCACACTATCAACCTTACCCATTCTCTCCTAGCCGTTTCTAATCCCTCCTTAGCGAACAACTGCTGGCTTTGC
ATTTCCCTTTCTTCCAGCGCCTACACAGCTGTCCCCGCCTTACANACAGACTGGGCAACATCTCCTGTCTCCCTACACCTCCGAACTTCCTTTAACAGCCCTCACCTTTACCCTCCTGAAGAACTCATTTACTTTCTAGACAGGTCCAGCAAGACCTCCCCAGACATTTCACATCAGCAAGCTGCCGCCCTCCTCCGCACTTACTTAAAAAACCTTTCTCCTTATATCAACTCTACTCCCCCCATATTTGGACCTCTCACAACACAAACTACTATTCCTGTGGCCGCTCCTTTATGTATCTCTCGGCAAAGACCCACTGGAATTCCCCTAGGTAACCTTTCACCTTCTCGATGTTCCTTTACTCTTCATCTCCGAAGCCCAACTACACACATCACTGAAACAATTGGAGCCTTCCAGCTCCATATTACAGACAAGCCCTCTATCAATACTGGCAAACTTAAAAACATTAGCAGTAATTATTGCTTAGGAAGACACTTACCCTGTATTTCACTCCATCCTTGGCTACCTTCCCCTTGCTCGTCAGACTCTCCTCCCAGGCCCTCTTCTTGTTTACTTATACCCAGCCCCGAAAATAACAGTGAAAGGTTGCTCGTAGATACTCAACGTTTTCTCATACACCATGAAAATCGAACCTCCCCCTCTACGCAGTTACCCCATCAGTCCCCATTACAACCTCTGACGGCTGCCGCCCTAGCTGGATCCCTAGGAGTCTGGGTACAAGACACCCCTTTCAGCACTCCTTCTCATCTTTTTACTTTGCATCTCCAGTTTTGCCTCGCACAAGGTCTCTTCTTCCTCTGTGGATCCTCTACCTACATGTGTCTACCTGCTAATTGGACAGGCACATGCACACTAGTTTTCCTTACCCCCAAAATTCAATTTGCAAATGGGACCGAAGAGCTCCCTGTTCCCCTCATGACACCGACACGACAAAAAAGAGTTATTCCACTAATTCCCTTGCTNGTCGGTTTAGGACTTTCTGCCTCCACTATTGCTCTCGGTACTGGAATAGCAGGCATTTCAACCTCTGTCACGACCTTCCGTAGCCTCTCTAATGACTTCTCTGCTAGCATCACAGACATATCACAAACTTTATCAGTCCTCCAGGCCCAAGTTGACTCTTTAGCTGCAGTTGTCCTCCAAAACCGCCGAGGCCTCGACTTACTCACTGCTGAAAAAGGAGGACTCTGTATATTCTTAAATGAAGAGTGTTGTTTTTACCTAAATCAATCTGGCCTGGTGTATGACAACATAAAAAAACTCAAGGATAGAGCCCAAAAACTCGCCAACCAAGCAAGTAATTACGCTGAACCCCCTTGGGCACTCTCTAATTGGATGTCCTGGGTCCTCCCAATTCTTAGTCCTTTAATACCCGTTTTTCTCCTTCTCTTATTCGGACCTTGTGTCTTCCGTTTAGTTTCTCAATTCATNCAAAACCGTATCCAGGCCATCACCAATCATTCTATACGACAAATGCTCCTTCTAACAACCCCACAATATCACCCCTTACCACAAAATCTTCCTTCAGCTTAATCTCTCCCACTCTAGGTTCCCACGCCGCCCCTAATCCCGCTCGAAGCAGCCCTGAGAAACATCGCCCATTATCTCTCNNCATACCACCCCCCAAAAATTTTCGCCGCCCCAACACTTCANCACTATTTTATTTTTCTTATTAATATAAGAAGACAGGAA
|
TF motifs of the consensus sequence
FIMO was used to detect transcription factor motifs in the consensus sequence of the TE family.
TE_family | TFBS | Start | End | Strand | Score | Matched sequence |
---|---|---|---|---|---|---|
HERVH | ZNF281 | 470 | 479 | + | 20.07 | GGGGGAGGGG |
HERVH | ZNF93 | 3605 | 3618 | - | 19.17 | GGCGGCGGCAGCCG |
HERVH | EREB29 | 3610 | 3619 | + | 19.06 | GCCGCCGCCC |
HERVH | ZNF148 | 470 | 479 | - | 18.97 | CCCCTCCCCC |
HERVH | Zm00001d049364 | 3608 | 3618 | + | 18.57 | CTGCCGCCGCC |
HERVH | Wt1 | 1386 | 1395 | + | 18.54 | CCTCCCCCAC |
HERVH | PATZ1 | 470 | 480 | + | 18.42 | GGGGGAGGGGC |
HERVH | NR2C2 | 1253 | 1266 | - | 18.39 | GAGGGGAGAGGTCA |
HERVH | ZNF75A | 4749 | 4760 | + | 17.93 | GCCTTTCCCACA |
HERVH | KLF3 | 3530 | 3539 | + | 17.74 | GACCGCGCCC |
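
As a reading aid for the table above, the following minimal Python sketch parses rows in this pipe-delimited layout and expresses every match on the plus strand. It assumes that minus-strand hits list the reverse complement of the consensus sequence at the given coordinates, which is consistent with the rows shown here (ZNF281 and ZNF148 cover the same positions 470 to 479 on opposite strands). The names `MotifHit`, `parse_row`, and `plus_strand_sequence` are hypothetical helpers, not part of FIMO or of this database.

```python
from typing import NamedTuple


class MotifHit(NamedTuple):
    tf: str
    start: int      # 1-based, inclusive (as in the table)
    end: int        # 1-based, inclusive
    strand: str     # "+" or "-"
    score: float
    matched: str    # sequence as reported for the hit's strand


_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_row(row: str) -> MotifHit:
    """Parse one 'TE_family | TFBS | Start | End | Strand | Score | Matched sequence |' row."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    _family, tf, start, end, strand, score, matched = cells
    return MotifHit(tf, int(start), int(end), strand, float(score), matched)


def plus_strand_sequence(hit: MotifHit) -> str:
    """Assumption: minus-strand matches are reported as reverse complements."""
    return hit.matched if hit.strand == "+" else reverse_complement(hit.matched)


# Two rows copied from the table above.
rows = [
    "HERVH | ZNF281 | 470 | 479 | + | 20.07 | GGGGGAGGGG |",
    "HERVH | ZNF148 | 470 | 479 | - | 18.97 | CCCCTCCCCC |",
]
hits = [parse_row(r) for r in rows]
for hit in hits:
    print(hit.tf, hit.start, hit.end, plus_strand_sequence(hit))
```

Under that assumption both hits resolve to the same plus-strand sequence, so overlapping entries on opposite strands can be recognized as one site in the consensus bound by different factors.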
TFBS enrichment in GRCh38
Fisher's exact test was used to test for enrichment of transcription factor binding sites in copies of the TE family in GRCh38.
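
To illustrate the kind of test named above, here is a minimal Python sketch of a one-sided Fisher's exact test on a 2x2 table. The page does not state how the contingency table is built, so the layout (TE-family copies versus background regions, each with or without a binding site for a given factor) and the counts in the usage example are assumptions for illustration only. The function name `fisher_enrichment_p` is hypothetical.

```python
from math import comb


def fisher_enrichment_p(in_te_with: int, in_te_without: int,
                        out_te_with: int, out_te_without: int) -> float:
    """One-sided (enrichment) Fisher's exact test p-value.

    Assumed 2x2 layout:
                      with TFBS      without TFBS
      TE family       in_te_with     in_te_without
      background      out_te_with    out_te_without

    Returns P(X >= in_te_with) under the hypergeometric null.
    """
    total_with = in_te_with + out_te_with
    total_te = in_te_with + in_te_without
    total = total_te + out_te_with + out_te_without

    denom = comb(total, total_te)
    upper = min(total_with, total_te)
    tail = sum(
        comb(total_with, k) * comb(total - total_with, total_te - k)
        for k in range(in_te_with, upper + 1)
    )
    return tail / denom


# Hypothetical counts, not taken from this database.
p = fisher_enrichment_p(12, 88, 30, 870)
print(f"one-sided p-value: {p:.3g}")
```

The exact hypergeometric sum is used here to keep the sketch free of third-party dependencies. In practice a library routine would usually be combined with multiple-testing correction across transcription factors.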