HERVIP10F
DF ID | DF0000186 |
---|---|
TE superfamily | ERV1 |
TE class | LTR |
Species | Simiiformes |
Length | 7737 |
Kimura value | 13.62 |
Tau index | 0.9514 |
Description | Internal region of ERV1 endogenous retrovirus, HERVIP10F subfamily |
Comment | HERVIP10F is flanked by LTR10F long terminal repeats; these have 5 bp TSDs. HERVIP10F was an active endogenous retrovirus ~30 Myr ago. There are about 20 copies of the HERVIP10F internal sequence present in the human genome (not the same as a simple tally of fragmented hits). Three ORFs are found in the HERVIP10F consensus. ORF1 encodes gagP71A (~504-2216), a 571aa gag-like protein, closely related to gag proteins encoded by the C-type leukemia retroviruses. ORF2 encodes polIP10F (~2220-5720), an 1167aa polyprotein composed of protease (140aa), reverse transcriptase (517aa), RNase H (140aa) and endonuclease (270aa) domains. ORF3 encodes envIP10F (~5621-7735), a 705aa envelope protein most similar to those from leukemia retroviruses. |
Sequence |
ATTTGGGGGCTCGTCCGGGATTACATTCCCCTCCGGGGGCGGTCTCCGGTCCTCTCTCGTGAGGAGGCGCGCCCCGCCCCCTTGTGGCGGCCTCAGGGGTGAGAAATCGGGACCCACCCGGTGCGAGGAATAACCCGGGCTCTCAGCAACGCGGAAAGAAACTGGCCGGCAACCTAGCTTAAAGGATCCTCACATACCGCGGCGACGACTCTGTGCACAGACCAAGGAAGGAGAAGCCGCGGGAGCCGGTAAAGTATTTCCTTGGTGGTCGGGACCAAGGTAAGAAAGCCGCGGGGGGGCGGTGAAGTACTCCTTGGTCGGGGTGGCTTAGAGGTTAAAAAGAGGCGAGANATCCCCANTGGNGGGGGATTGAGCCTCACACAAACCTCCAGTAGTAGAAAAGGCAAGAAATTTCCAGTGGGGGAAATTGAGCCTCACCCCAAAAGGCGAGAAATTTCCAGTAAGGGAAATTGAGCCTCGAACCTTACCCCAAAACCATCAAGATGGGAAATACCCCAAGCAAGACAGGGAGCAAGGGGGATAAAGATGGTAACAAAGATATCCCCCCGGATAGCCCCCTAGGTCTCATGCTAAAACACTGGAAGGATAATGAAAGGACTAAACATAGGAAAAAGCAACAAATGATAAAATATTGCTGTTTTATTTGGACTCAGGGACCCATCCTCAAACCCTCAATCTTCTGGCCAAAGTTTGGGTCGAATGAGGATGTAATGTGTCAGCTTCTAATCCGATATGTTAATGATAAAAGTCCAGTGTCTCAAGAAGAACTAGGCTATGCCCTTTGTTGGAGGCAAGGACCTGCCCTCCTTTTTCCCTTAAAAACAAATAGGGAAGAACCCAATCTGGCACCTCAAAATGAAAAGTCAGAGGAGCCAGCTCTCATGCCTAAAGACTCCAGCGCATGGGATCCCCTAGACCATCTTCCCCCGCTCAGTGTCCCCAATCTTTCCCCTCAGACAGCCGCTGCCGCCTCAGATCCCGTTCCAAATCCCCCCTCTACTCACGTTATCCCTCCTCCTTATAACCCTGACTCTTGGGAATTACCGTCCCACCAGCCTGTTCCCTCCCAACCTAAATACCCCTCTCTAAAAGGACTCCAGCGTGAGGTAGAACAATGTAAAAAAGATATTCAGAATTTCCCATTTCCCTCCGTACCTAAGGGGTCAGCCCCGACCCTCTTCCCTTTGAAAGAGGTACCACAAGGAGGGGGGGCGNNCGNCATTGGCTTTGTAAATGCTCCCTTAACCAGTTCAGAAGTCCGGAATTTTAAAAAGGAGCTTAAGCCGCTACTAGATGACCCTTACGGAGTGGCAGACCAAATTGACCAATTCTTAGGACCTCAGTTATACACTTGGGTCGAGTTAATGTCCATCTTGGGCATCCTCTTTTCAGGGGAAGAAAGGAGTATGATTCGTAGGGCTGCTATGGTAGTTTGGGAACGTGAGCACCCTCCCGGTCAAAACGTTCCTACCGCGGACCAGAAATTCCCCGCCCGAGACCCCCGGTGGGACAATAACAACGCAGNTCACCGGGAAAATATGCAGGACCTAAGGGAGATGATAATAAAAGGAATTCGGGAATCAGTACCCCGAACCCAAAATCTTTCTAAAGCATTTGATATACAACAGGAAAAGGATGAAGGGCCTATGAGATTCCTAGACAGACTGAGGGAGCAAATGAGGCAATATGCAGGCCTCGATTTGGANGATCCCCTTGGGCAAGGAATGTTAAAACTCCANTTTGTCACTAAAAGTTGGCCAGACATTTCAAAAAAGTTACAAAAGATAGANAATTGGGAAGACCGNCCCCTAAGTGAGCTTCTCAGGGAAGCTCAGAAAGTATACGTGAGAAGGGACGAAGAAAAACAGAAACAAAAGACAAAACTTATGTTNTCCACCTTCCAACAGATGGCTCCAAACCCANGTACTTCTAAACAGAGCTTCCAGGGGGCCAGAAACTATAAAGGGTCCGAACCCTCC
TTTAAAGGACCCNAGCCTCCATCTGGAGGACCAAGGCCCTCGTCTACCAGGCCCCCTAAAGAGTATGGGGGAGCAAGGTCAAAGAATCCCAGAACTGAGAGGGAGGAAGGACAAGATAGGTGCTACAGATGTGGAAGAACAGGCCACTTCAAGAGAGAATGTCCCGAACTAAGAAAGGAGAAAGAAGCCCTTCCACTCATGACTTTCGAGGAAGAATAGGGGGGTCAGGGGCTCTGTCTCTTTTATCTTGAGTCCCACCAGGAGCCCTTGATAAATTTGGAGGTGGGACCTAAACATGAGCTTATCACCTTTTTAGTCGATTCAGGGGCTGCTCGCTCCTCTGTTTGTTTCCCCCCATCTAATGTTGCCTNCTCCTCAGAGGAACTTTTAGTCTCCGGGGTAAAAGGGGAAGGATTTAAAGCAAAAATTTTAGAAAGCACAGAAGTTAGATACCAGGATCGATCAGCTCATATTCAGTTTTTGTTAATCCCTGAAGCAGGAACTAATTTACTAGGGAGGGATTTAATGTTAAAGTTAGGCATAGGCCTACAAGTCAGCCCAAGAGGATTCCTCACCTCATTAAACCTACTCACCACTGCAGATGAAAAATATATTAATCCTAATGTCTGGTCCAAAGAAGGAAACCGAGGGAAACTCCGAGTCCCTCCAATCCACATCAAGCTAAAAACCCCCGGGGAAGTAGTAAGGAGGAAGCAATACCCCATTCCCCTAGAGGGCAGGATAGGGTTGAAACCTATAATTGAAGGTCTTATTAAAGATGGGCTTCTCGAGCCCTGTATGTCCCCTTATAACACCCCAATACTGCCAGTCAAGAAATCAGACGGGTCATACCGGCTGGTACAGGACCTTAGAGCTATCAACCAAATAGTCCAGACTACCCACCCCGTTGTCCCCAATCCTTACACCATTCTCAGCAAGATTCCATATAATCATCAATGGTTTACTGTAATAGATTTGAAGGATGCTTTTTGGGCATGTCCCCTGGCTGAAGATAGCCGAGATATATTTGCTTTTGAGTGGGAGGATCCCCACTCAGGGCGGAAACAACAATATCGATGGACAGTCTTGCCCCAAGGGTTCACAGACTCCCCTAACCTTTTCGGTCAAATTTTAGAACAAGTATTAGAAAAAGTTGTCGTCCCAGAACAAATATGCCTGCTCCAGTACGTGGATGATATTCTTATATCTGGTGAAGATATAGAGAAGGTAGCTGGCTTCTCTACACATATTCTTAACCATCTGCAGTTCGAGGGGCTACGAGTCTCAAAAGGAAAGCTTCAGTATGTAGAGCCTGAAGTTAAATATTTAGGCCACTTAATAAGTGCAGGCAAGCGAAGAATAGGGCCTGAACGAGTTGAGGGAATCGTGTCCCTACCCTTGCCTCAAACTAAACAAGAACTCAGGAAATTTTTAGGGTTAGTCGGATACTGCCGCTTATGGATTGACTCATATGCACTAAACAGTAAACTNTTATATCAAAAACTTGCCCAGGAGAAGCCTGACCGTCTCCTGTGGACTTCTGAGGAAGTCGATCAGGTCGAGGAGCTGAAAGAAAGGCTCATAACTGCCCCTGTCTTAGCCTTACCCTCCCTAGAAAAGCCATTCCACCTTTTTGTCAATGTGGACAATGGGGTAGCTTTAGGAGTGCTNACTCAAGAACACGGAGGCCGCCGGCAGCCCGTGGCCTTCCTATCAAAAGTCTTAGACCCAGTNACCCGTGGATGGCCTCAATGCATCCAATCCGTCGCGGCTACGGCANTACTAGTCGAAGAAAGCAGAAAGTTAACCTTTGGAGGAAAATTGACNGTAAGCACGCCCCACCAAGTTAGAACTATCTTAAACCAGAAAGCAGGGAGGTGGCTTACTGACTCAAGAATCTTAAAGTATGAGGCTATTCTGTTAGAAAAAGATGATTTAACATTAACCACTGATAATTCGCTTAACCCAGCAGGTTTCCTAACAGGGGATCCAAATCTAAA
GAGAGAGCACACATGTTTAGATTTAATTGATTACCATACAAAGGTCCGACCAGACCTAGGAGAAACTCCCTTCAGGACGGGACGACACTTATTTATAGATGGTTCCTCCCGGGTGATTGAGGGAAAAAGACACAATGGGTATTCAGTAATTGATGGAGAAACTCTCGNAGAAATAGAGTCAGGAAAATTGCCTAATAATTGGTCTGCCCAAACGTGTGAGCTGTTTGCACTCAGCCAAGCCTTAAAGTACTTACAGAACCAGGAAGGAACCATCTATACCGATTCTAAGTACGCCTTTGGAGTGGCTCATACATTTGGAAAAATTTGGACTGAACGAGGTCTCATTAATAGTAAAGGTCAAGACCTTGTTCACAAGGAGCTAATCACCCAAGTATTGAATAACCTTCAGTTGCCAGAAGAAATAGCTATTGTCCATGTCCCCGGACACCAGAAAAGCCTTTCTTTTGAAAGTCGAGGAAATAACCTAGCAGATCAGATAGCCAAACAGGCTGCCGTTTCTTCTGAAACGCCTATTTTTCACTTAACTCCTTACCTTCCTCCTCCTACCGTAATCCCCATTTTCTCTTCCACTGAAAAAGAGAAACTAATAAAAATAGGTGCTAAAGAGAATTCAGAAGGAAAATGGATATTGCCAGACCAGAGAGAAATGTTATCCAAACCCCTTATGAGGGAAATCTTGTCCCAACTGCATCAAGGGACCCACTGGGGGCCCCAAGCCATGTGTGACGCAGTTCTCAGAGTTTATGGGTGTATAGGAATTTATACCCTGGCCAAACAGGTTACAGACAGTTGCTTAGTATGTAAGAAAACTAATAAACAAACTATAAAAAGATTACCCCTTGGGGGAAGGAGTCCAGGCTTAAGGCCATTCCAAAGTATCCAGATTGATTACACAGAGATGCCTCCAATAGGTCGTCTAAAATATTTACTAGTGATAGTAGATCACCTCACTCACTGGGTCGAAGCTATTCCCTTTTCAAATGCGACGGCCAATAATGTAGTTAAGGCNTTAATTGAAAATATAGTACCCAGGTTTGGACTAATAGAAAACATTGACTCAGACAATGGAACCCATTTCACCGCACACGTCATTAAAAAGCTANCCCAAGCACTAGACATNAGATGGGAATACCATACTCCCTGGCACCCACCTTCATCAGGGAGAGTAGAAAGAATGAACCAGACTCTAAAGAACCACTTAACCAAATTAGTCTTAGAGACTCGGTTGCCATGGACCAAATGCCTTCCTATTGCCCTGTTGAGAATCCGAACTGCCCCNCGGAAAGATATTGGCCTNTCCCCTTATGAGATGCTCTATGGATTGCCTTATTTACACTCCACTGCTGACATTCCTACNTTTGAAACAAAAGATCAGTTTCTCAGAAATTATATACTTGGTCTATCTTCCACTTTCTCTTCCCTCAGAACTAAAGGTCTTTTAGCACAGGCGCCACCCCTGGAGTTCCCAGTACACCAACATCAGCCTGGGGATCACGTCCTCATCAAAAGCTGGAAAGAGGGAAAGCTCGAGCCGGCCTGGGAAGGACCTTACCTAGTGCTCCTAACTACCGAAACCGCAGTCCGGACAGCAGAAAGAGGATGGACCCATCACACCCGAGTCAAGAAAGCGCCGCCCCCTCCAGAGTCATGGGCCATNGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAGTTTAACTCTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATCGCTGACCACCTNGTTATTAACGTAACCAAGTCAATTTCGCCTCAAACTATTACATTTGATGCTTGCCTTGTTATACCCTGTGGAGACTTGCCAAGTCAAAGGCAGCTCTCCACTTCAGAAAAGTACCTCTGTCCTTCCTGGCTCTCCTCAGACTGGGCATTAGTGAATTGGGATCATTTAGTCTGGGAAGATTTCGATGAAGACCCCAGTGTCAACCGGGAGTCTTGCCC
CCCCGACGCAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACCAAAGAGCAAGGATGGACTGCCCCAACCAGTANTTGTAATTTCCTAAAACCATACATTCATTTTACTAAAGGAACAGCCCCCCCCAACTGTCAGCTAAACCAGTGCAATCCAATACAGGTTATTATCTCGANCCCCCAAAGTTCTTCCCCTTCTCTAAGCCGGTTCCCTTCTTTAAGCCGNTTCTATGGTATGGGGGCTGAGGTTTCAGGAACGGACCCTATNGGATCCTTTGAAATGCGCTTCATTGCTCCCCCACCGCCTGCACCTCCCTCTAAGCCTTCTTCCAAAACCTCTCACAACGAAACCGTCGTTCCTCCTCCACCCAATGACAAGACCAAGGTAGCTATTGTAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTGAGACAGGATATCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCGTCCGCACGTTAAACAAAAGCAATTGTTACGCTTGTGCGCACGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAGGGTGGTCCTCCAGTCGACCGGGCATGGGCTGTATGGTAGCTCTTTTCCAGGATTCCACAGCCTGGGGTAACGAGTCGTGCCAAGCTCTCTCTCTGCTATATCCCGAAGTCCGACACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCGACGCTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTAAAGGGATGCAGTGAGCTTAAGACTTTCCAAGAGCTTACCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGACACTCTGCCAAGTAACTGGAGCGGCACTTGTGCTCTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGAAAGGAAAAATACGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTCTCACGTTTATTTAGACGCAATTGGAGTCCCACGGGGAGTACCAGATNAATTTAAAGCCCGAGATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAATTAATAAAAATGTAGATTGGATAAATTACATCTATTACAACCAACAGCGGTTTATTAACTACACTAGAGATGCTGTTAAAGGAATAGCTGAGCAATTAGGGCCTACTAGCCAGATGGCTTGGGAAAATAGAATAGCCCTAGACATGATATTAGCNGAAAAAGGNGGAGTTTGCGTCATGATTAAAACTCAATGTTGTACCTTCATCCCAAACAACACCGCCCCCGATGGAAGTATAACAAAGGCNTTGCAAGGNCTNACCGCTCTATCCAATGAGTTAGCCAAAAACTCNGGGGTAAATGACCCCTTTACAGGATGGCTAGAAAAGTGGTTCGGTAAATGGAAAGGAATCATAGCCTCAATTCTTACTTCCCTCGCAGCCGTAATAGGTGTACTCATTCTTGTCGGGTGCTGTGTCATACCATGCATCCGTGGGCTGGTGCAAAGGCTCATAGAAACGGCACTTACTAAAACCTCCCTTAGCTNTCCTCCACCTTATTCAGANAAGCTTCTTCTTTTAGAGGATCAAGCAGAACAACNAAGCCAAGACATGTTAAAAAAGTTTGAAGAGAAAGCTGTAAGAAAANTGCAAGAGGAGGAAAT
|
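The ORF coordinates and protein lengths quoted in the Comment field can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive and treats them as exact, although the entry marks them as approximate (~). The names `ORFS`, `orf_length_aa`, `reading_frame` and `orf_summary` are illustrative and not part of the database.

```python
# ORF coordinates (assumed 1-based, inclusive) and stated protein lengths,
# copied from the Comment field of this entry.
ORFS = {
    "gagP71A": (504, 2216, 571),
    "polIP10F": (2220, 5720, 1167),
    "envIP10F": (5621, 7735, 705),
}


def orf_length_aa(start, end):
    """Return the number of codons in a 1-based inclusive nucleotide span."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3


def reading_frame(start):
    """Return the forward reading frame (0, 1 or 2) of a 1-based start coordinate."""
    return (start - 1) % 3


# Computed length, reading frame and stated length for each ORF.
orf_summary = {
    name: {
        "aa": orf_length_aa(start, end),
        "frame": reading_frame(start),
        "stated_aa": stated,
    }
    for name, (start, end, stated) in ORFS.items()
}

for name, info in orf_summary.items():
    print(name, info)
```

Under these assumptions each nucleotide span divides evenly into codons and reproduces the stated protein length. The gag and pol coordinates fall in the same reading frame, while env starts in a different frame and overlaps the last 100 nt of the pol span.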
TF motifs of the consensus sequence
Use FIMO to detect transcription factor motifs in the consensus sequence of the TE family.
TE_family | TFBS | Start | End | Strand | Score | Matched sequence |
---|---|---|---|---|---|---|
HERVIP10F | CTCF | 74 | 88 | - | 21.34 | GCCACAAGGGGGCGG |
HERVIP10F | opa | 289 | 300 | - | 19.96 | GCCCCCCCGCGG |
HERVIP10F | Spps | 71 | 81 | + | 19.34 | GCCCCGCCCCC |
HERVIP10F | PATZ1 | 71 | 81 | - | 19.16 | GGGGGCGGGGC |
HERVIP10F | DREB2G | 6305 | 6318 | - | 19.00 | GGTGCAGGCGGTGG |
HERVIP10F | THI2 | 5333 | 5347 | - | 18.92 | GGCAATCCATAGAGC |
HERVIP10F | sug | 289 | 300 | - | 18.59 | GCCCCCCCGCGG |
HERVIP10F | ZNF281 | 72 | 81 | - | 18.37 | GGGGGCGGGG |
HERVIP10F | KLF16 | 71 | 81 | + | 18.24 | GCCCCGCCCCC |
HERVIP10F | SP3 | 71 | 81 | + | 18.17 | GCCCCGCCCCC |
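The Start, End and Strand columns can be checked against the consensus sequence itself. The sketch below assumes 1-based inclusive coordinates and assumes that a minus-strand hit reports the reverse complement of the consensus slice; both assumptions are consistent with the rows that fall inside the first 90 nt. The helper names are illustrative, and only the six hits within that prefix are tested.

```python
# First 90 nt of the HERVIP10F consensus, copied from the Sequence field.
CONSENSUS_PREFIX = (
    "ATTTGGGGGCTCGTCCGGGATTACATTCCCCTCCGGGGGCGGTCTCCGGT"
    "CCTCTCTCGTGAGGAGGCGCGCCCCGCCCCCTTGTGGCGG"
)

# Hits from the table above that lie entirely within the first 90 nt:
# (TFBS, start, end, strand, matched sequence)
HITS = [
    ("CTCF", 74, 88, "-", "GCCACAAGGGGGCGG"),
    ("Spps", 71, 81, "+", "GCCCCGCCCCC"),
    ("PATZ1", 71, 81, "-", "GGGGGCGGGGC"),
    ("ZNF281", 72, 81, "-", "GGGGGCGGGG"),
    ("KLF16", 71, 81, "+", "GCCCCGCCCCC"),
    ("SP3", 71, 81, "+", "GCCCCGCCCCC"),
]

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_sequence(consensus, start, end, strand):
    """Extract a site using 1-based inclusive coordinates.

    Minus-strand sites are returned as the reverse complement of the slice.
    """
    site = consensus[start - 1:end]
    return reverse_complement(site) if strand == "-" else site


# Names of hits whose reported sequence disagrees with the consensus slice.
mismatches = [
    tf
    for tf, start, end, strand, matched in HITS
    if site_sequence(CONSENSUS_PREFIX, start, end, strand) != matched
]
print(mismatches)
```

An empty `mismatches` list means every tested row agrees with the consensus under the stated coordinate convention. The same check extends to the remaining rows if the full 7737 nt consensus is supplied in place of the prefix.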
TFBS enrichment in GRCh38
Use Fisher's exact test to perform enrichment analysis of transcription factor binding sites in the TE family of GRCh38.
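As a rough illustration of the test named above, the sketch below computes a one-sided (enrichment) Fisher's exact p-value for a 2x2 table using only the Python standard library. The table layout is an assumption, since this entry does not state how the contingency table is built: rows are taken as TE-family copies versus background regions, and columns as regions with versus without a binding site. The counts in the usage line are hypothetical and are not results from this database.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for enrichment.

    Assumed 2x2 layout (hypothetical, not taken from the database):
                      with TFBS   without TFBS
        TE family         a            b
        background        c            d

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b          # total TE-family regions
    col1 = a + c          # total regions with a binding site
    n = a + b + c + d     # all regions
    max_a = min(row1, col1)
    total = comb(n, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, max_a + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(p_value)
```

A small p-value would indicate that binding sites occur in the TE family more often than expected from the margins. A real analysis would typically also correct for testing many transcription factors.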