HERVIP10F
DF ID | DF0000186 |
---|---|
TE superfamily | ERV1 |
TE class | LTR |
Species | Simiiformes |
Length | 7737 |
Kimura value | 13.62 |
Tau index | 0.9514 |
Description | Internal region of ERV1 endogenous retrovirus, HERVIP10F subfamily |
Comment | HERVIP10F is flanked by LTR10F long terminal repeats, which have 5 bp TSDs. HERVIP10F was an active endogenous retrovirus ~30 Myr ago. About 20 copies of the HERVIP10F internal sequence are present in the human genome (this count is not the same as a simple tally of fragmented hits). Three ORFs are found in the HERVIP10F consensus. ORF1 encodes gagP71A (~504-2216), a 571aa gag-like protein closely related to the gag proteins encoded by the C-type leukemia retroviruses. ORF2 encodes polIP10F (~2220-5720), an 1167aa polyprotein composed of protease (140aa), reverse transcriptase (517aa), RNase H (140aa) and endonuclease (270aa) domains. ORF3 encodes envIP10F (~5621-7735), a 705aa envelope protein most similar to those from leukemia retroviruses. |
Sequence |
ATTTGGGGGCTCGTCCGGGATTACATTCCCCTCCGGGGGCGGTCTCCGGTCCTCTCTCGTGAGGAGGCGCGCCCCGCCCCCTTGTGGCGGCCTCAGGGGTGAGAAATCGGGACCCACCCGGTGCGAGGAATAACCCGGGCTCTCAGCAACGCGGAAAGAAACTGGCCGGCAACCTAGCTTAAAGGATCCTCACATACCGCGGCGACGACTCTGTGCACAGACCAAGGAAGGAGAAGCCGCGGGAGCCGGTAAAGTATTTCCTTGGTGGTCGGGACCAAGGTAAGAAAGCCGCGGGGGGGCGGTGAAGTACTCCTTGGTCGGGGTGGCTTAGAGGTTAAAAAGAGGCGAGANATCCCCANTGGNGGGGGATTGAGCCTCACACAAACCTCCAGTAGTAGAAAAGGCAAGAAATTTCCAGTGGGGGAAATTGAGCCTCACCCCAAAAGGCGAGAAATTTCCAGTAAGGGAAATTGAGCCTCGAACCTTACCCCAAAACCATCAAGATGGGAAATACCCCAAGCAAGACAGGGAGCAAGGGGGATAAAGATGGTAACAAAGATATCCCCCCGGATAGCCCCCTAGGTCTCATGCTAAAACACTGGAAGGATAATGAAAGGACTAAACATAGGAAAAAGCAACAAATGATAAAATATTGCTGTTTTATTTGGACTCAGGGACCCATCCTCAAACCCTCAATCTTCTGGCCAAAGTTTGGGTCGAATGAGGATGTAATGTGTCAGCTTCTAATCCGATATGTTAATGATAAAAGTCCAGTGTCTCAAGAAGAACTAGGCTATGCCCTTTGTTGGAGGCAAGGACCTGCCCTCCTTTTTCCCTTAAAAACAAATAGGGAAGAACCCAATCTGGCACCTCAAAATGAAAAGTCAGAGGAGCCAGCTCTCATGCCTAAAGACTCCAGCGCATGGGATCCCCTAGACCATCTTCCCCCGCTCAGTGTCCCCAATCTTTCCCCTCAGACAGCCGCTGCCGCCTCAGATCCCGTTCCAAATCCCCCCTCTACTCACGTTATCCCTCCTCCTTATAACCCTGACTCTTGGGAATTACCGTCCCACCAGCCTGTTCCCTCCCAACCTAAATACCCCTCTCTAAAAGGACTCCAGCGTGAGGTAGAACAATGTAAAAAAGATATTCAGAATTTCCCATTTCCCTCCGTACCTAAGGGGTCAGCCCCGACCCTCTTCCCTTTGAAAGAGGTACCACAAGGAGGGGGGGCGNNCGNCATTGGCTTTGTAAATGCTCCCTTAACCAGTTCAGAAGTCCGGAATTTTAAAAAGGAGCTTAAGCCGCTACTAGATGACCCTTACGGAGTGGCAGACCAAATTGACCAATTCTTAGGACCTCAGTTATACACTTGGGTCGAGTTAATGTCCATCTTGGGCATCCTCTTTTCAGGGGAAGAAAGGAGTATGATTCGTAGGGCTGCTATGGTAGTTTGGGAACGTGAGCACCCTCCCGGTCAAAACGTTCCTACCGCGGACCAGAAATTCCCCGCCCGAGACCCCCGGTGGGACAATAACAACGCAGNTCACCGGGAAAATATGCAGGACCTAAGGGAGATGATAATAAAAGGAATTCGGGAATCAGTACCCCGAACCCAAAATCTTTCTAAAGCATTTGATATACAACAGGAAAAGGATGAAGGGCCTATGAGATTCCTAGACAGACTGAGGGAGCAAATGAGGCAATATGCAGGCCTCGATTTGGANGATCCCCTTGGGCAAGGAATGTTAAAACTCCANTTTGTCACTAAAAGTTGGCCAGACATTTCAAAAAAGTTACAAAAGATAGANAATTGGGAAGACCGNCCCCTAAGTGAGCTTCTCAGGGAAGCTCAGAAAGTATACGTGAGAAGGGACGAAGAAAAACAGAAACAAAAGACAAAACTTATGTTNTCCACCTTCCAACAGATGGCTCCAAACCCANGTACTTCTAAACAGAGCTTCCAGGGGGCCAGAAACTATAAAGGGTCCGAACCCTCC
TTTAAAGGACCCNAGCCTCCATCTGGAGGACCAAGGCCCTCGTCTACCAGGCCCCCTAAAGAGTATGGGGGAGCAAGGTCAAAGAATCCCAGAACTGAGAGGGAGGAAGGACAAGATAGGTGCTACAGATGTGGAAGAACAGGCCACTTCAAGAGAGAATGTCCCGAACTAAGAAAGGAGAAAGAAGCCCTTCCACTCATGACTTTCGAGGAAGAATAGGGGGGTCAGGGGCTCTGTCTCTTTTATCTTGAGTCCCACCAGGAGCCCTTGATAAATTTGGAGGTGGGACCTAAACATGAGCTTATCACCTTTTTAGTCGATTCAGGGGCTGCTCGCTCCTCTGTTTGTTTCCCCCCATCTAATGTTGCCTNCTCCTCAGAGGAACTTTTAGTCTCCGGGGTAAAAGGGGAAGGATTTAAAGCAAAAATTTTAGAAAGCACAGAAGTTAGATACCAGGATCGATCAGCTCATATTCAGTTTTTGTTAATCCCTGAAGCAGGAACTAATTTACTAGGGAGGGATTTAATGTTAAAGTTAGGCATAGGCCTACAAGTCAGCCCAAGAGGATTCCTCACCTCATTAAACCTACTCACCACTGCAGATGAAAAATATATTAATCCTAATGTCTGGTCCAAAGAAGGAAACCGAGGGAAACTCCGAGTCCCTCCAATCCACATCAAGCTAAAAACCCCCGGGGAAGTAGTAAGGAGGAAGCAATACCCCATTCCCCTAGAGGGCAGGATAGGGTTGAAACCTATAATTGAAGGTCTTATTAAAGATGGGCTTCTCGAGCCCTGTATGTCCCCTTATAACACCCCAATACTGCCAGTCAAGAAATCAGACGGGTCATACCGGCTGGTACAGGACCTTAGAGCTATCAACCAAATAGTCCAGACTACCCACCCCGTTGTCCCCAATCCTTACACCATTCTCAGCAAGATTCCATATAATCATCAATGGTTTACTGTAATAGATTTGAAGGATGCTTTTTGGGCATGTCCCCTGGCTGAAGATAGCCGAGATATATTTGCTTTTGAGTGGGAGGATCCCCACTCAGGGCGGAAACAACAATATCGATGGACAGTCTTGCCCCAAGGGTTCACAGACTCCCCTAACCTTTTCGGTCAAATTTTAGAACAAGTATTAGAAAAAGTTGTCGTCCCAGAACAAATATGCCTGCTCCAGTACGTGGATGATATTCTTATATCTGGTGAAGATATAGAGAAGGTAGCTGGCTTCTCTACACATATTCTTAACCATCTGCAGTTCGAGGGGCTACGAGTCTCAAAAGGAAAGCTTCAGTATGTAGAGCCTGAAGTTAAATATTTAGGCCACTTAATAAGTGCAGGCAAGCGAAGAATAGGGCCTGAACGAGTTGAGGGAATCGTGTCCCTACCCTTGCCTCAAACTAAACAAGAACTCAGGAAATTTTTAGGGTTAGTCGGATACTGCCGCTTATGGATTGACTCATATGCACTAAACAGTAAACTNTTATATCAAAAACTTGCCCAGGAGAAGCCTGACCGTCTCCTGTGGACTTCTGAGGAAGTCGATCAGGTCGAGGAGCTGAAAGAAAGGCTCATAACTGCCCCTGTCTTAGCCTTACCCTCCCTAGAAAAGCCATTCCACCTTTTTGTCAATGTGGACAATGGGGTAGCTTTAGGAGTGCTNACTCAAGAACACGGAGGCCGCCGGCAGCCCGTGGCCTTCCTATCAAAAGTCTTAGACCCAGTNACCCGTGGATGGCCTCAATGCATCCAATCCGTCGCGGCTACGGCANTACTAGTCGAAGAAAGCAGAAAGTTAACCTTTGGAGGAAAATTGACNGTAAGCACGCCCCACCAAGTTAGAACTATCTTAAACCAGAAAGCAGGGAGGTGGCTTACTGACTCAAGAATCTTAAAGTATGAGGCTATTCTGTTAGAAAAAGATGATTTAACATTAACCACTGATAATTCGCTTAACCCAGCAGGTTTCCTAACAGGGGATCCAAATCTAAA
GAGAGAGCACACATGTTTAGATTTAATTGATTACCATACAAAGGTCCGACCAGACCTAGGAGAAACTCCCTTCAGGACGGGACGACACTTATTTATAGATGGTTCCTCCCGGGTGATTGAGGGAAAAAGACACAATGGGTATTCAGTAATTGATGGAGAAACTCTCGNAGAAATAGAGTCAGGAAAATTGCCTAATAATTGGTCTGCCCAAACGTGTGAGCTGTTTGCACTCAGCCAAGCCTTAAAGTACTTACAGAACCAGGAAGGAACCATCTATACCGATTCTAAGTACGCCTTTGGAGTGGCTCATACATTTGGAAAAATTTGGACTGAACGAGGTCTCATTAATAGTAAAGGTCAAGACCTTGTTCACAAGGAGCTAATCACCCAAGTATTGAATAACCTTCAGTTGCCAGAAGAAATAGCTATTGTCCATGTCCCCGGACACCAGAAAAGCCTTTCTTTTGAAAGTCGAGGAAATAACCTAGCAGATCAGATAGCCAAACAGGCTGCCGTTTCTTCTGAAACGCCTATTTTTCACTTAACTCCTTACCTTCCTCCTCCTACCGTAATCCCCATTTTCTCTTCCACTGAAAAAGAGAAACTAATAAAAATAGGTGCTAAAGAGAATTCAGAAGGAAAATGGATATTGCCAGACCAGAGAGAAATGTTATCCAAACCCCTTATGAGGGAAATCTTGTCCCAACTGCATCAAGGGACCCACTGGGGGCCCCAAGCCATGTGTGACGCAGTTCTCAGAGTTTATGGGTGTATAGGAATTTATACCCTGGCCAAACAGGTTACAGACAGTTGCTTAGTATGTAAGAAAACTAATAAACAAACTATAAAAAGATTACCCCTTGGGGGAAGGAGTCCAGGCTTAAGGCCATTCCAAAGTATCCAGATTGATTACACAGAGATGCCTCCAATAGGTCGTCTAAAATATTTACTAGTGATAGTAGATCACCTCACTCACTGGGTCGAAGCTATTCCCTTTTCAAATGCGACGGCCAATAATGTAGTTAAGGCNTTAATTGAAAATATAGTACCCAGGTTTGGACTAATAGAAAACATTGACTCAGACAATGGAACCCATTTCACCGCACACGTCATTAAAAAGCTANCCCAAGCACTAGACATNAGATGGGAATACCATACTCCCTGGCACCCACCTTCATCAGGGAGAGTAGAAAGAATGAACCAGACTCTAAAGAACCACTTAACCAAATTAGTCTTAGAGACTCGGTTGCCATGGACCAAATGCCTTCCTATTGCCCTGTTGAGAATCCGAACTGCCCCNCGGAAAGATATTGGCCTNTCCCCTTATGAGATGCTCTATGGATTGCCTTATTTACACTCCACTGCTGACATTCCTACNTTTGAAACAAAAGATCAGTTTCTCAGAAATTATATACTTGGTCTATCTTCCACTTTCTCTTCCCTCAGAACTAAAGGTCTTTTAGCACAGGCGCCACCCCTGGAGTTCCCAGTACACCAACATCAGCCTGGGGATCACGTCCTCATCAAAAGCTGGAAAGAGGGAAAGCTCGAGCCGGCCTGGGAAGGACCTTACCTAGTGCTCCTAACTACCGAAACCGCAGTCCGGACAGCAGAAAGAGGATGGACCCATCACACCCGAGTCAAGAAAGCGCCGCCCCCTCCAGAGTCATGGGCCATNGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAGTTTAACTCTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATCGCTGACCACCTNGTTATTAACGTAACCAAGTCAATTTCGCCTCAAACTATTACATTTGATGCTTGCCTTGTTATACCCTGTGGAGACTTGCCAAGTCAAAGGCAGCTCTCCACTTCAGAAAAGTACCTCTGTCCTTCCTGGCTCTCCTCAGACTGGGCATTAGTGAATTGGGATCATTTAGTCTGGGAAGATTTCGATGAAGACCCCAGTGTCAACCGGGAGTCTTGCCC
CCCCGACGCAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACCAAAGAGCAAGGATGGACTGCCCCAACCAGTANTTGTAATTTCCTAAAACCATACATTCATTTTACTAAAGGAACAGCCCCCCCCAACTGTCAGCTAAACCAGTGCAATCCAATACAGGTTATTATCTCGANCCCCCAAAGTTCTTCCCCTTCTCTAAGCCGGTTCCCTTCTTTAAGCCGNTTCTATGGTATGGGGGCTGAGGTTTCAGGAACGGACCCTATNGGATCCTTTGAAATGCGCTTCATTGCTCCCCCACCGCCTGCACCTCCCTCTAAGCCTTCTTCCAAAACCTCTCACAACGAAACCGTCGTTCCTCCTCCACCCAATGACAAGACCAAGGTAGCTATTGTAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTGAGACAGGATATCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCGTCCGCACGTTAAACAAAAGCAATTGTTACGCTTGTGCGCACGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAGGGTGGTCCTCCAGTCGACCGGGCATGGGCTGTATGGTAGCTCTTTTCCAGGATTCCACAGCCTGGGGTAACGAGTCGTGCCAAGCTCTCTCTCTGCTATATCCCGAAGTCCGACACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCGACGCTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTAAAGGGATGCAGTGAGCTTAAGACTTTCCAAGAGCTTACCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGACACTCTGCCAAGTAACTGGAGCGGCACTTGTGCTCTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGAAAGGAAAAATACGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTCTCACGTTTATTTAGACGCAATTGGAGTCCCACGGGGAGTACCAGATNAATTTAAAGCCCGAGATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAATTAATAAAAATGTAGATTGGATAAATTACATCTATTACAACCAACAGCGGTTTATTAACTACACTAGAGATGCTGTTAAAGGAATAGCTGAGCAATTAGGGCCTACTAGCCAGATGGCTTGGGAAAATAGAATAGCCCTAGACATGATATTAGCNGAAAAAGGNGGAGTTTGCGTCATGATTAAAACTCAATGTTGTACCTTCATCCCAAACAACACCGCCCCCGATGGAAGTATAACAAAGGCNTTGCAAGGNCTNACCGCTCTATCCAATGAGTTAGCCAAAAACTCNGGGGTAAATGACCCCTTTACAGGATGGCTAGAAAAGTGGTTCGGTAAATGGAAAGGAATCATAGCCTCAATTCTTACTTCCCTCGCAGCCGTAATAGGTGTACTCATTCTTGTCGGGTGCTGTGTCATACCATGCATCCGTGGGCTGGTGCAAAGGCTCATAGAAACGGCACTTACTAAAACCTCCCTTAGCTNTCCTCCACCTTATTCAGANAAGCTTCTTCTTTTAGAGGATCAAGCAGAACAACNAAGCCAAGACATGTTAAAAAAGTTTGAAGAGAAAGCTGTAAGAAAANTGCAAGAGGAGGAAAT
|
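The ORF coordinates quoted in the Comment field can be cross-checked against the stated protein lengths. The sketch below is only an illustration: it assumes the coordinates are 1-based and inclusive, and the record itself marks them as approximate, so it makes no claim about whether a stop codon is included. The function name `codons_in_span` is hypothetical.

```python
# Nominal ORF spans and protein lengths copied from the Comment field.
# Assumption: coordinates are 1-based and inclusive.
ORFS = {
    "gagP71A": (504, 2216, 571),
    "polIP10F": (2220, 5720, 1167),
    "envIP10F": (5621, 7735, 705),
}


def codons_in_span(start, end):
    """Return the number of whole codons in a 1-based inclusive span."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"span {start}-{end} is not a multiple of 3")
    return length // 3


for name, (start, end, stated_aa) in ORFS.items():
    # Print the computed codon count next to the stated protein length.
    print(name, codons_in_span(start, end), stated_aa)
```

Under these assumptions each span divides evenly into codons and matches the stated length. The pol and env spans also overlap by 100 bp (5621 to 5720).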
TF motifs of the consensus sequence
Use FIMO to detect transcription factor motifs in the consensus sequence of the TE family.
TE_family | TFBS | Start | End | Strand | Score | Matched sequence |
---|---|---|---|---|---|---|
HERVIP10F | Pax7 | 4024 | 4033 | - | 16.63 | TAATCAATTA |
HERVIP10F | ZIC5 | 286 | 300 | - | 16.62 | GCCCCCCCGCGGCTT |
HERVIP10F | ZFP14 | 5903 | 5917 | - | 16.52 | GGAGAGCCAGGAAGG |
HERVIP10F | lsl-1 | 4564 | 4574 | + | 16.51 | CTACCGTAATC |
HERVIP10F | NAC094 | 3764 | 3779 | - | 16.49 | TGCCGTAGCCGCGACG |
HERVIP10F | PRDM9 | 527 | 546 | + | 16.39 | AGGGAGCAAGGGGGATAAAG |
HERVIP10F | ZBTB7B | 262 | 271 | - | 16.33 | CGACCACCAA |
HERVIP10F | SP9 | 72 | 81 | + | 16.33 | CCCCGCCCCC |
HERVIP10F | KLF15 | 1508 | 1515 | + | 16.27 | CCCCGCCC |
HERVIP10F | KLF15 | 72 | 79 | + | 16.27 | CCCCGCCC |
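The Start and End columns can be checked directly against the consensus. The following is a minimal sketch, assuming that coordinates are 1-based and inclusive and that minus-strand matches are reported as the reverse complement of the consensus; both assumptions appear consistent with the rows above but are not stated by the page. Only the first 90 bases of the consensus are used, and `extract_site` is a hypothetical helper name.

```python
# First 90 bases of the HERVIP10F consensus, in blocks of ten for readability.
CONSENSUS_PREFIX = (
    "ATTTGGGGGC" "TCGTCCGGGA" "TTACATTCCC"
    "CTCCGGGGGC" "GGTCTCCGGT" "CCTCTCTCGT"
    "GAGGAGGCGC" "GCCCCGCCCC" "CTTGTGGCGG"
)

# Complement table; N is kept as N because the consensus contains ambiguous bases.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_site(seq, start, end, strand):
    """Return the motif match for 1-based inclusive coordinates on a strand."""
    site = seq[start - 1:end]
    return reverse_complement(site) if strand == "-" else site


# Plus-strand hits from the table that fall inside the 90-base prefix.
print(extract_site(CONSENSUS_PREFIX, 72, 81, "+"))  # SP9 row
print(extract_site(CONSENSUS_PREFIX, 72, 79, "+"))  # KLF15 row
```

Both extracted strings agree with the Matched sequence column for the SP9 and KLF15 rows.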
TFBS enrichment in GRCh38
Use Fisher's exact test to test for enrichment of transcription factor binding sites within copies of this TE family in GRCh38.
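As an illustration of the test named above, here is a minimal sketch of a one-sided Fisher's exact test on a 2x2 table, written with the standard library only. The page does not say how its contingency table is built, so the layout (family copies with or without a binding site versus background elements with or without one) is an assumption, and the counts in the example are hypothetical placeholders, not results from the database.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided (enrichment) Fisher's exact test p-value for the table

        [[a, b],    a = family copies with a binding site
         [c, d]]    b = family copies without one
                    c = background elements with a binding site
                    d = background elements without one

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b
    row2 = c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)
    k_max = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    numerator = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, k_max + 1)
    )
    return numerator / denominator


# Hypothetical counts, for illustration only.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(round(p_value, 4))
```

With real data, one such test would typically be run per transcription factor, followed by a multiple-testing correction; whether the database applies one is not stated on this page.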