HERVIP10F
Basic information
DF ID | DF0000186 |
---|---|
TE superfamily | ERV1 |
TE class | LTR |
Species | Simiiformes |
Length | 7737 |
Kimura value | 13.62 |
Tau index | 0.9514 |
Description | Internal region of ERV1 endogenous retrovirus, HERVIP10F subfamily |
Comment | HERVIP10F is flanked by LTR10F long terminal repeats, which have 5 bp target site duplications (TSDs). HERVIP10F was an active endogenous retrovirus ~30 Myr ago. About 20 copies of the HERVIP10F internal sequence are present in the human genome (a count of copies, not a simple tally of fragmented hits). Three ORFs are found in the HERVIP10F consensus. ORF1 encodes gagP71A (~504-2216), a 571aa gag-like protein closely related to the gag proteins encoded by the C-type leukemia retroviruses. ORF2 encodes polIP10F (~2220-5720), an 1167aa polyprotein composed of protease (140aa), reverse transcriptase (517aa), RNase H (140aa) and endonuclease (270aa) domains. ORF3 encodes envIP10F (~5621-7735), a 705aa envelope protein most similar to those of leukemia retroviruses. |
Sequence |
ATTTGGGGGCTCGTCCGGGATTACATTCCCCTCCGGGGGCGGTCTCCGGTCCTCTCTCGTGAGGAGGCGCGCCCCGCCCCCTTGTGGCGGCCTCAGGGGTGAGAAATCGGGACCCACCCGGTGCGAGGAATAACCCGGGCTCTCAGCAACGCGGAAAGAAACTGGCCGGCAACCTAGCTTAAAGGATCCTCACATACCGCGGCGACGACTCTGTGCACAGACCAAGGAAGGAGAAGCCGCGGGAGCCGGTAAAGTATTTCCTTGGTGGTCGGGACCAAGGTAAGAAAGCCGCGGGGGGGCGGTGAAGTACTCCTTGGTCGGGGTGGCTTAGAGGTTAAAAAGAGGCGAGANATCCCCANTGGNGGGGGATTGAGCCTCACACAAACCTCCAGTAGTAGAAAAGGCAAGAAATTTCCAGTGGGGGAAATTGAGCCTCACCCCAAAAGGCGAGAAATTTCCAGTAAGGGAAATTGAGCCTCGAACCTTACCCCAAAACCATCAAGATGGGAAATACCCCAAGCAAGACAGGGAGCAAGGGGGATAAAGATGGTAACAAAGATATCCCCCCGGATAGCCCCCTAGGTCTCATGCTAAAACACTGGAAGGATAATGAAAGGACTAAACATAGGAAAAAGCAACAAATGATAAAATATTGCTGTTTTATTTGGACTCAGGGACCCATCCTCAAACCCTCAATCTTCTGGCCAAAGTTTGGGTCGAATGAGGATGTAATGTGTCAGCTTCTAATCCGATATGTTAATGATAAAAGTCCAGTGTCTCAAGAAGAACTAGGCTATGCCCTTTGTTGGAGGCAAGGACCTGCCCTCCTTTTTCCCTTAAAAACAAATAGGGAAGAACCCAATCTGGCACCTCAAAATGAAAAGTCAGAGGAGCCAGCTCTCATGCCTAAAGACTCCAGCGCATGGGATCCCCTAGACCATCTTCCCCCGCTCAGTGTCCCCAATCTTTCCCCTCAGACAGCCGCTGCCGCCTCAGATCCCGTTCCAAATCCCCCCTCTACTCACGTTATCCCTCCTCCTTATAACCCTGACTCTTGGGAATTACCGTCCCACCAGCCTGTTCCCTCCCAACCTAAATACCCCTCTCTAAAAGGACTCCAGCGTGAGGTAGAACAATGTAAAAAAGATATTCAGAATTTCCCATTTCCCTCCGTACCTAAGGGGTCAGCCCCGACCCTCTTCCCTTTGAAAGAGGTACCACAAGGAGGGGGGGCGNNCGNCATTGGCTTTGTAAATGCTCCCTTAACCAGTTCAGAAGTCCGGAATTTTAAAAAGGAGCTTAAGCCGCTACTAGATGACCCTTACGGAGTGGCAGACCAAATTGACCAATTCTTAGGACCTCAGTTATACACTTGGGTCGAGTTAATGTCCATCTTGGGCATCCTCTTTTCAGGGGAAGAAAGGAGTATGATTCGTAGGGCTGCTATGGTAGTTTGGGAACGTGAGCACCCTCCCGGTCAAAACGTTCCTACCGCGGACCAGAAATTCCCCGCCCGAGACCCCCGGTGGGACAATAACAACGCAGNTCACCGGGAAAATATGCAGGACCTAAGGGAGATGATAATAAAAGGAATTCGGGAATCAGTACCCCGAACCCAAAATCTTTCTAAAGCATTTGATATACAACAGGAAAAGGATGAAGGGCCTATGAGATTCCTAGACAGACTGAGGGAGCAAATGAGGCAATATGCAGGCCTCGATTTGGANGATCCCCTTGGGCAAGGAATGTTAAAACTCCANTTTGTCACTAAAAGTTGGCCAGACATTTCAAAAAAGTTACAAAAGATAGANAATTGGGAAGACCGNCCCCTAAGTGAGCTTCTCAGGGAAGCTCAGAAAGTATACGTGAGAAGGGACGAAGAAAAACAGAAACAAAAGACAAAACTTATGTTNTCCACCTTCCAACAGATGGCTCCAAACCCANGTACTTCTAAACAGAGCTTCCAGGGGGCCAGAAACTATAAAGGGTCCGAACCCTCC
TTTAAAGGACCCNAGCCTCCATCTGGAGGACCAAGGCCCTCGTCTACCAGGCCCCCTAAAGAGTATGGGGGAGCAAGGTCAAAGAATCCCAGAACTGAGAGGGAGGAAGGACAAGATAGGTGCTACAGATGTGGAAGAACAGGCCACTTCAAGAGAGAATGTCCCGAACTAAGAAAGGAGAAAGAAGCCCTTCCACTCATGACTTTCGAGGAAGAATAGGGGGGTCAGGGGCTCTGTCTCTTTTATCTTGAGTCCCACCAGGAGCCCTTGATAAATTTGGAGGTGGGACCTAAACATGAGCTTATCACCTTTTTAGTCGATTCAGGGGCTGCTCGCTCCTCTGTTTGTTTCCCCCCATCTAATGTTGCCTNCTCCTCAGAGGAACTTTTAGTCTCCGGGGTAAAAGGGGAAGGATTTAAAGCAAAAATTTTAGAAAGCACAGAAGTTAGATACCAGGATCGATCAGCTCATATTCAGTTTTTGTTAATCCCTGAAGCAGGAACTAATTTACTAGGGAGGGATTTAATGTTAAAGTTAGGCATAGGCCTACAAGTCAGCCCAAGAGGATTCCTCACCTCATTAAACCTACTCACCACTGCAGATGAAAAATATATTAATCCTAATGTCTGGTCCAAAGAAGGAAACCGAGGGAAACTCCGAGTCCCTCCAATCCACATCAAGCTAAAAACCCCCGGGGAAGTAGTAAGGAGGAAGCAATACCCCATTCCCCTAGAGGGCAGGATAGGGTTGAAACCTATAATTGAAGGTCTTATTAAAGATGGGCTTCTCGAGCCCTGTATGTCCCCTTATAACACCCCAATACTGCCAGTCAAGAAATCAGACGGGTCATACCGGCTGGTACAGGACCTTAGAGCTATCAACCAAATAGTCCAGACTACCCACCCCGTTGTCCCCAATCCTTACACCATTCTCAGCAAGATTCCATATAATCATCAATGGTTTACTGTAATAGATTTGAAGGATGCTTTTTGGGCATGTCCCCTGGCTGAAGATAGCCGAGATATATTTGCTTTTGAGTGGGAGGATCCCCACTCAGGGCGGAAACAACAATATCGATGGACAGTCTTGCCCCAAGGGTTCACAGACTCCCCTAACCTTTTCGGTCAAATTTTAGAACAAGTATTAGAAAAAGTTGTCGTCCCAGAACAAATATGCCTGCTCCAGTACGTGGATGATATTCTTATATCTGGTGAAGATATAGAGAAGGTAGCTGGCTTCTCTACACATATTCTTAACCATCTGCAGTTCGAGGGGCTACGAGTCTCAAAAGGAAAGCTTCAGTATGTAGAGCCTGAAGTTAAATATTTAGGCCACTTAATAAGTGCAGGCAAGCGAAGAATAGGGCCTGAACGAGTTGAGGGAATCGTGTCCCTACCCTTGCCTCAAACTAAACAAGAACTCAGGAAATTTTTAGGGTTAGTCGGATACTGCCGCTTATGGATTGACTCATATGCACTAAACAGTAAACTNTTATATCAAAAACTTGCCCAGGAGAAGCCTGACCGTCTCCTGTGGACTTCTGAGGAAGTCGATCAGGTCGAGGAGCTGAAAGAAAGGCTCATAACTGCCCCTGTCTTAGCCTTACCCTCCCTAGAAAAGCCATTCCACCTTTTTGTCAATGTGGACAATGGGGTAGCTTTAGGAGTGCTNACTCAAGAACACGGAGGCCGCCGGCAGCCCGTGGCCTTCCTATCAAAAGTCTTAGACCCAGTNACCCGTGGATGGCCTCAATGCATCCAATCCGTCGCGGCTACGGCANTACTAGTCGAAGAAAGCAGAAAGTTAACCTTTGGAGGAAAATTGACNGTAAGCACGCCCCACCAAGTTAGAACTATCTTAAACCAGAAAGCAGGGAGGTGGCTTACTGACTCAAGAATCTTAAAGTATGAGGCTATTCTGTTAGAAAAAGATGATTTAACATTAACCACTGATAATTCGCTTAACCCAGCAGGTTTCCTAACAGGGGATCCAAATCTAAA
GAGAGAGCACACATGTTTAGATTTAATTGATTACCATACAAAGGTCCGACCAGACCTAGGAGAAACTCCCTTCAGGACGGGACGACACTTATTTATAGATGGTTCCTCCCGGGTGATTGAGGGAAAAAGACACAATGGGTATTCAGTAATTGATGGAGAAACTCTCGNAGAAATAGAGTCAGGAAAATTGCCTAATAATTGGTCTGCCCAAACGTGTGAGCTGTTTGCACTCAGCCAAGCCTTAAAGTACTTACAGAACCAGGAAGGAACCATCTATACCGATTCTAAGTACGCCTTTGGAGTGGCTCATACATTTGGAAAAATTTGGACTGAACGAGGTCTCATTAATAGTAAAGGTCAAGACCTTGTTCACAAGGAGCTAATCACCCAAGTATTGAATAACCTTCAGTTGCCAGAAGAAATAGCTATTGTCCATGTCCCCGGACACCAGAAAAGCCTTTCTTTTGAAAGTCGAGGAAATAACCTAGCAGATCAGATAGCCAAACAGGCTGCCGTTTCTTCTGAAACGCCTATTTTTCACTTAACTCCTTACCTTCCTCCTCCTACCGTAATCCCCATTTTCTCTTCCACTGAAAAAGAGAAACTAATAAAAATAGGTGCTAAAGAGAATTCAGAAGGAAAATGGATATTGCCAGACCAGAGAGAAATGTTATCCAAACCCCTTATGAGGGAAATCTTGTCCCAACTGCATCAAGGGACCCACTGGGGGCCCCAAGCCATGTGTGACGCAGTTCTCAGAGTTTATGGGTGTATAGGAATTTATACCCTGGCCAAACAGGTTACAGACAGTTGCTTAGTATGTAAGAAAACTAATAAACAAACTATAAAAAGATTACCCCTTGGGGGAAGGAGTCCAGGCTTAAGGCCATTCCAAAGTATCCAGATTGATTACACAGAGATGCCTCCAATAGGTCGTCTAAAATATTTACTAGTGATAGTAGATCACCTCACTCACTGGGTCGAAGCTATTCCCTTTTCAAATGCGACGGCCAATAATGTAGTTAAGGCNTTAATTGAAAATATAGTACCCAGGTTTGGACTAATAGAAAACATTGACTCAGACAATGGAACCCATTTCACCGCACACGTCATTAAAAAGCTANCCCAAGCACTAGACATNAGATGGGAATACCATACTCCCTGGCACCCACCTTCATCAGGGAGAGTAGAAAGAATGAACCAGACTCTAAAGAACCACTTAACCAAATTAGTCTTAGAGACTCGGTTGCCATGGACCAAATGCCTTCCTATTGCCCTGTTGAGAATCCGAACTGCCCCNCGGAAAGATATTGGCCTNTCCCCTTATGAGATGCTCTATGGATTGCCTTATTTACACTCCACTGCTGACATTCCTACNTTTGAAACAAAAGATCAGTTTCTCAGAAATTATATACTTGGTCTATCTTCCACTTTCTCTTCCCTCAGAACTAAAGGTCTTTTAGCACAGGCGCCACCCCTGGAGTTCCCAGTACACCAACATCAGCCTGGGGATCACGTCCTCATCAAAAGCTGGAAAGAGGGAAAGCTCGAGCCGGCCTGGGAAGGACCTTACCTAGTGCTCCTAACTACCGAAACCGCAGTCCGGACAGCAGAAAGAGGATGGACCCATCACACCCGAGTCAAGAAAGCGCCGCCCCCTCCAGAGTCATGGGCCATNGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAGTTTAACTCTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATCGCTGACCACCTNGTTATTAACGTAACCAAGTCAATTTCGCCTCAAACTATTACATTTGATGCTTGCCTTGTTATACCCTGTGGAGACTTGCCAAGTCAAAGGCAGCTCTCCACTTCAGAAAAGTACCTCTGTCCTTCCTGGCTCTCCTCAGACTGGGCATTAGTGAATTGGGATCATTTAGTCTGGGAAGATTTCGATGAAGACCCCAGTGTCAACCGGGAGTCTTGCCC
CCCCGACGCAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACCAAAGAGCAAGGATGGACTGCCCCAACCAGTANTTGTAATTTCCTAAAACCATACATTCATTTTACTAAAGGAACAGCCCCCCCCAACTGTCAGCTAAACCAGTGCAATCCAATACAGGTTATTATCTCGANCCCCCAAAGTTCTTCCCCTTCTCTAAGCCGGTTCCCTTCTTTAAGCCGNTTCTATGGTATGGGGGCTGAGGTTTCAGGAACGGACCCTATNGGATCCTTTGAAATGCGCTTCATTGCTCCCCCACCGCCTGCACCTCCCTCTAAGCCTTCTTCCAAAACCTCTCACAACGAAACCGTCGTTCCTCCTCCACCCAATGACAAGACCAAGGTAGCTATTGTAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTGAGACAGGATATCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCGTCCGCACGTTAAACAAAAGCAATTGTTACGCTTGTGCGCACGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAGGGTGGTCCTCCAGTCGACCGGGCATGGGCTGTATGGTAGCTCTTTTCCAGGATTCCACAGCCTGGGGTAACGAGTCGTGCCAAGCTCTCTCTCTGCTATATCCCGAAGTCCGACACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCGACGCTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTAAAGGGATGCAGTGAGCTTAAGACTTTCCAAGAGCTTACCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGACACTCTGCCAAGTAACTGGAGCGGCACTTGTGCTCTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGAAAGGAAAAATACGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTCTCACGTTTATTTAGACGCAATTGGAGTCCCACGGGGAGTACCAGATNAATTTAAAGCCCGAGATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAATTAATAAAAATGTAGATTGGATAAATTACATCTATTACAACCAACAGCGGTTTATTAACTACACTAGAGATGCTGTTAAAGGAATAGCTGAGCAATTAGGGCCTACTAGCCAGATGGCTTGGGAAAATAGAATAGCCCTAGACATGATATTAGCNGAAAAAGGNGGAGTTTGCGTCATGATTAAAACTCAATGTTGTACCTTCATCCCAAACAACACCGCCCCCGATGGAAGTATAACAAAGGCNTTGCAAGGNCTNACCGCTCTATCCAATGAGTTAGCCAAAAACTCNGGGGTAAATGACCCCTTTACAGGATGGCTAGAAAAGTGGTTCGGTAAATGGAAAGGAATCATAGCCTCAATTCTTACTTCCCTCGCAGCCGTAATAGGTGTACTCATTCTTGTCGGGTGCTGTGTCATACCATGCATCCGTGGGCTGGTGCAAAGGCTCATAGAAACGGCACTTACTAAAACCTCCCTTAGCTNTCCTCCACCTTATTCAGANAAGCTTCTTCTTTTAGAGGATCAAGCAGAACAACNAAGCCAAGACATGTTAAAAAAGTTTGAAGAGAAAGCTGTAAGAAAANTGCAAGAGGAGGAAAT
|
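The ORF coordinates and protein lengths quoted in the Comment field can be cross-checked with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (the page does not say so explicitly, and the coordinates are marked approximate with "~"); the variable and function names are illustrative only.

```python
# ORF coordinates and protein lengths as quoted in the Comment field.
# Assumption: coordinates are 1-based and inclusive.
ORFS = {
    "gag": (504, 2216, 571),
    "pol": (2220, 5720, 1167),
    "env": (5621, 7735, 705),
}


def codons_spanned(start, end):
    """Return the number of whole codons in a 1-based inclusive interval."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"interval {start}-{end} is not a multiple of 3 bp")
    return length // 3


# True where the span in bp divides into exactly the quoted number of residues.
orf_check = {
    name: codons_spanned(start, end) == aa
    for name, (start, end, aa) in ORFS.items()
}

# The pol and env spans overlap: env starts before pol ends.
overlap_bp = ORFS["pol"][1] - ORFS["env"][0] + 1

# Domain lengths listed for the pol polyprotein.
POL_DOMAINS = {
    "protease": 140,
    "reverse transcriptase": 517,
    "RNase H": 140,
    "endonuclease": 270,
}
domain_total = sum(POL_DOMAINS.values())

print(orf_check)
print("pol/env overlap (bp):", overlap_bp)
print("sum of listed pol domains (aa):", domain_total)
```

Under that assumption, each span corresponds exactly to the quoted protein length, and the pol and env spans share 100 bp. The four listed pol domains sum to 1067 aa, so they do not account for the full 1167 aa polyprotein; the page does not say what occupies the remainder.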
TF motifs of the consensus sequence
Use FIMO to detect transcription factor motifs in the consensus sequence of the TE family.
TE_family | TFBS | Start | End | Strand | Score | Matched sequence |
---|---|---|---|---|---|---|
HERVIP10F | SGR5 | 1892 | 1904 | + | 18.03 | ACAAAAGACAAAA |
HERVIP10F | KLF12 | 72 | 80 | - | 18.01 | GGGGCGGGG |
HERVIP10F | KLF10 | 72 | 80 | - | 17.88 | GGGGCGGGG |
HERVIP10F | SP1 | 72 | 80 | - | 17.80 | GGGGCGGGG |
HERVIP10F | ZNF816 | 2793 | 2807 | - | 17.57 | AGGGGACATACAGGG |
HERVIP10F | SP2 | 72 | 80 | - | 17.56 | GGGGCGGGG |
HERVIP10F | SP4 | 72 | 80 | - | 17.54 | GGGGCGGGG |
HERVIP10F | MYB116 | 2532 | 2541 | - | 17.49 | TGCCTAACTT |
HERVIP10F | ZIC4 | 287 | 300 | - | 17.35 | GCCCCCCCGCGGCT |
HERVIP10F | CG3065 | 71 | 81 | + | 17.29 | GCCCCGCCCCC |
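The coordinates in the table above can be read as follows. This is a minimal sketch that assumes Start and End are 1-based and inclusive, and that for "-" strand hits the Matched sequence column holds the reverse complement of the plus-strand bases. The rows in the table are consistent with both assumptions, but the page does not state them. All names in the code are illustrative.

```python
# (TFBS, start, end, strand, matched sequence) copied from the table above.
HITS = [
    ("SGR5", 1892, 1904, "+", "ACAAAAGACAAAA"),
    ("KLF12", 72, 80, "-", "GGGGCGGGG"),
    ("ZNF816", 2793, 2807, "-", "AGGGGACATACAGGG"),
    ("MYB116", 2532, 2541, "-", "TGCCTAACTT"),
    ("ZIC4", 287, 300, "-", "GCCCCCCCGCGGCT"),
    ("CG3065", 71, 81, "+", "GCCCCGCCCCC"),
]

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


# With 1-based inclusive coordinates, End - Start + 1 equals the match length.
lengths_agree = all(end - start + 1 == len(seq) for _, start, end, _, seq in HITS)

# The "+" strand CG3065 hit gives the plus-strand bases for positions 71-81.
WINDOW_START = 71
WINDOW_SEQ = "GCCCCGCCCCC"


def plus_strand_slice(start, end):
    """Return plus-strand bases for a 1-based inclusive range inside the window."""
    window_end = WINDOW_START + len(WINDOW_SEQ) - 1
    if start < WINDOW_START or end > window_end:
        raise ValueError("range falls outside the known window")
    return WINDOW_SEQ[start - WINDOW_START : end - WINDOW_START + 1]


# Positions 72-80 on the plus strand, then as they would read on the minus strand.
plus_72_80 = plus_strand_slice(72, 80)
minus_72_80 = reverse_complement(plus_72_80)

print(lengths_agree, plus_72_80, minus_72_80)
```

The minus-strand reading of positions 72-80 is the GC-box-like sequence reported for the KLF and SP hits, which is why several related motifs share the same coordinates.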
TFBS enrichment in GRCh38
Use Fisher's exact test to test for enrichment of transcription factor binding sites in copies of the TE family in GRCh38.
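As an illustration of the kind of test described, the sketch below computes a one-sided Fisher's exact p-value for a 2x2 table from the hypergeometric distribution. How the page actually builds its contingency table is not stated; the layout used here (TE-family copies vs. background regions, with vs. without a binding site), the choice of a one-sided test, and all counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is hypergeometric with the table's margins fixed.
    """
    row1 = a + b
    row2 = c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)
    k_max = min(row1, col1)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible tables contribute nothing to the sum.
    numerator = sum(
        comb(row1, k) * comb(row2, col1 - k) for k in range(a, k_max + 1)
    )
    return numerator / denominator


# Hypothetical counts, for illustration only (not taken from the database):
te_with_site, te_without_site = 12, 8        # copies of the TE family
bg_with_site, bg_without_site = 30, 150      # background regions

p_enrichment = fisher_exact_greater(
    te_with_site, te_without_site, bg_with_site, bg_without_site
)
print(f"one-sided p-value: {p_enrichment:.3g}")
```

A small p-value here would indicate that binding sites occur in the TE-family copies more often than the background proportion would predict. In practice, p-values across many transcription factors would typically also be corrected for multiple testing.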