HERVIP10F
DF ID | DF0000186 |
---|---|
TE superfamily | ERV1 |
TE class | LTR |
Species | Simiiformes |
Length | 7737 |
Kimura value | 13.62 |
Tau index | 0.9514 |
Description | Internal region of ERV1 endogenous retrovirus, HERVIP10F subfamily |
Comment | HERVIP10F is flanked by LTR10F long terminal repeats, which have 5 bp TSDs. HERVIP10F was an active endogenous retrovirus ~30 Myr ago. There are about 20 copies of the HERVIP10F internal sequence in the human genome (this is not the same as a simple tally of fragmented hits). Three ORFs are found in the HERVIP10F consensus. ORF1 encodes gagP71A (~504-2216), a 571aa gag-like protein closely related to the gag proteins encoded by the C-type leukemia retroviruses. ORF2 encodes polIP10F (~2220-5720), an 1167aa polyprotein composed of protease (140aa), reverse transcriptase (517aa), RNase H (140aa) and endonuclease (270aa) domains. ORF3 encodes envIP10F (~5621-7735), a 705aa envelope protein most similar to those from leukemia retroviruses. |
Sequence |
ATTTGGGGGCTCGTCCGGGATTACATTCCCCTCCGGGGGCGGTCTCCGGTCCTCTCTCGTGAGGAGGCGCGCCCCGCCCCCTTGTGGCGGCCTCAGGGGTGAGAAATCGGGACCCACCCGGTGCGAGGAATAACCCGGGCTCTCAGCAACGCGGAAAGAAACTGGCCGGCAACCTAGCTTAAAGGATCCTCACATACCGCGGCGACGACTCTGTGCACAGACCAAGGAAGGAGAAGCCGCGGGAGCCGGTAAAGTATTTCCTTGGTGGTCGGGACCAAGGTAAGAAAGCCGCGGGGGGGCGGTGAAGTACTCCTTGGTCGGGGTGGCTTAGAGGTTAAAAAGAGGCGAGANATCCCCANTGGNGGGGGATTGAGCCTCACACAAACCTCCAGTAGTAGAAAAGGCAAGAAATTTCCAGTGGGGGAAATTGAGCCTCACCCCAAAAGGCGAGAAATTTCCAGTAAGGGAAATTGAGCCTCGAACCTTACCCCAAAACCATCAAGATGGGAAATACCCCAAGCAAGACAGGGAGCAAGGGGGATAAAGATGGTAACAAAGATATCCCCCCGGATAGCCCCCTAGGTCTCATGCTAAAACACTGGAAGGATAATGAAAGGACTAAACATAGGAAAAAGCAACAAATGATAAAATATTGCTGTTTTATTTGGACTCAGGGACCCATCCTCAAACCCTCAATCTTCTGGCCAAAGTTTGGGTCGAATGAGGATGTAATGTGTCAGCTTCTAATCCGATATGTTAATGATAAAAGTCCAGTGTCTCAAGAAGAACTAGGCTATGCCCTTTGTTGGAGGCAAGGACCTGCCCTCCTTTTTCCCTTAAAAACAAATAGGGAAGAACCCAATCTGGCACCTCAAAATGAAAAGTCAGAGGAGCCAGCTCTCATGCCTAAAGACTCCAGCGCATGGGATCCCCTAGACCATCTTCCCCCGCTCAGTGTCCCCAATCTTTCCCCTCAGACAGCCGCTGCCGCCTCAGATCCCGTTCCAAATCCCCCCTCTACTCACGTTATCCCTCCTCCTTATAACCCTGACTCTTGGGAATTACCGTCCCACCAGCCTGTTCCCTCCCAACCTAAATACCCCTCTCTAAAAGGACTCCAGCGTGAGGTAGAACAATGTAAAAAAGATATTCAGAATTTCCCATTTCCCTCCGTACCTAAGGGGTCAGCCCCGACCCTCTTCCCTTTGAAAGAGGTACCACAAGGAGGGGGGGCGNNCGNCATTGGCTTTGTAAATGCTCCCTTAACCAGTTCAGAAGTCCGGAATTTTAAAAAGGAGCTTAAGCCGCTACTAGATGACCCTTACGGAGTGGCAGACCAAATTGACCAATTCTTAGGACCTCAGTTATACACTTGGGTCGAGTTAATGTCCATCTTGGGCATCCTCTTTTCAGGGGAAGAAAGGAGTATGATTCGTAGGGCTGCTATGGTAGTTTGGGAACGTGAGCACCCTCCCGGTCAAAACGTTCCTACCGCGGACCAGAAATTCCCCGCCCGAGACCCCCGGTGGGACAATAACAACGCAGNTCACCGGGAAAATATGCAGGACCTAAGGGAGATGATAATAAAAGGAATTCGGGAATCAGTACCCCGAACCCAAAATCTTTCTAAAGCATTTGATATACAACAGGAAAAGGATGAAGGGCCTATGAGATTCCTAGACAGACTGAGGGAGCAAATGAGGCAATATGCAGGCCTCGATTTGGANGATCCCCTTGGGCAAGGAATGTTAAAACTCCANTTTGTCACTAAAAGTTGGCCAGACATTTCAAAAAAGTTACAAAAGATAGANAATTGGGAAGACCGNCCCCTAAGTGAGCTTCTCAGGGAAGCTCAGAAAGTATACGTGAGAAGGGACGAAGAAAAACAGAAACAAAAGACAAAACTTATGTTNTCCACCTTCCAACAGATGGCTCCAAACCCANGTACTTCTAAACAGAGCTTCCAGGGGGCCAGAAACTATAAAGGGTCCGAACCCTCC
TTTAAAGGACCCNAGCCTCCATCTGGAGGACCAAGGCCCTCGTCTACCAGGCCCCCTAAAGAGTATGGGGGAGCAAGGTCAAAGAATCCCAGAACTGAGAGGGAGGAAGGACAAGATAGGTGCTACAGATGTGGAAGAACAGGCCACTTCAAGAGAGAATGTCCCGAACTAAGAAAGGAGAAAGAAGCCCTTCCACTCATGACTTTCGAGGAAGAATAGGGGGGTCAGGGGCTCTGTCTCTTTTATCTTGAGTCCCACCAGGAGCCCTTGATAAATTTGGAGGTGGGACCTAAACATGAGCTTATCACCTTTTTAGTCGATTCAGGGGCTGCTCGCTCCTCTGTTTGTTTCCCCCCATCTAATGTTGCCTNCTCCTCAGAGGAACTTTTAGTCTCCGGGGTAAAAGGGGAAGGATTTAAAGCAAAAATTTTAGAAAGCACAGAAGTTAGATACCAGGATCGATCAGCTCATATTCAGTTTTTGTTAATCCCTGAAGCAGGAACTAATTTACTAGGGAGGGATTTAATGTTAAAGTTAGGCATAGGCCTACAAGTCAGCCCAAGAGGATTCCTCACCTCATTAAACCTACTCACCACTGCAGATGAAAAATATATTAATCCTAATGTCTGGTCCAAAGAAGGAAACCGAGGGAAACTCCGAGTCCCTCCAATCCACATCAAGCTAAAAACCCCCGGGGAAGTAGTAAGGAGGAAGCAATACCCCATTCCCCTAGAGGGCAGGATAGGGTTGAAACCTATAATTGAAGGTCTTATTAAAGATGGGCTTCTCGAGCCCTGTATGTCCCCTTATAACACCCCAATACTGCCAGTCAAGAAATCAGACGGGTCATACCGGCTGGTACAGGACCTTAGAGCTATCAACCAAATAGTCCAGACTACCCACCCCGTTGTCCCCAATCCTTACACCATTCTCAGCAAGATTCCATATAATCATCAATGGTTTACTGTAATAGATTTGAAGGATGCTTTTTGGGCATGTCCCCTGGCTGAAGATAGCCGAGATATATTTGCTTTTGAGTGGGAGGATCCCCACTCAGGGCGGAAACAACAATATCGATGGACAGTCTTGCCCCAAGGGTTCACAGACTCCCCTAACCTTTTCGGTCAAATTTTAGAACAAGTATTAGAAAAAGTTGTCGTCCCAGAACAAATATGCCTGCTCCAGTACGTGGATGATATTCTTATATCTGGTGAAGATATAGAGAAGGTAGCTGGCTTCTCTACACATATTCTTAACCATCTGCAGTTCGAGGGGCTACGAGTCTCAAAAGGAAAGCTTCAGTATGTAGAGCCTGAAGTTAAATATTTAGGCCACTTAATAAGTGCAGGCAAGCGAAGAATAGGGCCTGAACGAGTTGAGGGAATCGTGTCCCTACCCTTGCCTCAAACTAAACAAGAACTCAGGAAATTTTTAGGGTTAGTCGGATACTGCCGCTTATGGATTGACTCATATGCACTAAACAGTAAACTNTTATATCAAAAACTTGCCCAGGAGAAGCCTGACCGTCTCCTGTGGACTTCTGAGGAAGTCGATCAGGTCGAGGAGCTGAAAGAAAGGCTCATAACTGCCCCTGTCTTAGCCTTACCCTCCCTAGAAAAGCCATTCCACCTTTTTGTCAATGTGGACAATGGGGTAGCTTTAGGAGTGCTNACTCAAGAACACGGAGGCCGCCGGCAGCCCGTGGCCTTCCTATCAAAAGTCTTAGACCCAGTNACCCGTGGATGGCCTCAATGCATCCAATCCGTCGCGGCTACGGCANTACTAGTCGAAGAAAGCAGAAAGTTAACCTTTGGAGGAAAATTGACNGTAAGCACGCCCCACCAAGTTAGAACTATCTTAAACCAGAAAGCAGGGAGGTGGCTTACTGACTCAAGAATCTTAAAGTATGAGGCTATTCTGTTAGAAAAAGATGATTTAACATTAACCACTGATAATTCGCTTAACCCAGCAGGTTTCCTAACAGGGGATCCAAATCTAAA
GAGAGAGCACACATGTTTAGATTTAATTGATTACCATACAAAGGTCCGACCAGACCTAGGAGAAACTCCCTTCAGGACGGGACGACACTTATTTATAGATGGTTCCTCCCGGGTGATTGAGGGAAAAAGACACAATGGGTATTCAGTAATTGATGGAGAAACTCTCGNAGAAATAGAGTCAGGAAAATTGCCTAATAATTGGTCTGCCCAAACGTGTGAGCTGTTTGCACTCAGCCAAGCCTTAAAGTACTTACAGAACCAGGAAGGAACCATCTATACCGATTCTAAGTACGCCTTTGGAGTGGCTCATACATTTGGAAAAATTTGGACTGAACGAGGTCTCATTAATAGTAAAGGTCAAGACCTTGTTCACAAGGAGCTAATCACCCAAGTATTGAATAACCTTCAGTTGCCAGAAGAAATAGCTATTGTCCATGTCCCCGGACACCAGAAAAGCCTTTCTTTTGAAAGTCGAGGAAATAACCTAGCAGATCAGATAGCCAAACAGGCTGCCGTTTCTTCTGAAACGCCTATTTTTCACTTAACTCCTTACCTTCCTCCTCCTACCGTAATCCCCATTTTCTCTTCCACTGAAAAAGAGAAACTAATAAAAATAGGTGCTAAAGAGAATTCAGAAGGAAAATGGATATTGCCAGACCAGAGAGAAATGTTATCCAAACCCCTTATGAGGGAAATCTTGTCCCAACTGCATCAAGGGACCCACTGGGGGCCCCAAGCCATGTGTGACGCAGTTCTCAGAGTTTATGGGTGTATAGGAATTTATACCCTGGCCAAACAGGTTACAGACAGTTGCTTAGTATGTAAGAAAACTAATAAACAAACTATAAAAAGATTACCCCTTGGGGGAAGGAGTCCAGGCTTAAGGCCATTCCAAAGTATCCAGATTGATTACACAGAGATGCCTCCAATAGGTCGTCTAAAATATTTACTAGTGATAGTAGATCACCTCACTCACTGGGTCGAAGCTATTCCCTTTTCAAATGCGACGGCCAATAATGTAGTTAAGGCNTTAATTGAAAATATAGTACCCAGGTTTGGACTAATAGAAAACATTGACTCAGACAATGGAACCCATTTCACCGCACACGTCATTAAAAAGCTANCCCAAGCACTAGACATNAGATGGGAATACCATACTCCCTGGCACCCACCTTCATCAGGGAGAGTAGAAAGAATGAACCAGACTCTAAAGAACCACTTAACCAAATTAGTCTTAGAGACTCGGTTGCCATGGACCAAATGCCTTCCTATTGCCCTGTTGAGAATCCGAACTGCCCCNCGGAAAGATATTGGCCTNTCCCCTTATGAGATGCTCTATGGATTGCCTTATTTACACTCCACTGCTGACATTCCTACNTTTGAAACAAAAGATCAGTTTCTCAGAAATTATATACTTGGTCTATCTTCCACTTTCTCTTCCCTCAGAACTAAAGGTCTTTTAGCACAGGCGCCACCCCTGGAGTTCCCAGTACACCAACATCAGCCTGGGGATCACGTCCTCATCAAAAGCTGGAAAGAGGGAAAGCTCGAGCCGGCCTGGGAAGGACCTTACCTAGTGCTCCTAACTACCGAAACCGCAGTCCGGACAGCAGAAAGAGGATGGACCCATCACACCCGAGTCAAGAAAGCGCCGCCCCCTCCAGAGTCATGGGCCATNGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAGTTTAACTCTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATCGCTGACCACCTNGTTATTAACGTAACCAAGTCAATTTCGCCTCAAACTATTACATTTGATGCTTGCCTTGTTATACCCTGTGGAGACTTGCCAAGTCAAAGGCAGCTCTCCACTTCAGAAAAGTACCTCTGTCCTTCCTGGCTCTCCTCAGACTGGGCATTAGTGAATTGGGATCATTTAGTCTGGGAAGATTTCGATGAAGACCCCAGTGTCAACCGGGAGTCTTGCCC
CCCCGACGCAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACCAAAGAGCAAGGATGGACTGCCCCAACCAGTANTTGTAATTTCCTAAAACCATACATTCATTTTACTAAAGGAACAGCCCCCCCCAACTGTCAGCTAAACCAGTGCAATCCAATACAGGTTATTATCTCGANCCCCCAAAGTTCTTCCCCTTCTCTAAGCCGGTTCCCTTCTTTAAGCCGNTTCTATGGTATGGGGGCTGAGGTTTCAGGAACGGACCCTATNGGATCCTTTGAAATGCGCTTCATTGCTCCCCCACCGCCTGCACCTCCCTCTAAGCCTTCTTCCAAAACCTCTCACAACGAAACCGTCGTTCCTCCTCCACCCAATGACAAGACCAAGGTAGCTATTGTAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTGAGACAGGATATCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCGTCCGCACGTTAAACAAAAGCAATTGTTACGCTTGTGCGCACGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAGGGTGGTCCTCCAGTCGACCGGGCATGGGCTGTATGGTAGCTCTTTTCCAGGATTCCACAGCCTGGGGTAACGAGTCGTGCCAAGCTCTCTCTCTGCTATATCCCGAAGTCCGACACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCGACGCTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTAAAGGGATGCAGTGAGCTTAAGACTTTCCAAGAGCTTACCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGACACTCTGCCAAGTAACTGGAGCGGCACTTGTGCTCTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGAAAGGAAAAATACGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTCTCACGTTTATTTAGACGCAATTGGAGTCCCACGGGGAGTACCAGATNAATTTAAAGCCCGAGATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAATTAATAAAAATGTAGATTGGATAAATTACATCTATTACAACCAACAGCGGTTTATTAACTACACTAGAGATGCTGTTAAAGGAATAGCTGAGCAATTAGGGCCTACTAGCCAGATGGCTTGGGAAAATAGAATAGCCCTAGACATGATATTAGCNGAAAAAGGNGGAGTTTGCGTCATGATTAAAACTCAATGTTGTACCTTCATCCCAAACAACACCGCCCCCGATGGAAGTATAACAAAGGCNTTGCAAGGNCTNACCGCTCTATCCAATGAGTTAGCCAAAAACTCNGGGGTAAATGACCCCTTTACAGGATGGCTAGAAAAGTGGTTCGGTAAATGGAAAGGAATCATAGCCTCAATTCTTACTTCCCTCGCAGCCGTAATAGGTGTACTCATTCTTGTCGGGTGCTGTGTCATACCATGCATCCGTGGGCTGGTGCAAAGGCTCATAGAAACGGCACTTACTAAAACCTCCCTTAGCTNTCCTCCACCTTATTCAGANAAGCTTCTTCTTTTAGAGGATCAAGCAGAACAACNAAGCCAAGACATGTTAAAAAAGTTTGAAGAGAAAGCTGTAAGAAAANTGCAAGAGGAGGAAAT
|
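The ORF coordinates quoted in the Comment field can be checked against the stated protein lengths. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the entry does not say so); the names `ORFS`, `orf_length_aa` and `overlap_nt` are illustrative and not part of the database.

```python
# Approximate ORF coordinates and protein lengths as quoted in the Comment field.
# Assumption: coordinates are 1-based and inclusive.
ORFS = {
    "ORF1 (gag)": (504, 2216, 571),
    "ORF2 (pol)": (2220, 5720, 1167),
    "ORF3 (env)": (5621, 7735, 705),
}


def orf_length_aa(start: int, end: int) -> int:
    """Return the number of codons spanned by a 1-based inclusive interval."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a multiple of 3")
    return span // 3


def overlap_nt(a: tuple, b: tuple) -> int:
    """Return the number of nucleotides shared by two inclusive intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    for name, (start, end, stated_aa) in ORFS.items():
        computed = orf_length_aa(start, end)
        print(name, computed, "matches" if computed == stated_aa else "differs")
    pol, env = ORFS["ORF2 (pol)"], ORFS["ORF3 (env)"]
    print("pol/env overlap (nt):", overlap_nt(pol[:2], env[:2]))
```

Under that assumption, each span divides evenly by three and reproduces the quoted lengths of 571, 1167 and 705 aa, and the pol and env intervals share 100 nt.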
TF motifs of the consensus sequence
Use FIMO to detect transcription factor motifs in the consensus sequence of the TE family.
TE_family | TFBS | Start | End | Strand | Score | Matched sequence |
---|---|---|---|---|---|---|
HERVIP10F | BPC5 | 5720 | 5749 | - | -41.43 | AGAGTAATAGAATAGATGAAAGAGAGTTAA |
HERVIP10F | BPC5 | 5413 | 5442 | - | -41.67 | GGGAAGAGAAAGTGGAAGATAGACCAAGTA |
HERVIP10F | BPC5 | 5407 | 5436 | - | -44.10 | AGAAAGTGGAAGATAGACCAAGTATATAAT |
HERVIP10F | BPC5 | 4112 | 4141 | + | -44.68 | GGTGATTGAGGGAAAAAGACACAATGGGTA |
HERVIP10F | BPC5 | 4110 | 4139 | + | -44.93 | CGGGTGATTGAGGGAAAAAGACACAATGGG |
HERVIP10F | BPC5 | 5722 | 5751 | - | -45.63 | AAAGAGTAATAGAATAGATGAAAGAGAGTT |
HERVIP10F | BPC5 | 1683 | 1712 | + | -46.26 | AGACTGAGGGAGCAAATGAGGCAATATGCA |
HERVIP10F | BPC5 | 521 | 550 | + | -47.82 | CAAGACAGGGAGCAAGGGGGATAAAGATGG |
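Rows of the table above can be checked for internal consistency. The sketch below parses a few rows copied from the table, confirms that each matched sequence is as long as its Start-End interval, and converts minus-strand matches back to the plus-strand orientation. It assumes that Start and End are 1-based inclusive and that the matched sequence is reported on the strand of the hit; the helper names are hypothetical.

```python
# Three rows copied verbatim from the motif table above.
ROWS = """\
HERVIP10F | BPC5 | 5720 | 5749 | - | -41.43 | AGAGTAATAGAATAGATGAAAGAGAGTTAA |
HERVIP10F | BPC5 | 4112 | 4141 | + | -44.68 | GGTGATTGAGGGAAAAAGACACAATGGGTA |
HERVIP10F | BPC5 | 521 | 550 | + | -47.82 | CAAGACAGGGAGCAAGGGGGATAAAGATGG |
"""

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_hits(text: str) -> list:
    """Parse pipe-delimited table rows into dictionaries with typed fields."""
    hits = []
    for line in text.strip().splitlines():
        fields = [f.strip() for f in line.strip().strip("|").split("|")]
        family, motif, start, end, strand, score, matched = fields
        hits.append({
            "family": family,
            "motif": motif,
            "start": int(start),
            "end": int(end),
            "strand": strand,
            "score": float(score),
            "matched": matched,
        })
    return hits


def plus_strand_sequence(hit: dict) -> str:
    """Return the hit as it would read on the plus strand of the consensus."""
    if hit["strand"] == "-":
        return reverse_complement(hit["matched"])
    return hit["matched"]


if __name__ == "__main__":
    for hit in parse_hits(ROWS):
        width = hit["end"] - hit["start"] + 1
        assert width == len(hit["matched"])
        print(hit["start"], hit["end"], hit["strand"], plus_strand_sequence(hit))
```

Sorting the parsed hits by `score` or by `start` is then a one-line operation, which can be easier than reading the table in its listed order.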
TFBS enrichment in GRCh38
Use Fisher's exact test to assess enrichment of transcription factor binding sites within copies of the TE family in GRCh38.
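As an illustration of how such an enrichment test can be computed, the sketch below evaluates a one-sided Fisher's exact test on a 2x2 table using only the standard library. How the database builds its contingency table is not stated here, so the table layout and the counts in the example are hypothetical, and `fisher_exact_greater` is an illustrative name.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, where X is the count
    in the top-left cell with all row and column totals held fixed.
    """
    row1 = a + b          # e.g. TE-family copies (hypothetical layout)
    col1 = a + c          # e.g. regions overlapping a binding site
    total = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(total, row1)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only:
    # 12 of 20 TE copies overlap a binding site, versus 300 of 10000
    # background regions.
    p_value = fisher_exact_greater(12, 8, 300, 9700)
    print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value here would indicate that binding sites overlap the TE copies more often than expected from the background rate; when many transcription factors are tested, a multiple-testing correction would normally follow.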