HERVIP10F


DF ID DF0000186
TE superfamily ERV1
TE class LTR
Species Simiiformes
Length 7737
Kimura value 13.62
Tau index 0.9514
Description Internal region of ERV1 endogenous retrovirus, HERVIP10F subfamily
Comment HERVIP10F is flanked by LTR10F long terminal repeats, which have 5 bp target site duplications (TSDs). HERVIP10F was an active endogenous retrovirus ~30 Myr ago. About 20 copies of the HERVIP10F internal sequence are present in the human genome (this is not the same as a simple tally of fragmented hits). Three ORFs are found in the HERVIP10F consensus. ORF1 encodes gagP71A (~504-2216), a 571aa gag-like protein closely related to the gag proteins encoded by C-type leukemia retroviruses. ORF2 encodes polIP10F (~2220-5720), an 1167aa polyprotein composed of protease (140aa), reverse transcriptase (517aa), RNase H (140aa) and endonuclease (270aa) domains. ORF3 encodes envIP10F (~5621-7735), a 705aa envelope protein most similar to those of leukemia retroviruses.
Sequence
ATTTGGGGGCTCGTCCGGGATTACATTCCCCTCCGGGGGCGGTCTCCGGTCCTCTCTCGTGAGGAGGCGCGCCCCGCCCCCTTGTGGCGGCCTCAGGGGTGAGAAATCGGGACCCACCCGGTGCGAGGAATAACCCGGGCTCTCAGCAACGCGGAAAGAAACTGGCCGGCAACCTAGCTTAAAGGATCCTCACATACCGCGGCGACGACTCTGTGCACAGACCAAGGAAGGAGAAGCCGCGGGAGCCGGTAAAGTATTTCCTTGGTGGTCGGGACCAAGGTAAGAAAGCCGCGGGGGGGCGGTGAAGTACTCCTTGGTCGGGGTGGCTTAGAGGTTAAAAAGAGGCGAGANATCCCCANTGGNGGGGGATTGAGCCTCACACAAACCTCCAGTAGTAGAAAAGGCAAGAAATTTCCAGTGGGGGAAATTGAGCCTCACCCCAAAAGGCGAGAAATTTCCAGTAAGGGAAATTGAGCCTCGAACCTTACCCCAAAACCATCAAGATGGGAAATACCCCAAGCAAGACAGGGAGCAAGGGGGATAAAGATGGTAACAAAGATATCCCCCCGGATAGCCCCCTAGGTCTCATGCTAAAACACTGGAAGGATAATGAAAGGACTAAACATAGGAAAAAGCAACAAATGATAAAATATTGCTGTTTTATTTGGACTCAGGGACCCATCCTCAAACCCTCAATCTTCTGGCCAAAGTTTGGGTCGAATGAGGATGTAATGTGTCAGCTTCTAATCCGATATGTTAATGATAAAAGTCCAGTGTCTCAAGAAGAACTAGGCTATGCCCTTTGTTGGAGGCAAGGACCTGCCCTCCTTTTTCCCTTAAAAACAAATAGGGAAGAACCCAATCTGGCACCTCAAAATGAAAAGTCAGAGGAGCCAGCTCTCATGCCTAAAGACTCCAGCGCATGGGATCCCCTAGACCATCTTCCCCCGCTCAGTGTCCCCAATCTTTCCCCTCAGACAGCCGCTGCCGCCTCAGATCCCGTTCCAAATCCCCCCTCTACTCACGTTATCCCTCCTCCTTATAACCCTGACTCTTGGGAATTACCGTCCCACCAGCCTGTTCCCTCCCAACCTAAATACCCCTCTCTAAAAGGACTCCAGCGTGAGGTAGAACAATGTAAAAAAGATATTCAGAATTTCCCATTTCCCTCCGTACCTAAGGGGTCAGCCCCGACCCTCTTCCCTTTGAAAGAGGTACCACAAGGAGGGGGGGCGNNCGNCATTGGCTTTGTAAATGCTCCCTTAACCAGTTCAGAAGTCCGGAATTTTAAAAAGGAGCTTAAGCCGCTACTAGATGACCCTTACGGAGTGGCAGACCAAATTGACCAATTCTTAGGACCTCAGTTATACACTTGGGTCGAGTTAATGTCCATCTTGGGCATCCTCTTTTCAGGGGAAGAAAGGAGTATGATTCGTAGGGCTGCTATGGTAGTTTGGGAACGTGAGCACCCTCCCGGTCAAAACGTTCCTACCGCGGACCAGAAATTCCCCGCCCGAGACCCCCGGTGGGACAATAACAACGCAGNTCACCGGGAAAATATGCAGGACCTAAGGGAGATGATAATAAAAGGAATTCGGGAATCAGTACCCCGAACCCAAAATCTTTCTAAAGCATTTGATATACAACAGGAAAAGGATGAAGGGCCTATGAGATTCCTAGACAGACTGAGGGAGCAAATGAGGCAATATGCAGGCCTCGATTTGGANGATCCCCTTGGGCAAGGAATGTTAAAACTCCANTTTGTCACTAAAAGTTGGCCAGACATTTCAAAAAAGTTACAAAAGATAGANAATTGGGAAGACCGNCCCCTAAGTGAGCTTCTCAGGGAAGCTCAGAAAGTATACGTGAGAAGGGACGAAGAAAAACAGAAACAAAAGACAAAACTTATGTTNTCCACCTTCCAACAGATGGCTCCAAACCCANGTACTTCTAAACAGAGCTTCCAGGGGGCCAGAAACTATAAAGGGTCCGAACCCTCC
TTTAAAGGACCCNAGCCTCCATCTGGAGGACCAAGGCCCTCGTCTACCAGGCCCCCTAAAGAGTATGGGGGAGCAAGGTCAAAGAATCCCAGAACTGAGAGGGAGGAAGGACAAGATAGGTGCTACAGATGTGGAAGAACAGGCCACTTCAAGAGAGAATGTCCCGAACTAAGAAAGGAGAAAGAAGCCCTTCCACTCATGACTTTCGAGGAAGAATAGGGGGGTCAGGGGCTCTGTCTCTTTTATCTTGAGTCCCACCAGGAGCCCTTGATAAATTTGGAGGTGGGACCTAAACATGAGCTTATCACCTTTTTAGTCGATTCAGGGGCTGCTCGCTCCTCTGTTTGTTTCCCCCCATCTAATGTTGCCTNCTCCTCAGAGGAACTTTTAGTCTCCGGGGTAAAAGGGGAAGGATTTAAAGCAAAAATTTTAGAAAGCACAGAAGTTAGATACCAGGATCGATCAGCTCATATTCAGTTTTTGTTAATCCCTGAAGCAGGAACTAATTTACTAGGGAGGGATTTAATGTTAAAGTTAGGCATAGGCCTACAAGTCAGCCCAAGAGGATTCCTCACCTCATTAAACCTACTCACCACTGCAGATGAAAAATATATTAATCCTAATGTCTGGTCCAAAGAAGGAAACCGAGGGAAACTCCGAGTCCCTCCAATCCACATCAAGCTAAAAACCCCCGGGGAAGTAGTAAGGAGGAAGCAATACCCCATTCCCCTAGAGGGCAGGATAGGGTTGAAACCTATAATTGAAGGTCTTATTAAAGATGGGCTTCTCGAGCCCTGTATGTCCCCTTATAACACCCCAATACTGCCAGTCAAGAAATCAGACGGGTCATACCGGCTGGTACAGGACCTTAGAGCTATCAACCAAATAGTCCAGACTACCCACCCCGTTGTCCCCAATCCTTACACCATTCTCAGCAAGATTCCATATAATCATCAATGGTTTACTGTAATAGATTTGAAGGATGCTTTTTGGGCATGTCCCCTGGCTGAAGATAGCCGAGATATATTTGCTTTTGAGTGGGAGGATCCCCACTCAGGGCGGAAACAACAATATCGATGGACAGTCTTGCCCCAAGGGTTCACAGACTCCCCTAACCTTTTCGGTCAAATTTTAGAACAAGTATTAGAAAAAGTTGTCGTCCCAGAACAAATATGCCTGCTCCAGTACGTGGATGATATTCTTATATCTGGTGAAGATATAGAGAAGGTAGCTGGCTTCTCTACACATATTCTTAACCATCTGCAGTTCGAGGGGCTACGAGTCTCAAAAGGAAAGCTTCAGTATGTAGAGCCTGAAGTTAAATATTTAGGCCACTTAATAAGTGCAGGCAAGCGAAGAATAGGGCCTGAACGAGTTGAGGGAATCGTGTCCCTACCCTTGCCTCAAACTAAACAAGAACTCAGGAAATTTTTAGGGTTAGTCGGATACTGCCGCTTATGGATTGACTCATATGCACTAAACAGTAAACTNTTATATCAAAAACTTGCCCAGGAGAAGCCTGACCGTCTCCTGTGGACTTCTGAGGAAGTCGATCAGGTCGAGGAGCTGAAAGAAAGGCTCATAACTGCCCCTGTCTTAGCCTTACCCTCCCTAGAAAAGCCATTCCACCTTTTTGTCAATGTGGACAATGGGGTAGCTTTAGGAGTGCTNACTCAAGAACACGGAGGCCGCCGGCAGCCCGTGGCCTTCCTATCAAAAGTCTTAGACCCAGTNACCCGTGGATGGCCTCAATGCATCCAATCCGTCGCGGCTACGGCANTACTAGTCGAAGAAAGCAGAAAGTTAACCTTTGGAGGAAAATTGACNGTAAGCACGCCCCACCAAGTTAGAACTATCTTAAACCAGAAAGCAGGGAGGTGGCTTACTGACTCAAGAATCTTAAAGTATGAGGCTATTCTGTTAGAAAAAGATGATTTAACATTAACCACTGATAATTCGCTTAACCCAGCAGGTTTCCTAACAGGGGATCCAAATCTAAA
GAGAGAGCACACATGTTTAGATTTAATTGATTACCATACAAAGGTCCGACCAGACCTAGGAGAAACTCCCTTCAGGACGGGACGACACTTATTTATAGATGGTTCCTCCCGGGTGATTGAGGGAAAAAGACACAATGGGTATTCAGTAATTGATGGAGAAACTCTCGNAGAAATAGAGTCAGGAAAATTGCCTAATAATTGGTCTGCCCAAACGTGTGAGCTGTTTGCACTCAGCCAAGCCTTAAAGTACTTACAGAACCAGGAAGGAACCATCTATACCGATTCTAAGTACGCCTTTGGAGTGGCTCATACATTTGGAAAAATTTGGACTGAACGAGGTCTCATTAATAGTAAAGGTCAAGACCTTGTTCACAAGGAGCTAATCACCCAAGTATTGAATAACCTTCAGTTGCCAGAAGAAATAGCTATTGTCCATGTCCCCGGACACCAGAAAAGCCTTTCTTTTGAAAGTCGAGGAAATAACCTAGCAGATCAGATAGCCAAACAGGCTGCCGTTTCTTCTGAAACGCCTATTTTTCACTTAACTCCTTACCTTCCTCCTCCTACCGTAATCCCCATTTTCTCTTCCACTGAAAAAGAGAAACTAATAAAAATAGGTGCTAAAGAGAATTCAGAAGGAAAATGGATATTGCCAGACCAGAGAGAAATGTTATCCAAACCCCTTATGAGGGAAATCTTGTCCCAACTGCATCAAGGGACCCACTGGGGGCCCCAAGCCATGTGTGACGCAGTTCTCAGAGTTTATGGGTGTATAGGAATTTATACCCTGGCCAAACAGGTTACAGACAGTTGCTTAGTATGTAAGAAAACTAATAAACAAACTATAAAAAGATTACCCCTTGGGGGAAGGAGTCCAGGCTTAAGGCCATTCCAAAGTATCCAGATTGATTACACAGAGATGCCTCCAATAGGTCGTCTAAAATATTTACTAGTGATAGTAGATCACCTCACTCACTGGGTCGAAGCTATTCCCTTTTCAAATGCGACGGCCAATAATGTAGTTAAGGCNTTAATTGAAAATATAGTACCCAGGTTTGGACTAATAGAAAACATTGACTCAGACAATGGAACCCATTTCACCGCACACGTCATTAAAAAGCTANCCCAAGCACTAGACATNAGATGGGAATACCATACTCCCTGGCACCCACCTTCATCAGGGAGAGTAGAAAGAATGAACCAGACTCTAAAGAACCACTTAACCAAATTAGTCTTAGAGACTCGGTTGCCATGGACCAAATGCCTTCCTATTGCCCTGTTGAGAATCCGAACTGCCCCNCGGAAAGATATTGGCCTNTCCCCTTATGAGATGCTCTATGGATTGCCTTATTTACACTCCACTGCTGACATTCCTACNTTTGAAACAAAAGATCAGTTTCTCAGAAATTATATACTTGGTCTATCTTCCACTTTCTCTTCCCTCAGAACTAAAGGTCTTTTAGCACAGGCGCCACCCCTGGAGTTCCCAGTACACCAACATCAGCCTGGGGATCACGTCCTCATCAAAAGCTGGAAAGAGGGAAAGCTCGAGCCGGCCTGGGAAGGACCTTACCTAGTGCTCCTAACTACCGAAACCGCAGTCCGGACAGCAGAAAGAGGATGGACCCATCACACCCGAGTCAAGAAAGCGCCGCCCCCTCCAGAGTCATGGGCCATNGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAGTTTAACTCTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATCGCTGACCACCTNGTTATTAACGTAACCAAGTCAATTTCGCCTCAAACTATTACATTTGATGCTTGCCTTGTTATACCCTGTGGAGACTTGCCAAGTCAAAGGCAGCTCTCCACTTCAGAAAAGTACCTCTGTCCTTCCTGGCTCTCCTCAGACTGGGCATTAGTGAATTGGGATCATTTAGTCTGGGAAGATTTCGATGAAGACCCCAGTGTCAACCGGGAGTCTTGCCC
CCCCGACGCAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACCAAAGAGCAAGGATGGACTGCCCCAACCAGTANTTGTAATTTCCTAAAACCATACATTCATTTTACTAAAGGAACAGCCCCCCCCAACTGTCAGCTAAACCAGTGCAATCCAATACAGGTTATTATCTCGANCCCCCAAAGTTCTTCCCCTTCTCTAAGCCGGTTCCCTTCTTTAAGCCGNTTCTATGGTATGGGGGCTGAGGTTTCAGGAACGGACCCTATNGGATCCTTTGAAATGCGCTTCATTGCTCCCCCACCGCCTGCACCTCCCTCTAAGCCTTCTTCCAAAACCTCTCACAACGAAACCGTCGTTCCTCCTCCACCCAATGACAAGACCAAGGTAGCTATTGTAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTGAGACAGGATATCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCGTCCGCACGTTAAACAAAAGCAATTGTTACGCTTGTGCGCACGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAGGGTGGTCCTCCAGTCGACCGGGCATGGGCTGTATGGTAGCTCTTTTCCAGGATTCCACAGCCTGGGGTAACGAGTCGTGCCAAGCTCTCTCTCTGCTATATCCCGAAGTCCGACACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCGACGCTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTAAAGGGATGCAGTGAGCTTAAGACTTTCCAAGAGCTTACCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGACACTCTGCCAAGTAACTGGAGCGGCACTTGTGCTCTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGAAAGGAAAAATACGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTCTCACGTTTATTTAGACGCAATTGGAGTCCCACGGGGAGTACCAGATNAATTTAAAGCCCGAGATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAATTAATAAAAATGTAGATTGGATAAATTACATCTATTACAACCAACAGCGGTTTATTAACTACACTAGAGATGCTGTTAAAGGAATAGCTGAGCAATTAGGGCCTACTAGCCAGATGGCTTGGGAAAATAGAATAGCCCTAGACATGATATTAGCNGAAAAAGGNGGAGTTTGCGTCATGATTAAAACTCAATGTTGTACCTTCATCCCAAACAACACCGCCCCCGATGGAAGTATAACAAAGGCNTTGCAAGGNCTNACCGCTCTATCCAATGAGTTAGCCAAAAACTCNGGGGTAAATGACCCCTTTACAGGATGGCTAGAAAAGTGGTTCGGTAAATGGAAAGGAATCATAGCCTCAATTCTTACTTCCCTCGCAGCCGTAATAGGTGTACTCATTCTTGTCGGGTGCTGTGTCATACCATGCATCCGTGGGCTGGTGCAAAGGCTCATAGAAACGGCACTTACTAAAACCTCCCTTAGCTNTCCTCCACCTTATTCAGANAAGCTTCTTCTTTTAGAGGATCAAGCAGAACAACNAAGCCAAGACATGTTAAAAAAGTTTGAAGAGAAAGCTGTAAGAAAANTGCAAGAGGAGGAAAT
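
The ORF coordinates quoted in the Comment field can be checked against the quoted protein lengths with simple arithmetic. The sketch below assumes the coordinates are 1-based and inclusive (the record marks them as approximate with "~"); the names `ORFS`, `span_in_codons` and `frame` are illustrative and not part of the database.

```python
# ORF coordinates and protein lengths as quoted in the Comment field.
# Assumption: coordinates are 1-based and inclusive.
ORFS = {
    "gagP71A": (504, 2216, 571),
    "polIP10F": (2220, 5720, 1167),
    "envIP10F": (5621, 7735, 705),
}


def span_in_codons(start, end):
    """Return (whole codons, leftover nt) for a 1-based inclusive span."""
    return divmod(end - start + 1, 3)


def frame(start):
    """Reading frame (0, 1 or 2) implied by a 1-based start coordinate."""
    return (start - 1) % 3


for name, (start, end, aa) in ORFS.items():
    codons, leftover = span_in_codons(start, end)
    # Report whether the span matches the quoted protein length
    print(name, codons, leftover, frame(start), codons == aa)

# Number of nucleotides shared by the pol and env spans
pol_env_overlap = ORFS["polIP10F"][1] - ORFS["envIP10F"][0] + 1

# Sum of the four pol domain lengths listed in the record
pol_domain_sum = 140 + 517 + 140 + 270
print(pol_env_overlap, pol_domain_sum)
```

Under this reading, each span divides evenly into the quoted number of codons. The gag and pol spans start in the same frame, while env starts in a different frame and overlaps the end of pol. The four listed pol domains add up to fewer residues than the quoted 1167aa, so part of the polyprotein is not assigned to a domain in the record. Whether the spans include stop codons is not stated.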



TF motifs of the consensus sequence

FIMO was used to detect transcription factor motifs in the consensus sequence of the TE family.

TE_family  TFBS    Start  End   Strand  Score  Matched sequence
HERVIP10F  CTCF    74     88    -       21.34  GCCACAAGGGGGCGG
HERVIP10F  opa     289    300   -       19.96  GCCCCCCCGCGG
HERVIP10F  Spps    71     81    +       19.34  GCCCCGCCCCC
HERVIP10F  PATZ1   71     81    -       19.16  GGGGGCGGGGC
HERVIP10F  DREB2G  6305   6318  -       19.00  GGTGCAGGCGGTGG
HERVIP10F  THI2    5333   5347  -       18.92  GGCAATCCATAGAGC
HERVIP10F  sug     289    300   -       18.59  GCCCCCCCGCGG
HERVIP10F  ZNF281  72     81    -       18.37  GGGGGCGGGG
HERVIP10F  KLF16   71     81    +       18.24  GCCCCGCCCCC
HERVIP10F  SP3     71     81    +       18.17  GCCCCGCCCCC
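
The coordinate convention of the table can be checked against the consensus itself. The sketch below assumes Start and End are 1-based and inclusive, and that minus-strand hits are reported as the reverse complement of the consensus. It uses only the first 90 nt of the Sequence field; `PREFIX`, `site` and `HITS` are illustrative names, not part of FIMO or the database.

```python
# First 90 nt of the consensus, copied from the Sequence field in 10-nt chunks.
PREFIX = (
    "ATTTGGGGGC" "TCGTCCGGGA" "TTACATTCCC"
    "CTCCGGGGGC" "GGTCTCCGGT" "CCTCTCTCGT"
    "GAGGAGGCGC" "GCCCCGCCCC" "CTTGTGGCGG"
)

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA string (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def site(seq, start, end, strand):
    """Extract a 1-based inclusive site, reverse-complemented on '-'."""
    sub = seq[start - 1:end]
    return reverse_complement(sub) if strand == "-" else sub


# Rows from the table whose coordinates fall inside the 90-nt prefix
HITS = [
    ("CTCF", 74, 88, "-", "GCCACAAGGGGGCGG"),
    ("Spps", 71, 81, "+", "GCCCCGCCCCC"),
    ("PATZ1", 71, 81, "-", "GGGGGCGGGGC"),
    ("ZNF281", 72, 81, "-", "GGGGGCGGGG"),
]

for name, start, end, strand, matched in HITS:
    print(name, site(PREFIX, start, end, strand) == matched)
```

All four comparisons agree under this convention, which also shows that several of the top-scoring hits (Spps, PATZ1, ZNF281, KLF16, SP3) describe the same GC-rich stretch around positions 71-81 read on one strand or the other.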


TFBS enrichment in GRCh38

Fisher's exact test was used to test for enrichment of transcription factor binding sites in copies of the TE family in GRCh38.
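
As an illustration of the kind of test named here, the sketch below computes a one-sided Fisher's exact p-value from a 2x2 table using only the standard library. The page does not state how its contingency table is built, so the table layout (family copies versus background regions, with or without an overlapping binding site) is an assumption, and the counts in the usage lines are hypothetical placeholders rather than database results.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided (enrichment) Fisher's exact p-value for the table
        [[a, b],
         [c, d]]
    computed as P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b          # e.g. all copies of the TE family
    col1 = a + c          # e.g. all regions overlapping a binding site
    total = a + b + c + d
    denom = comb(total, row1)
    upper = min(row1, col1)
    tail = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Hypothetical counts, for illustration only:
# a = family copies with a TFBS, b = family copies without,
# c = background regions with a TFBS, d = background regions without.
p = fisher_enrichment_p(12, 8, 300, 9680)
print(p)
```

A small p-value here would indicate that binding sites overlap the family's copies more often than expected from the background. In practice a two-sided test or a multiple-testing correction across transcription factors may also be appropriate; the page does not say which was used.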




GTEx

The promoter activity across 46 body sites from The Genotype-Tissue Expression (GTEx) project.




TCGA

The promoter activity across 33 cancer types from The Cancer Genome Atlas (TCGA).