Harlequin
Basic information
DF ID | DF0000017 |
---|---|
TE superfamily | ERV1 |
TE class | LTR |
Species | Catarrhini |
Length | 6896 |
Kimura value | 6.23 |
Tau index | 0.9729 |
Description | Internal region of ERV1 endogenous retrovirus, Harlequin subfamily |
Comment | Harlequin is an internal part of an endogenous retrovirus-like element flanked by LTR2s. Its consensus sequence has been reconstructed based on eight full-length copies; they are ~92% similar to the consensus sequence. One copy has been inserted into an Alu-Y repeat. Harlequin was built up from several different retroviruses, including HERVE, HERVI, HERV17 and MER4I-group sequences. We suggest that shuffling of non-homologous endogenous retroviral sequences may generate mosaic retrovirus-like elements. Presumably, the shuffling occurs in virions where RNA genomes of different retroviruses could be packaged together. Self-catalytic RNA recombination or jumping of reverse transcriptase between non-homologous RNA molecules may induce the shuffling. |
Sequence |
TTTCTTGGTTCCCTGACCGGGAAGCGAGGTGATTAACGGACGGTCGAGGCAGCCCCTTAGGCGGCTTAGGCCTGCCCTGTGGAGCATCCCTGCGGGGGACTCCGGCCAGCCTGAGCGACGCGGATCCTGAGAGCGCTCCCGGGTAGGCAATTGCCCCGGTGGAACGCCTCGCCAGAGCAGCGCGTGGCAGGCCCCCGCGGAGGATCAACGCAGTGGCTGAACACCGGGAAGGAACTGGCACTTGGAGTCCGGACATCTGAAACTTGGTAAGACTGGTCTTTGGAACTTGCCCCACTCCATTTGAGTGGAAGCGTGGCCTGATCACCCACGGCGTGCCTGTACCGGCACTTTGGTTTTTGTTTTTGACTTGACTTGGATTGCTTGATACTTTGGTTTTGGTTTTGACCTGGCTTGGATTTCTGGATACTCTGATTTTGGTTTTGATTTTGGTTTGGTGTAAACTGCAAAAGTGTGTGTGTGCCCTTTTTACCCGTTCTTTGTTTTGTGGTGTGCGTGTGGTGTGAGCGTGGTGTTTTGTCTCGAAGAAGCATGGGTCAGGCACAAAGTAAGCCCACCCCACTAGGAACTATGTTGAAAAATTTCAAGAAAGGATTTAAGGGAGACTACGGNGTTACTATGACACCAGGAAAACTTAGAACTTTGTGTGAAATAGACTGGCCAGCATTAGAGGTGGGTTGGCCATCAGAAGGAAGCCTGGACAGGTCCCTTGTTTCAAAGGTATGGCACAAGGTAACCTGTAAGCCAGGGCACCCAGACCAGTTCCCGTACATAGACACTTGGTTACAGCTGGTTTTAGACCCCCCGCCCCCACAGTGGTTGAGAGAACAGCAGCATAAGCGGCTGGCAGAGGCAAGGAAAGACCAGCAGAGAGANNGAGAGAGGAAAGAGACAGAGAGGAAAAGAGGCAAAGAGAGAGAGGAAGAGACAGAGAGGAAGAGACAGAGAGACAAAGAGGGAGTCAAGGAGAGAGAAAGAGAGAGGCAGAGAGAGAGGAAGAGACAGAGGCAAAAGGAAAGTCAAAGAGAGANAGAAAGTCAAAGAGAGAAAGAAAGAGAGAGATATACAAGTAGTTAAGAAAAAAAACAGTGTACCCTATTCCTTTAAAAGCCAAGGTAAATTTAAAACCTATAATTGATAATTGAAGGTATTCTCCGTAACCCTATAACACTCCAATACCACTTTGTTGTCAGTGTAAACAAGGGCGTATCCCGAAAGCACTGAGGCCTTCCTATCAAAAATCCTTAACCCAGTAACCCGCGGATGGCCCAAATGCATTCAATCTGTAGCGGCAACTGCTTTGCTAACAGAAAAAAGTAAAAAAAATAACTTTTAGAGGAAACCTCATTGTGAGCACACCTCACCAGTTCAGAAGTATCCTAAGGAAAAAAAAAAAAAAAAGGATGATTTAACATTAACCACTGAAAATTCCCTTAACCCAGCAGGTTTCCTAACAGGGGATCTAAATCTTAATTACCATACAAAGGTCCGACCAGACCTAGGAGGAACTCCCTTCAGGACAGGACGATAGATGGTTCCTCCCGGGTAATTGAAGAAAGAAAAAAAGCCATCTATACCAATTCTAAGTTAATTTGGACTAAACAAGGTCTTATTAATAGCAAAGGATAATTGAAATCCCAAACTTACAAGGTTTTCAACAAAAGTAAAGTTTGCTAAAAGTTAACAGTGTAACATGTATTATAGTAACTTCTAATCTTGTGGCCTTAGACAGTCTAGTCCACAGACATAAAGGAAGTTCGCTTTGGAAAAGAATGGTTATCATCTTCGAAAAAAAAAGAGAGGAAAGGGGGGGCAGAATTTATGTAAAAAGAGTGTTATATGGTAAATTCTTGTCCTGAAATAAATTAACTGGTTGTTTAAAGAAAGAAATGTTTGTAATAAGTCAGAAAGTTGAGGCATGTCGAAGAATTGTCTGCGAAAGTCGTGAAAGAAAAAAATGTTATAAAAAAAGAATTTATGCA
AGAAATGTTGTATAATTTAAAAGTAATTAGGCCTCCTGAATGTAAAACTATTGAAAAAACAGTTTATGTGCAAGGTGTATAAGGAAAGTAAAATATACCTTTGGTAAAAGGATTATAAGGAGGCATAAGAATGTGGATTTTTACCTACATTAAAAGGTTAAAAAAATTATTGTTTTGAAGGTTTAAGCAAGTTTTAAAACGTTAATTGTAAAGAAAATTCTGTGTGTAAACATATTAGCTAAAGTTAAAGAGGTATCATCCAGTTTTTCTGTGAACTGGACATTAAAGTAAAAACGCAACGGGTTTTTCTTAAAGCACCAACCTGCTCTTTAACAAAAATTATAAAAGGTTAAAAAGAGTCTATAAAAATCTTACCTTATGGTCAAACATTAAAAATTGGATAAATATGTCTACAAGGTTTTATTAAAATTAAGTTTAACATTAATAACACACTAATATAAAGGTGAAATTTAGCTTATCTGGTATAAAAATCATACAAGAAGCATTGTTAAATATAAAATGGTGTTTGGCTTTCTTTGGTCTAAAAACTAATAAAAATAGGTGCTAAAGGAAATTTCTCAGTAAAAAGGCACCAAGGACTATAAAGTCCACTGCCGATGTCCCCACATTTAAAACAAAAGGTCAATTTCTTAGAAATTATATACTTGGTTTATCTTCCACTTTCCTTTCCCTCAAAACTAAAAGTCTTTTAGCACATGTACCACCCCTAGAATTTCCGGTAAACCAGCACCAGCCTGAAGATCACGTTCTCATCAAAGGGTGGAAAGAAGGAAAACTCGAGCCAGCCTGGGAAGGACCCTACCTTGTGCTGCTAACCACCGAGACTGCTGTTCGTACAGCGAAAAAGGGATGGACTCATCACACCCGAGTCAAGAAAGCGCCACCCCCTCCAGAGTCGTGGGCCATAGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAATTTAACTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATTGCTGACCATCTAGTTATTAACATAACCAAGTCAATTTCGCCTCAAACTATTGCATTTAATGCTTGCCTTGTTATACCCTGTGGGGACTTGCCAAGTCAAAGACAGCTCTCTACTTCAGAAAAGTACCTCTGTCCCTCCTGACTCTCCTCAGACTGGGCATTAGTAAATTAGGACCATTTAATCCGGGGAGATTTCGATAAAGACCCCAGTGTCAACCAGGAGTCTTGCCCCCCGATGTAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACTAAAGAGCAAGGATGGACTGCCCCAACCGGTTTTTGTAATTTCCTAAAACCATACATTCATTTTACTAGAGGATCATAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTAAGACAGGATACCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCATCCGCACGTTAAACAAAAGCAATTGTTATGCTTGTGCACATGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAAGGTGGTCCTCCAGTCGACCAGGCGTGGGCTGCATGGTAGCTCTTTTCCAGGATTCTACAGCCTGGAGTAATAAGTCGTGCCAAGCTCTCTCTGCTATATCCCGAAGTCCGGCACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCAACACTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTGAAGGGATGCAGTGAGCTTAAGAATTTTCAAGAGCTTATCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGGCACTCTGCCGAATAACTGGAGTGGCACTTGTGCTTTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGGGAGGAAAAATAAGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTC
TCACGTCTATTTAGACGCAATTGGAGTCCCACGGGGAATACCAGATCAATTTAAAGCTTGAAATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAGTTAATAAAAATGTAGATTGGATAAACTACATCTATTACAACCAACAGCAACGAGCTTTTCATGAGTTAAAAGAAAAACTCATGTCGGCCCCAGCCCTGGGGCTACCTGACCTGACAAAACCCTTTACACTCTATGTGTCAGAAAGAGAAAAAATGGCAGTTGGAGTTTTAACCCAGACTGTGGGGCCCTGGCCAAGGCCAGTGGCCTATCTCTCAAAACAACTAGACGGGGTTTCCAAAGGCTGGCCCCCATGTCTAAGGGCCCTGGCAGCAACGGCCCTGTTAGCACAAGAAGCAGATAAACTAACCCTTGGGCAAAACCTGAATATAAAGGCCCCCCATGCTGTGGTAACTTTAATGAATACCAAAGGACATCATTGGCTAACGAATGCTAGATTAACCAAGTACCAAAGCTTGCTATGTGAAAATCCCCGCATAACCATTGAAGTTTGCAACACCCTAAACCCCGCCACCTTGCTCCCGGTATCAGAGAGCCCAGTTGAACATAACTGTGTAGAGGTGTTGGACTCAGTTTATTCTAGCGGGCCCAACCTCCGAGACCATCCTTGAACATCAGTAGACTGTGAGCTGTACGTGGACGGGAGCAGCTTCGCCAACCCCTGCAAAGTGACTCTGAAGAAGACGACAAGCCCTGCTCCAGTCACACCCGGAAGCTGACTGGTCCACGCACGGCCGAAGCATGAGAAAACTCATCGCGGGACTCATTTTCCTTAAAATTTGGACTTGTACAGTAAGGACTTCAACTGACCTTCCTCAGACTGAGGACTGTTCCCAGTGTATACATCAAGTCACTGAGGTAGGACAAAAGGTTGCTACGGTCCTATTATTTTATGGTTATTATAAGTGTACCGGAACTCTAAAAAGAACTTGTTTGTATAATGTTATTCTATACAAGGTATGTAGCCCAGGAAATGACCAACCTGATGTGTGTTATGACCCATCTGAGCCTCCCATGACCACAGTTTTTAAAATAAGATTAAGGACTGAGGACTGGTGGGGGCTCATAAACGATACGAGTAAAGTGTTAGCCAAAACAGAAGAAAAAGGGGTGCCCAAACAAGTCACCTTGAAATTTGATGCCTGTGCTGTCATTAATAGTAATAAGTNTAGGAATAGGATGTGGTTCTCTTAATTNNAGGAAAGAGGCTATATGGCAGAAAATAAGTACATTTGTCATGAATTAGGACTGTGTGGAAATAAATGTGGATACTGGTCTTGTGTCATTTAGGCTACTTGGATAAAAAATGAAAAGGATCCTGTCCACCTTCAGAAAGGGAAAAGTGGCCCTTCCTGTACCAGTGGTCAGTGTAACCCCTTAGAACTAGTAATAACCAACCCCCTTGATCCTCGCTGGAAAAAAGGGGAGCGTGTAACCCTAGGAATCGATGGGGCCGGACTGGATCCTCGAGTAAATATCGTGGTTCGAGGAGAAGTTTATAAACGCTCTCCTGAGCCAGTATTTCAAACCTTCTATGATGAACTGAATGTGCCAGTACCAGAAATTCCAGGAAAAACAAGAAATTTGTTTTTGCAATTAGCCGAGCATGTAGCCCAGTCTCTCAATGTCACTTCATGTTATGTATGTGGAGGAACTGTAATGGGAGATCAATGGCCATGGGAAGCCCGAGAATTAGTACCTACAGACCCAGTTCCTGATGAATTCCCGGCTCAAAAGAATCACCCTGATAACTTCTGGGTCCTAAAAGCCTCAATCATTAGACAATACTGTATAGCAAGAGTGGGGAAGGACTTCACCCTTCCCGTGGGAAGACTCAGCTGCCTTGGGCAAAAACTGTATAATAGTACTACAAAAACAGCCACCTGGTGGAGTTCAAACCACACTAAGAAAAATCCATTTAGTAAATTCCCAAAG
TTGCAAACCGTGTGGACCCACCCGGAGTCCCACCGGGACTGGACAGCCCCCACTGGATTATACTGGATATGTGGGCATAGAGCTTACGCCAAATTACCCGACCAGTGGGCAGGTAGTTGTGTTATTGGCACTATTAAACCATCTTTCTTCCTACTGCCCATAAAGACAGGCGAACTCCTGGGCTTCCCTGTCTATGCTTCCCGCGAAAAGAGAAGCATAGCTATAGGAAATTGGAAAGATGATAAATGGCCCCCTGAGAGAATCATACAATATTATGGGCCTGCTACTTGGGCACAAGACGGCTCGTGGGGATACCGGACCCCCATTTACATGCTCAACCGAATCATACGGTTACAAGCTGTCTTAGAAATAATCACTAATAAGACCGGCAGAGCCTTGACTATTCTGGCCCGGCAAGAAACTCAGATGAGAAATGCTATCTATCAAAATAGATTGGCTCTCGACTACTTGCTAGCAGCTGAAGGAGGGGTCTGTAGGAAATTTAACCTTACTAATTGCTGCCTACACATAGATGATCAAGGGCAAGTAGTTGAAGACATAGTTAGAGATATGACAAAACTGGCACATGTGCCCGTGCAAGTGTGGCATGGATTTGATCCTGGGGCCATGTTTGGAAAATGGTTCCCAGCGCTAGGAGGATTTAAAACTCTTATAATAGGAGTTATAATAGTAATAGGAACCTGCTTACTGCTCCCTTGTTTGCTACCTGTACTTCTTCAAATGATAAAAAGCTTCATCGCTACCTTAGTTCACCAAAATGCTTCAGCACAAGTGTACTATATGAATCACTATCGATCTGTCTTGCAAGAAGACATGGGTAGTGAGAATGAAAGTGAGAACTCCCACTANTGAGTGAGATTCTCAAAGGGGGGGAA
|
TF motifs of the consensus sequence
Use FIMO to detect transcription factor motifs in the consensus sequence of the TE family.
TE_family | TFBS | Start | End | Strand | Score | Matched sequence |
---|---|---|---|---|---|---|
Harlequin | BPC5 | 896 | 925 | + | -42.65 | GAGAGAGGAAAGAGACAGAGAGGAAAAGAG |
Harlequin | BPC5 | 1805 | 1834 | + | -43.94 | CGAAAAAAAAAGAGAGGAAAGGGGGGGCAG |
Harlequin | BPC5 | 1070 | 1099 | + | -45.22 | AAAGAGAGAGATATACAAGTAGTTAAGAAA |
Harlequin | BPC5 | 898 | 927 | + | -45.65 | GAGAGGAAAGAGACAGAGAGGAAAAGAGGC |
Harlequin | BPC5 | 2990 | 3019 | - | -45.68 | CAATAGAGCGAGGAAAGAAGAAAGAGTAAT |
Harlequin | BPC5 | 2972 | 3001 | - | -46.09 | AGAAAGAGTAATAGAATAGATGAAAGAGTT |
Harlequin | BPC5 | 2994 | 3023 | - | -47.79 | TCAGCAATAGAGCGAGGAAAGAAGAAAGAG |
Harlequin | BPC5 | 1809 | 1838 | + | -48.24 | AAAAAAAGAGAGGAAAGGGGGGGCAGAATT |
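
Several of the BPC5 hits listed above overlap one another, so they are easier to read once collapsed into distinct regions. The following is a minimal sketch of that post-processing, not part of the database's own pipeline. It assumes the Start and End columns are 1-based and inclusive (consistent with End - Start + 1 equalling the length of each matched sequence); the names `Hit`, `parse_hits` and `merge_overlaps` are hypothetical helpers introduced only for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    motif: str
    start: int    # assumed 1-based, inclusive
    end: int      # assumed 1-based, inclusive
    strand: str
    score: float
    matched: str


# Rows copied verbatim from the FIMO table above.
ROWS = """\
Harlequin | BPC5 | 896 | 925 | + | -42.65 | GAGAGAGGAAAGAGACAGAGAGGAAAAGAG |
Harlequin | BPC5 | 1805 | 1834 | + | -43.94 | CGAAAAAAAAAGAGAGGAAAGGGGGGGCAG |
Harlequin | BPC5 | 1070 | 1099 | + | -45.22 | AAAGAGAGAGATATACAAGTAGTTAAGAAA |
Harlequin | BPC5 | 898 | 927 | + | -45.65 | GAGAGGAAAGAGACAGAGAGGAAAAGAGGC |
Harlequin | BPC5 | 2990 | 3019 | - | -45.68 | CAATAGAGCGAGGAAAGAAGAAAGAGTAAT |
Harlequin | BPC5 | 2972 | 3001 | - | -46.09 | AGAAAGAGTAATAGAATAGATGAAAGAGTT |
Harlequin | BPC5 | 2994 | 3023 | - | -47.79 | TCAGCAATAGAGCGAGGAAAGAAGAAAGAG |
Harlequin | BPC5 | 1809 | 1838 | + | -48.24 | AAAAAAAGAGAGGAAAGGGGGGGCAGAATT |
"""


def parse_hits(text):
    """Parse pipe-delimited table rows into Hit records."""
    hits = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        _family, motif, start, end, strand, score, matched = cells
        hits.append(Hit(motif, int(start), int(end), strand, float(score), matched))
    return hits


def merge_overlaps(hits):
    """Collapse overlapping hits of the same motif and strand into regions.

    Returns a list of (motif, strand, start, end) tuples.
    """
    regions = []
    for h in sorted(hits, key=lambda h: (h.motif, h.strand, h.start)):
        last = regions[-1] if regions else None
        if last and last[0] == h.motif and last[1] == h.strand and h.start <= last[3]:
            # Extend the current region to cover this overlapping hit.
            regions[-1] = (last[0], last[1], last[2], max(last[3], h.end))
        else:
            regions.append((h.motif, h.strand, h.start, h.end))
    return regions


hits = parse_hits(ROWS)
regions = merge_overlaps(hits)
for region in regions:
    print(region)
```

Under these assumptions, the eight hits collapse into four regions: three on the plus strand (896-927, 1070-1099, 1805-1838) and one on the minus strand (2972-3023).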
TFBS enrichment in GRCh38
Use Fisher's exact test to test whether transcription factor binding sites are enriched in copies of the TE family in the GRCh38 genome assembly.
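
The page does not state how the contingency table is built, so the following is only a sketch of a one-sided Fisher's exact test under an assumed layout: rows are "inside a copy of the TE family" versus "elsewhere", and columns are "overlaps a binding site of the factor" versus "does not". The function name `fisher_exact_greater` and the toy counts are hypothetical and are not taken from the database.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test (enrichment) for the 2x2 table

                     with TFBS   without TFBS
        TE family        a            b
        background       c            d

    Returns P(X >= a), where X is hypergeometric with the table margins fixed.
    """
    row1 = a + b          # total units in the TE family
    col1 = a + c          # total units with a TFBS
    n = a + b + c + d     # grand total
    denom = comb(n, row1)
    upper = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    numer = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    return numer / denom


# Toy counts for illustration only (not real data).
p = fisher_exact_greater(3, 1, 1, 3)
print(f"{p:.4f}")  # 0.2429, i.e. 17/70
```

A small p-value under this layout would indicate that binding sites of the factor fall inside copies of the TE family more often than expected from the margins alone. In practice a library routine such as `scipy.stats.fisher_exact` with `alternative="greater"` would normally be used, followed by a multiple-testing correction across factors.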