Harlequin
DF ID | DF0000017 |
---|---|
TE superfamily | ERV1 |
TE class | LTR |
Species | Catarrhini |
Length | 6896 |
Kimura value | 6.23 |
Tau index | 0.9729 |
Description | Internal region of ERV1 endogenous retrovirus, Harlequin subfamily |
Comment | Harlequin is the internal part of an endogenous retrovirus-like element flanked by LTR2s. Its consensus sequence was reconstructed from eight full-length copies, which are ~92% similar to the consensus. One copy is inserted into an Alu-Y repeat. Harlequin was assembled from several different retroviruses, including HERVE, HERVI, HERV17 and MER4I-group sequences. We suggest that shuffling of non-homologous endogenous retroviral sequences may generate mosaic retrovirus-like elements. Presumably, the shuffling occurs in virions, where RNA genomes of different retroviruses could be packaged together. Self-catalytic RNA recombination or jumping of reverse transcriptase between non-homologous RNA molecules may induce the shuffling. |
Sequence |
TTTCTTGGTTCCCTGACCGGGAAGCGAGGTGATTAACGGACGGTCGAGGCAGCCCCTTAGGCGGCTTAGGCCTGCCCTGTGGAGCATCCCTGCGGGGGACTCCGGCCAGCCTGAGCGACGCGGATCCTGAGAGCGCTCCCGGGTAGGCAATTGCCCCGGTGGAACGCCTCGCCAGAGCAGCGCGTGGCAGGCCCCCGCGGAGGATCAACGCAGTGGCTGAACACCGGGAAGGAACTGGCACTTGGAGTCCGGACATCTGAAACTTGGTAAGACTGGTCTTTGGAACTTGCCCCACTCCATTTGAGTGGAAGCGTGGCCTGATCACCCACGGCGTGCCTGTACCGGCACTTTGGTTTTTGTTTTTGACTTGACTTGGATTGCTTGATACTTTGGTTTTGGTTTTGACCTGGCTTGGATTTCTGGATACTCTGATTTTGGTTTTGATTTTGGTTTGGTGTAAACTGCAAAAGTGTGTGTGTGCCCTTTTTACCCGTTCTTTGTTTTGTGGTGTGCGTGTGGTGTGAGCGTGGTGTTTTGTCTCGAAGAAGCATGGGTCAGGCACAAAGTAAGCCCACCCCACTAGGAACTATGTTGAAAAATTTCAAGAAAGGATTTAAGGGAGACTACGGNGTTACTATGACACCAGGAAAACTTAGAACTTTGTGTGAAATAGACTGGCCAGCATTAGAGGTGGGTTGGCCATCAGAAGGAAGCCTGGACAGGTCCCTTGTTTCAAAGGTATGGCACAAGGTAACCTGTAAGCCAGGGCACCCAGACCAGTTCCCGTACATAGACACTTGGTTACAGCTGGTTTTAGACCCCCCGCCCCCACAGTGGTTGAGAGAACAGCAGCATAAGCGGCTGGCAGAGGCAAGGAAAGACCAGCAGAGAGANNGAGAGAGGAAAGAGACAGAGAGGAAAAGAGGCAAAGAGAGAGAGGAAGAGACAGAGAGGAAGAGACAGAGAGACAAAGAGGGAGTCAAGGAGAGAGAAAGAGAGAGGCAGAGAGAGAGGAAGAGACAGAGGCAAAAGGAAAGTCAAAGAGAGANAGAAAGTCAAAGAGAGAAAGAAAGAGAGAGATATACAAGTAGTTAAGAAAAAAAACAGTGTACCCTATTCCTTTAAAAGCCAAGGTAAATTTAAAACCTATAATTGATAATTGAAGGTATTCTCCGTAACCCTATAACACTCCAATACCACTTTGTTGTCAGTGTAAACAAGGGCGTATCCCGAAAGCACTGAGGCCTTCCTATCAAAAATCCTTAACCCAGTAACCCGCGGATGGCCCAAATGCATTCAATCTGTAGCGGCAACTGCTTTGCTAACAGAAAAAAGTAAAAAAAATAACTTTTAGAGGAAACCTCATTGTGAGCACACCTCACCAGTTCAGAAGTATCCTAAGGAAAAAAAAAAAAAAAAGGATGATTTAACATTAACCACTGAAAATTCCCTTAACCCAGCAGGTTTCCTAACAGGGGATCTAAATCTTAATTACCATACAAAGGTCCGACCAGACCTAGGAGGAACTCCCTTCAGGACAGGACGATAGATGGTTCCTCCCGGGTAATTGAAGAAAGAAAAAAAGCCATCTATACCAATTCTAAGTTAATTTGGACTAAACAAGGTCTTATTAATAGCAAAGGATAATTGAAATCCCAAACTTACAAGGTTTTCAACAAAAGTAAAGTTTGCTAAAAGTTAACAGTGTAACATGTATTATAGTAACTTCTAATCTTGTGGCCTTAGACAGTCTAGTCCACAGACATAAAGGAAGTTCGCTTTGGAAAAGAATGGTTATCATCTTCGAAAAAAAAAGAGAGGAAAGGGGGGGCAGAATTTATGTAAAAAGAGTGTTATATGGTAAATTCTTGTCCTGAAATAAATTAACTGGTTGTTTAAAGAAAGAAATGTTTGTAATAAGTCAGAAAGTTGAGGCATGTCGAAGAATTGTCTGCGAAAGTCGTGAAAGAAAAAAATGTTATAAAAAAAGAATTTATGCA
AGAAATGTTGTATAATTTAAAAGTAATTAGGCCTCCTGAATGTAAAACTATTGAAAAAACAGTTTATGTGCAAGGTGTATAAGGAAAGTAAAATATACCTTTGGTAAAAGGATTATAAGGAGGCATAAGAATGTGGATTTTTACCTACATTAAAAGGTTAAAAAAATTATTGTTTTGAAGGTTTAAGCAAGTTTTAAAACGTTAATTGTAAAGAAAATTCTGTGTGTAAACATATTAGCTAAAGTTAAAGAGGTATCATCCAGTTTTTCTGTGAACTGGACATTAAAGTAAAAACGCAACGGGTTTTTCTTAAAGCACCAACCTGCTCTTTAACAAAAATTATAAAAGGTTAAAAAGAGTCTATAAAAATCTTACCTTATGGTCAAACATTAAAAATTGGATAAATATGTCTACAAGGTTTTATTAAAATTAAGTTTAACATTAATAACACACTAATATAAAGGTGAAATTTAGCTTATCTGGTATAAAAATCATACAAGAAGCATTGTTAAATATAAAATGGTGTTTGGCTTTCTTTGGTCTAAAAACTAATAAAAATAGGTGCTAAAGGAAATTTCTCAGTAAAAAGGCACCAAGGACTATAAAGTCCACTGCCGATGTCCCCACATTTAAAACAAAAGGTCAATTTCTTAGAAATTATATACTTGGTTTATCTTCCACTTTCCTTTCCCTCAAAACTAAAAGTCTTTTAGCACATGTACCACCCCTAGAATTTCCGGTAAACCAGCACCAGCCTGAAGATCACGTTCTCATCAAAGGGTGGAAAGAAGGAAAACTCGAGCCAGCCTGGGAAGGACCCTACCTTGTGCTGCTAACCACCGAGACTGCTGTTCGTACAGCGAAAAAGGGATGGACTCATCACACCCGAGTCAAGAAAGCGCCACCCCCTCCAGAGTCGTGGGCCATAGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAATTTAACTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATTGCTGACCATCTAGTTATTAACATAACCAAGTCAATTTCGCCTCAAACTATTGCATTTAATGCTTGCCTTGTTATACCCTGTGGGGACTTGCCAAGTCAAAGACAGCTCTCTACTTCAGAAAAGTACCTCTGTCCCTCCTGACTCTCCTCAGACTGGGCATTAGTAAATTAGGACCATTTAATCCGGGGAGATTTCGATAAAGACCCCAGTGTCAACCAGGAGTCTTGCCCCCCGATGTAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACTAAAGAGCAAGGATGGACTGCCCCAACCGGTTTTTGTAATTTCCTAAAACCATACATTCATTTTACTAGAGGATCATAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTAAGACAGGATACCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCATCCGCACGTTAAACAAAAGCAATTGTTATGCTTGTGCACATGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAAGGTGGTCCTCCAGTCGACCAGGCGTGGGCTGCATGGTAGCTCTTTTCCAGGATTCTACAGCCTGGAGTAATAAGTCGTGCCAAGCTCTCTCTGCTATATCCCGAAGTCCGGCACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCAACACTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTGAAGGGATGCAGTGAGCTTAAGAATTTTCAAGAGCTTATCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGGCACTCTGCCGAATAACTGGAGTGGCACTTGTGCTTTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGGGAGGAAAAATAAGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTC
TCACGTCTATTTAGACGCAATTGGAGTCCCACGGGGAATACCAGATCAATTTAAAGCTTGAAATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAGTTAATAAAAATGTAGATTGGATAAACTACATCTATTACAACCAACAGCAACGAGCTTTTCATGAGTTAAAAGAAAAACTCATGTCGGCCCCAGCCCTGGGGCTACCTGACCTGACAAAACCCTTTACACTCTATGTGTCAGAAAGAGAAAAAATGGCAGTTGGAGTTTTAACCCAGACTGTGGGGCCCTGGCCAAGGCCAGTGGCCTATCTCTCAAAACAACTAGACGGGGTTTCCAAAGGCTGGCCCCCATGTCTAAGGGCCCTGGCAGCAACGGCCCTGTTAGCACAAGAAGCAGATAAACTAACCCTTGGGCAAAACCTGAATATAAAGGCCCCCCATGCTGTGGTAACTTTAATGAATACCAAAGGACATCATTGGCTAACGAATGCTAGATTAACCAAGTACCAAAGCTTGCTATGTGAAAATCCCCGCATAACCATTGAAGTTTGCAACACCCTAAACCCCGCCACCTTGCTCCCGGTATCAGAGAGCCCAGTTGAACATAACTGTGTAGAGGTGTTGGACTCAGTTTATTCTAGCGGGCCCAACCTCCGAGACCATCCTTGAACATCAGTAGACTGTGAGCTGTACGTGGACGGGAGCAGCTTCGCCAACCCCTGCAAAGTGACTCTGAAGAAGACGACAAGCCCTGCTCCAGTCACACCCGGAAGCTGACTGGTCCACGCACGGCCGAAGCATGAGAAAACTCATCGCGGGACTCATTTTCCTTAAAATTTGGACTTGTACAGTAAGGACTTCAACTGACCTTCCTCAGACTGAGGACTGTTCCCAGTGTATACATCAAGTCACTGAGGTAGGACAAAAGGTTGCTACGGTCCTATTATTTTATGGTTATTATAAGTGTACCGGAACTCTAAAAAGAACTTGTTTGTATAATGTTATTCTATACAAGGTATGTAGCCCAGGAAATGACCAACCTGATGTGTGTTATGACCCATCTGAGCCTCCCATGACCACAGTTTTTAAAATAAGATTAAGGACTGAGGACTGGTGGGGGCTCATAAACGATACGAGTAAAGTGTTAGCCAAAACAGAAGAAAAAGGGGTGCCCAAACAAGTCACCTTGAAATTTGATGCCTGTGCTGTCATTAATAGTAATAAGTNTAGGAATAGGATGTGGTTCTCTTAATTNNAGGAAAGAGGCTATATGGCAGAAAATAAGTACATTTGTCATGAATTAGGACTGTGTGGAAATAAATGTGGATACTGGTCTTGTGTCATTTAGGCTACTTGGATAAAAAATGAAAAGGATCCTGTCCACCTTCAGAAAGGGAAAAGTGGCCCTTCCTGTACCAGTGGTCAGTGTAACCCCTTAGAACTAGTAATAACCAACCCCCTTGATCCTCGCTGGAAAAAAGGGGAGCGTGTAACCCTAGGAATCGATGGGGCCGGACTGGATCCTCGAGTAAATATCGTGGTTCGAGGAGAAGTTTATAAACGCTCTCCTGAGCCAGTATTTCAAACCTTCTATGATGAACTGAATGTGCCAGTACCAGAAATTCCAGGAAAAACAAGAAATTTGTTTTTGCAATTAGCCGAGCATGTAGCCCAGTCTCTCAATGTCACTTCATGTTATGTATGTGGAGGAACTGTAATGGGAGATCAATGGCCATGGGAAGCCCGAGAATTAGTACCTACAGACCCAGTTCCTGATGAATTCCCGGCTCAAAAGAATCACCCTGATAACTTCTGGGTCCTAAAAGCCTCAATCATTAGACAATACTGTATAGCAAGAGTGGGGAAGGACTTCACCCTTCCCGTGGGAAGACTCAGCTGCCTTGGGCAAAAACTGTATAATAGTACTACAAAAACAGCCACCTGGTGGAGTTCAAACCACACTAAGAAAAATCCATTTAGTAAATTCCCAAAG
TTGCAAACCGTGTGGACCCACCCGGAGTCCCACCGGGACTGGACAGCCCCCACTGGATTATACTGGATATGTGGGCATAGAGCTTACGCCAAATTACCCGACCAGTGGGCAGGTAGTTGTGTTATTGGCACTATTAAACCATCTTTCTTCCTACTGCCCATAAAGACAGGCGAACTCCTGGGCTTCCCTGTCTATGCTTCCCGCGAAAAGAGAAGCATAGCTATAGGAAATTGGAAAGATGATAAATGGCCCCCTGAGAGAATCATACAATATTATGGGCCTGCTACTTGGGCACAAGACGGCTCGTGGGGATACCGGACCCCCATTTACATGCTCAACCGAATCATACGGTTACAAGCTGTCTTAGAAATAATCACTAATAAGACCGGCAGAGCCTTGACTATTCTGGCCCGGCAAGAAACTCAGATGAGAAATGCTATCTATCAAAATAGATTGGCTCTCGACTACTTGCTAGCAGCTGAAGGAGGGGTCTGTAGGAAATTTAACCTTACTAATTGCTGCCTACACATAGATGATCAAGGGCAAGTAGTTGAAGACATAGTTAGAGATATGACAAAACTGGCACATGTGCCCGTGCAAGTGTGGCATGGATTTGATCCTGGGGCCATGTTTGGAAAATGGTTCCCAGCGCTAGGAGGATTTAAAACTCTTATAATAGGAGTTATAATAGTAATAGGAACCTGCTTACTGCTCCCTTGTTTGCTACCTGTACTTCTTCAAATGATAAAAAGCTTCATCGCTACCTTAGTTCACCAAAATGCTTCAGCACAAGTGTACTATATGAATCACTATCGATCTGTCTTGCAAGAAGACATGGGTAGTGAGAATGAAAGTGAGAACTCCCACTANTGAGTGAGATTCTCAAAGGGGGGGAA
|
TF motifs of the consensus sequence
FIMO was used to detect transcription factor motifs in the consensus sequence of the TE family.
TE_family | TFBS | Start | End | Strand | Score | Matched sequence |
---|---|---|---|---|---|---|
Harlequin | BPC6 | 991 | 1011 | - | 20.87 | CTCTCTCTGCCTCTCTCTTTC |
Harlequin | BPC5 | 996 | 1025 | + | 20.80 | AGAGAGGCAGAGAGAGAGGAAGAGACAGAG |
Harlequin | BPC1 | 945 | 968 | + | 20.73 | GACAGAGAGGAAGAGACAGAGAGA |
Harlequin | BPC1 | 925 | 948 | + | 20.64 | GGCAAAGAGAGAGAGGAAGAGACA |
Harlequin | BPC1 | 1053 | 1076 | + | 20.53 | AAGTCAAAGAGAGAAAGAAAGAGA |
Harlequin | DOF5.1 | 1330 | 1348 | + | 20.41 | AAAAAGTAAAAAAAATAAC |
Harlequin | BPC6 | 999 | 1019 | - | 20.18 | CTCTTCCTCTCTCTCTGCCTC |
Harlequin | DOF3.6 | 1404 | 1424 | - | 20.15 | CATCCTTTTTTTTTTTTTTTT |
Harlequin | DOF5.8 | 1330 | 1348 | - | 19.86 | GTTATTTTTTTTACTTTTT |
Harlequin | BPC5 | 940 | 969 | + | 19.82 | GAAGAGACAGAGAGGAAGAGACAGAGAGAC |
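The rows above can be cross-checked against one another. The following is a minimal sketch, assuming that Start and End are 1-based inclusive consensus coordinates and that minus-strand matches are reported as the reverse complement of the consensus. Both assumptions are consistent with the rows shown, but neither is stated on this page. The `Hit` type and the helper names are hypothetical and exist only for illustration.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One row of the motif table (hypothetical container for illustration)."""
    tf: str
    start: int   # assumed 1-based, inclusive
    end: int     # assumed 1-based, inclusive
    strand: str  # "+" or "-"
    seq: str     # matched sequence as reported in the table


COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def forward_seq(hit: Hit) -> str:
    """Return the matched sequence as it would read on the plus strand."""
    if hit.strand == "-":
        return hit.seq.translate(COMPLEMENT)[::-1]
    return hit.seq


def overlap_agrees(a: Hit, b: Hit) -> Optional[bool]:
    """Compare two hits over their shared coordinates.

    Returns None when the hits do not overlap, otherwise True or False
    depending on whether their plus-strand sequences are identical there.
    """
    lo, hi = max(a.start, b.start), min(a.end, b.end)
    if lo > hi:
        return None
    part_a = forward_seq(a)[lo - a.start: hi - a.start + 1]
    part_b = forward_seq(b)[lo - b.start: hi - b.start + 1]
    return part_a == part_b


# Four rows copied from the table above.
hits = [
    Hit("BPC6", 991, 1011, "-", "CTCTCTCTGCCTCTCTCTTTC"),
    Hit("BPC5", 996, 1025, "+", "AGAGAGGCAGAGAGAGAGGAAGAGACAGAG"),
    Hit("DOF5.1", 1330, 1348, "+", "AAAAAGTAAAAAAAATAAC"),
    Hit("DOF5.8", 1330, 1348, "-", "GTTATTTTTTTTACTTTTT"),
]

if __name__ == "__main__":
    for i, a in enumerate(hits):
        for b in hits[i + 1:]:
            print(a.tf, b.tf, overlap_agrees(a, b))
```

Hits on opposite strands that cover the same coordinates, such as DOF5.1 and DOF5.8 at 1330-1348, should agree once the minus-strand match is reverse-complemented. A disagreement would point to a coordinate or strand convention different from the one assumed here.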
TFBS enrichment in GRCh38
Fisher's exact test was used to test for enrichment of transcription factor binding sites in copies of this TE family in GRCh38.
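As an illustration of the kind of test named here, the following is a minimal sketch of a one-sided Fisher's exact test on a 2x2 contingency table, written with the Python standard library only. The table layout (TE-family copies versus background regions, with and without a binding site), the one-sided alternative, and the function name are assumptions. The page does not specify how its counts were defined, and the numbers in the usage example are hypothetical.

```python
from math import comb


def fisher_enrichment_p(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact p-value for enrichment.

    Assumed table layout (not taken from the page):

                      with site   without site
        TE family         a            b
        background        c            d

    Returns the hypergeometric probability of observing a or more
    TE-family regions with a site, given fixed row and column totals.
    """
    n = a + b + c + d
    row1 = a + b          # total TE-family regions
    col1 = a + c          # total regions with a site
    k_max = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, k_max + 1)
    )
    return numerator / comb(n, col1)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(fisher_enrichment_p(3, 1, 1, 3))
```

A small p-value would indicate that binding sites occur in the family's copies more often than expected from the background under the fixed margins. A two-sided test, or a different choice of background, would give different values.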