Harlequin
Basic information
DF ID | DF0000017 |
---|---|
TE superfamily | ERV1 |
TE class | LTR |
Species | Catarrhini |
Length | 6896 |
Kimura value | 6.23 |
Tau index | 0.9729 |
Description | Internal region of ERV1 endogenous retrovirus, Harlequin subfamily |
Comment | Harlequin is the internal part of an endogenous retrovirus-like element flanked by LTR2s. Its consensus sequence was reconstructed from eight full-length copies, which are ~92% similar to the consensus. One copy is inserted into an Alu-Y repeat. Harlequin is a mosaic built from several different retroviruses, including HERVE, HERVI, HERV17 and MER4I-group sequences. We suggest that shuffling of non-homologous endogenous retroviral sequences may generate mosaic retrovirus-like elements. Presumably, the shuffling occurs in virions, where RNA genomes of different retroviruses can be packaged together. Self-catalytic RNA recombination or jumping of reverse transcriptase between non-homologous RNA molecules may induce the shuffling. |
Sequence |
TTTCTTGGTTCCCTGACCGGGAAGCGAGGTGATTAACGGACGGTCGAGGCAGCCCCTTAGGCGGCTTAGGCCTGCCCTGTGGAGCATCCCTGCGGGGGACTCCGGCCAGCCTGAGCGACGCGGATCCTGAGAGCGCTCCCGGGTAGGCAATTGCCCCGGTGGAACGCCTCGCCAGAGCAGCGCGTGGCAGGCCCCCGCGGAGGATCAACGCAGTGGCTGAACACCGGGAAGGAACTGGCACTTGGAGTCCGGACATCTGAAACTTGGTAAGACTGGTCTTTGGAACTTGCCCCACTCCATTTGAGTGGAAGCGTGGCCTGATCACCCACGGCGTGCCTGTACCGGCACTTTGGTTTTTGTTTTTGACTTGACTTGGATTGCTTGATACTTTGGTTTTGGTTTTGACCTGGCTTGGATTTCTGGATACTCTGATTTTGGTTTTGATTTTGGTTTGGTGTAAACTGCAAAAGTGTGTGTGTGCCCTTTTTACCCGTTCTTTGTTTTGTGGTGTGCGTGTGGTGTGAGCGTGGTGTTTTGTCTCGAAGAAGCATGGGTCAGGCACAAAGTAAGCCCACCCCACTAGGAACTATGTTGAAAAATTTCAAGAAAGGATTTAAGGGAGACTACGGNGTTACTATGACACCAGGAAAACTTAGAACTTTGTGTGAAATAGACTGGCCAGCATTAGAGGTGGGTTGGCCATCAGAAGGAAGCCTGGACAGGTCCCTTGTTTCAAAGGTATGGCACAAGGTAACCTGTAAGCCAGGGCACCCAGACCAGTTCCCGTACATAGACACTTGGTTACAGCTGGTTTTAGACCCCCCGCCCCCACAGTGGTTGAGAGAACAGCAGCATAAGCGGCTGGCAGAGGCAAGGAAAGACCAGCAGAGAGANNGAGAGAGGAAAGAGACAGAGAGGAAAAGAGGCAAAGAGAGAGAGGAAGAGACAGAGAGGAAGAGACAGAGAGACAAAGAGGGAGTCAAGGAGAGAGAAAGAGAGAGGCAGAGAGAGAGGAAGAGACAGAGGCAAAAGGAAAGTCAAAGAGAGANAGAAAGTCAAAGAGAGAAAGAAAGAGAGAGATATACAAGTAGTTAAGAAAAAAAACAGTGTACCCTATTCCTTTAAAAGCCAAGGTAAATTTAAAACCTATAATTGATAATTGAAGGTATTCTCCGTAACCCTATAACACTCCAATACCACTTTGTTGTCAGTGTAAACAAGGGCGTATCCCGAAAGCACTGAGGCCTTCCTATCAAAAATCCTTAACCCAGTAACCCGCGGATGGCCCAAATGCATTCAATCTGTAGCGGCAACTGCTTTGCTAACAGAAAAAAGTAAAAAAAATAACTTTTAGAGGAAACCTCATTGTGAGCACACCTCACCAGTTCAGAAGTATCCTAAGGAAAAAAAAAAAAAAAAGGATGATTTAACATTAACCACTGAAAATTCCCTTAACCCAGCAGGTTTCCTAACAGGGGATCTAAATCTTAATTACCATACAAAGGTCCGACCAGACCTAGGAGGAACTCCCTTCAGGACAGGACGATAGATGGTTCCTCCCGGGTAATTGAAGAAAGAAAAAAAGCCATCTATACCAATTCTAAGTTAATTTGGACTAAACAAGGTCTTATTAATAGCAAAGGATAATTGAAATCCCAAACTTACAAGGTTTTCAACAAAAGTAAAGTTTGCTAAAAGTTAACAGTGTAACATGTATTATAGTAACTTCTAATCTTGTGGCCTTAGACAGTCTAGTCCACAGACATAAAGGAAGTTCGCTTTGGAAAAGAATGGTTATCATCTTCGAAAAAAAAAGAGAGGAAAGGGGGGGCAGAATTTATGTAAAAAGAGTGTTATATGGTAAATTCTTGTCCTGAAATAAATTAACTGGTTGTTTAAAGAAAGAAATGTTTGTAATAAGTCAGAAAGTTGAGGCATGTCGAAGAATTGTCTGCGAAAGTCGTGAAAGAAAAAAATGTTATAAAAAAAGAATTTATGCA
AGAAATGTTGTATAATTTAAAAGTAATTAGGCCTCCTGAATGTAAAACTATTGAAAAAACAGTTTATGTGCAAGGTGTATAAGGAAAGTAAAATATACCTTTGGTAAAAGGATTATAAGGAGGCATAAGAATGTGGATTTTTACCTACATTAAAAGGTTAAAAAAATTATTGTTTTGAAGGTTTAAGCAAGTTTTAAAACGTTAATTGTAAAGAAAATTCTGTGTGTAAACATATTAGCTAAAGTTAAAGAGGTATCATCCAGTTTTTCTGTGAACTGGACATTAAAGTAAAAACGCAACGGGTTTTTCTTAAAGCACCAACCTGCTCTTTAACAAAAATTATAAAAGGTTAAAAAGAGTCTATAAAAATCTTACCTTATGGTCAAACATTAAAAATTGGATAAATATGTCTACAAGGTTTTATTAAAATTAAGTTTAACATTAATAACACACTAATATAAAGGTGAAATTTAGCTTATCTGGTATAAAAATCATACAAGAAGCATTGTTAAATATAAAATGGTGTTTGGCTTTCTTTGGTCTAAAAACTAATAAAAATAGGTGCTAAAGGAAATTTCTCAGTAAAAAGGCACCAAGGACTATAAAGTCCACTGCCGATGTCCCCACATTTAAAACAAAAGGTCAATTTCTTAGAAATTATATACTTGGTTTATCTTCCACTTTCCTTTCCCTCAAAACTAAAAGTCTTTTAGCACATGTACCACCCCTAGAATTTCCGGTAAACCAGCACCAGCCTGAAGATCACGTTCTCATCAAAGGGTGGAAAGAAGGAAAACTCGAGCCAGCCTGGGAAGGACCCTACCTTGTGCTGCTAACCACCGAGACTGCTGTTCGTACAGCGAAAAAGGGATGGACTCATCACACCCGAGTCAAGAAAGCGCCACCCCCTCCAGAGTCGTGGGCCATAGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAATTTAACTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATTGCTGACCATCTAGTTATTAACATAACCAAGTCAATTTCGCCTCAAACTATTGCATTTAATGCTTGCCTTGTTATACCCTGTGGGGACTTGCCAAGTCAAAGACAGCTCTCTACTTCAGAAAAGTACCTCTGTCCCTCCTGACTCTCCTCAGACTGGGCATTAGTAAATTAGGACCATTTAATCCGGGGAGATTTCGATAAAGACCCCAGTGTCAACCAGGAGTCTTGCCCCCCGATGTAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACTAAAGAGCAAGGATGGACTGCCCCAACCGGTTTTTGTAATTTCCTAAAACCATACATTCATTTTACTAGAGGATCATAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTAAGACAGGATACCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCATCCGCACGTTAAACAAAAGCAATTGTTATGCTTGTGCACATGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAAGGTGGTCCTCCAGTCGACCAGGCGTGGGCTGCATGGTAGCTCTTTTCCAGGATTCTACAGCCTGGAGTAATAAGTCGTGCCAAGCTCTCTCTGCTATATCCCGAAGTCCGGCACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCAACACTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTGAAGGGATGCAGTGAGCTTAAGAATTTTCAAGAGCTTATCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGGCACTCTGCCGAATAACTGGAGTGGCACTTGTGCTTTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGGGAGGAAAAATAAGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTC
TCACGTCTATTTAGACGCAATTGGAGTCCCACGGGGAATACCAGATCAATTTAAAGCTTGAAATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAGTTAATAAAAATGTAGATTGGATAAACTACATCTATTACAACCAACAGCAACGAGCTTTTCATGAGTTAAAAGAAAAACTCATGTCGGCCCCAGCCCTGGGGCTACCTGACCTGACAAAACCCTTTACACTCTATGTGTCAGAAAGAGAAAAAATGGCAGTTGGAGTTTTAACCCAGACTGTGGGGCCCTGGCCAAGGCCAGTGGCCTATCTCTCAAAACAACTAGACGGGGTTTCCAAAGGCTGGCCCCCATGTCTAAGGGCCCTGGCAGCAACGGCCCTGTTAGCACAAGAAGCAGATAAACTAACCCTTGGGCAAAACCTGAATATAAAGGCCCCCCATGCTGTGGTAACTTTAATGAATACCAAAGGACATCATTGGCTAACGAATGCTAGATTAACCAAGTACCAAAGCTTGCTATGTGAAAATCCCCGCATAACCATTGAAGTTTGCAACACCCTAAACCCCGCCACCTTGCTCCCGGTATCAGAGAGCCCAGTTGAACATAACTGTGTAGAGGTGTTGGACTCAGTTTATTCTAGCGGGCCCAACCTCCGAGACCATCCTTGAACATCAGTAGACTGTGAGCTGTACGTGGACGGGAGCAGCTTCGCCAACCCCTGCAAAGTGACTCTGAAGAAGACGACAAGCCCTGCTCCAGTCACACCCGGAAGCTGACTGGTCCACGCACGGCCGAAGCATGAGAAAACTCATCGCGGGACTCATTTTCCTTAAAATTTGGACTTGTACAGTAAGGACTTCAACTGACCTTCCTCAGACTGAGGACTGTTCCCAGTGTATACATCAAGTCACTGAGGTAGGACAAAAGGTTGCTACGGTCCTATTATTTTATGGTTATTATAAGTGTACCGGAACTCTAAAAAGAACTTGTTTGTATAATGTTATTCTATACAAGGTATGTAGCCCAGGAAATGACCAACCTGATGTGTGTTATGACCCATCTGAGCCTCCCATGACCACAGTTTTTAAAATAAGATTAAGGACTGAGGACTGGTGGGGGCTCATAAACGATACGAGTAAAGTGTTAGCCAAAACAGAAGAAAAAGGGGTGCCCAAACAAGTCACCTTGAAATTTGATGCCTGTGCTGTCATTAATAGTAATAAGTNTAGGAATAGGATGTGGTTCTCTTAATTNNAGGAAAGAGGCTATATGGCAGAAAATAAGTACATTTGTCATGAATTAGGACTGTGTGGAAATAAATGTGGATACTGGTCTTGTGTCATTTAGGCTACTTGGATAAAAAATGAAAAGGATCCTGTCCACCTTCAGAAAGGGAAAAGTGGCCCTTCCTGTACCAGTGGTCAGTGTAACCCCTTAGAACTAGTAATAACCAACCCCCTTGATCCTCGCTGGAAAAAAGGGGAGCGTGTAACCCTAGGAATCGATGGGGCCGGACTGGATCCTCGAGTAAATATCGTGGTTCGAGGAGAAGTTTATAAACGCTCTCCTGAGCCAGTATTTCAAACCTTCTATGATGAACTGAATGTGCCAGTACCAGAAATTCCAGGAAAAACAAGAAATTTGTTTTTGCAATTAGCCGAGCATGTAGCCCAGTCTCTCAATGTCACTTCATGTTATGTATGTGGAGGAACTGTAATGGGAGATCAATGGCCATGGGAAGCCCGAGAATTAGTACCTACAGACCCAGTTCCTGATGAATTCCCGGCTCAAAAGAATCACCCTGATAACTTCTGGGTCCTAAAAGCCTCAATCATTAGACAATACTGTATAGCAAGAGTGGGGAAGGACTTCACCCTTCCCGTGGGAAGACTCAGCTGCCTTGGGCAAAAACTGTATAATAGTACTACAAAAACAGCCACCTGGTGGAGTTCAAACCACACTAAGAAAAATCCATTTAGTAAATTCCCAAAG
TTGCAAACCGTGTGGACCCACCCGGAGTCCCACCGGGACTGGACAGCCCCCACTGGATTATACTGGATATGTGGGCATAGAGCTTACGCCAAATTACCCGACCAGTGGGCAGGTAGTTGTGTTATTGGCACTATTAAACCATCTTTCTTCCTACTGCCCATAAAGACAGGCGAACTCCTGGGCTTCCCTGTCTATGCTTCCCGCGAAAAGAGAAGCATAGCTATAGGAAATTGGAAAGATGATAAATGGCCCCCTGAGAGAATCATACAATATTATGGGCCTGCTACTTGGGCACAAGACGGCTCGTGGGGATACCGGACCCCCATTTACATGCTCAACCGAATCATACGGTTACAAGCTGTCTTAGAAATAATCACTAATAAGACCGGCAGAGCCTTGACTATTCTGGCCCGGCAAGAAACTCAGATGAGAAATGCTATCTATCAAAATAGATTGGCTCTCGACTACTTGCTAGCAGCTGAAGGAGGGGTCTGTAGGAAATTTAACCTTACTAATTGCTGCCTACACATAGATGATCAAGGGCAAGTAGTTGAAGACATAGTTAGAGATATGACAAAACTGGCACATGTGCCCGTGCAAGTGTGGCATGGATTTGATCCTGGGGCCATGTTTGGAAAATGGTTCCCAGCGCTAGGAGGATTTAAAACTCTTATAATAGGAGTTATAATAGTAATAGGAACCTGCTTACTGCTCCCTTGTTTGCTACCTGTACTTCTTCAAATGATAAAAAGCTTCATCGCTACCTTAGTTCACCAAAATGCTTCAGCACAAGTGTACTATATGAATCACTATCGATCTGTCTTGCAAGAAGACATGGGTAGTGAGAATGAAAGTGAGAACTCCCACTANTGAGTGAGATTCTCAAAGGGGGGGAA
|
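The Tau index listed in the table above is a tissue-specificity score. This page does not state how it was computed, so the following is only a minimal sketch assuming the commonly used definition, tau = sum(1 - x_i / x_max) / (n - 1), where x_i is the expression in tissue i. The function name and the example values are hypothetical. Under this definition, a value close to 1 (such as 0.9729) indicates expression concentrated in very few tissues.

```python
def tau_index(expression):
    """Tissue-specificity score in [0, 1] for non-negative expression values.

    Assumes the common definition: sum(1 - x_i / x_max) / (n - 1).
    0 means uniform expression; 1 means expression in a single tissue.
    """
    values = [float(x) for x in expression]
    n = len(values)
    if n < 2:
        raise ValueError("need expression values for at least two tissues")
    peak = max(values)
    if peak == 0:
        return 0.0  # not expressed anywhere: treat as non-specific
    return sum(1 - v / peak for v in values) / (n - 1)


# Hypothetical expression profiles (not taken from the database).
uniform = tau_index([5, 5, 5, 5])
single_tissue = tau_index([0, 0, 0, 10])
intermediate = tau_index([10, 5, 0])
```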
TF motifs of the consensus sequence
FIMO is used to detect transcription factor motifs in the consensus sequence of the TE family.
TE_family | TFBS | Start | End | Strand | Score | Matched sequence |
---|---|---|---|---|---|---|
Harlequin | BPC1 | 921 | 944 | + | 22.45 | AAGAGGCAAAGAGAGAGAGGAAGA |
Harlequin | BPC1 | 999 | 1022 | + | 22.27 | GAGGCAGAGAGAGAGGAAGAGACA |
Harlequin | BPC6 | 989 | 1009 | - | 22.24 | CTCTCTGCCTCTCTCTTTCTC |
Harlequin | BPC1 | 927 | 950 | + | 22.09 | CAAAGAGAGAGAGGAAGAGACAGA |
Harlequin | BPC1 | 957 | 980 | + | 21.85 | GAGACAGAGAGACAAAGAGGGAGT |
Harlequin | BPC1 | 991 | 1014 | + | 21.47 | GAAAGAGAGAGGCAGAGAGAGAGG |
Harlequin | BPC6 | 957 | 977 | - | 21.46 | CCCTCTTTGTCTCTCTGTCTC |
Harlequin | BPC1 | 979 | 1002 | + | 21.27 | GTCAAGGAGAGAGAAAGAGAGAGG |
Harlequin | BPC6 | 1057 | 1077 | - | 21.26 | CTCTCTTTCTTTCTCTCTTTG |
Harlequin | BPC1 | 923 | 946 | + | 21.14 | GAGGCAAAGAGAGAGAGGAAGAGA |
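The coordinates in the table above appear to be 1-based and inclusive (each matched sequence has length End - Start + 1), and matches on the `-` strand appear to be reported as the reverse complement of the consensus. The sketch below checks both assumptions using three rows copied from the table; the helper names are illustrative and are not part of FIMO.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    return seq.translate(COMPLEMENT)[::-1]


# (TF, start, end, strand, matched sequence), copied from the table above.
HITS = [
    ("BPC1", 979, 1002, "+", "GTCAAGGAGAGAGAAAGAGAGAGG"),
    ("BPC1", 991, 1014, "+", "GAAAGAGAGAGGCAGAGAGAGAGG"),
    ("BPC6", 989, 1009, "-", "CTCTCTGCCTCTCTCTTTCTC"),
]


def plus_strand_bases(hits):
    """Map 1-based consensus position -> plus-strand base.

    Raises ValueError if a hit has the wrong length or if two
    overlapping hits disagree about a base.
    """
    bases = {}
    for tf, start, end, strand, matched in hits:
        if len(matched) != end - start + 1:
            raise ValueError(f"{tf} {start}-{end}: length mismatch")
        seq = matched if strand == "+" else reverse_complement(matched)
        for offset, base in enumerate(seq):
            pos = start + offset
            if bases.setdefault(pos, base) != base:
                raise ValueError(f"{tf} {start}-{end}: conflict at {pos}")
    return bases


bases = plus_strand_bases(HITS)
# Consensus bases 979..1014 reconstructed from the overlapping hits.
region = "".join(bases[pos] for pos in range(979, 1015))
print(region)
```

If the three hits were inconsistent with each other, `plus_strand_bases` would raise instead of returning, so a clean run supports the coordinate and strand conventions assumed here.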
TFBS enrichment in GRCh38
Fisher's exact test is used to test whether transcription factor binding sites are enriched in the GRCh38 copies of the TE family.
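The page does not show how the contingency table for this test is built. As a hedged illustration, the sketch below assumes a 2x2 table (TE copies with and without a binding site versus background regions with and without one) and computes the one-sided enrichment p-value directly from the hypergeometric distribution. The function name and all counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test (enrichment) for the 2x2 table

                      with TFBS   without TFBS
        TE copies         a            b
        background        c            d

    Returns P(X >= a) under the hypergeometric null hypothesis.
    """
    total = a + b + c + d
    with_site = a + c          # column total: regions carrying a site
    te_copies = a + b          # row total: regions that are TE copies
    upper = min(with_site, te_copies)
    tail = sum(
        comb(with_site, k) * comb(total - with_site, te_copies - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(total, te_copies)


# Hypothetical counts, for illustration only.
p_value = fisher_exact_greater(8, 2, 1, 5)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value would suggest that the binding site occurs in TE copies more often than expected from the background. In practice, a correction for testing many transcription factors would usually be applied as well.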