Harlequin


DF ID DF0000017
TE superfamily ERV1
TE class LTR
Species Catarrhini
Length 6896
Kimura value 6.23
Tau index 0.9729
Description Internal region of ERV1 endogenous retrovirus, Harlequin subfamily
Comment Harlequin is an internal part of an endogenous retrovirus-like element flanked by LTR2s. Its consensus sequence was reconstructed from eight full-length copies, which are ~92% similar to the consensus. One copy has been inserted into an Alu-Y repeat. Harlequin was built up from several different retroviruses, including HERVE, HERVI, HERV17 and MER4I-group sequences. We suggest that shuffling of non-homologous endogenous retroviral sequences may generate mosaic retrovirus-like elements. Presumably, the shuffling occurs in virions, where RNA genomes of different retroviruses could be packaged together. Self-catalytic RNA recombination or jumping of reverse transcriptase between non-homologous RNA molecules may induce the shuffling.
Sequence
TTTCTTGGTTCCCTGACCGGGAAGCGAGGTGATTAACGGACGGTCGAGGCAGCCCCTTAGGCGGCTTAGGCCTGCCCTGTGGAGCATCCCTGCGGGGGACTCCGGCCAGCCTGAGCGACGCGGATCCTGAGAGCGCTCCCGGGTAGGCAATTGCCCCGGTGGAACGCCTCGCCAGAGCAGCGCGTGGCAGGCCCCCGCGGAGGATCAACGCAGTGGCTGAACACCGGGAAGGAACTGGCACTTGGAGTCCGGACATCTGAAACTTGGTAAGACTGGTCTTTGGAACTTGCCCCACTCCATTTGAGTGGAAGCGTGGCCTGATCACCCACGGCGTGCCTGTACCGGCACTTTGGTTTTTGTTTTTGACTTGACTTGGATTGCTTGATACTTTGGTTTTGGTTTTGACCTGGCTTGGATTTCTGGATACTCTGATTTTGGTTTTGATTTTGGTTTGGTGTAAACTGCAAAAGTGTGTGTGTGCCCTTTTTACCCGTTCTTTGTTTTGTGGTGTGCGTGTGGTGTGAGCGTGGTGTTTTGTCTCGAAGAAGCATGGGTCAGGCACAAAGTAAGCCCACCCCACTAGGAACTATGTTGAAAAATTTCAAGAAAGGATTTAAGGGAGACTACGGNGTTACTATGACACCAGGAAAACTTAGAACTTTGTGTGAAATAGACTGGCCAGCATTAGAGGTGGGTTGGCCATCAGAAGGAAGCCTGGACAGGTCCCTTGTTTCAAAGGTATGGCACAAGGTAACCTGTAAGCCAGGGCACCCAGACCAGTTCCCGTACATAGACACTTGGTTACAGCTGGTTTTAGACCCCCCGCCCCCACAGTGGTTGAGAGAACAGCAGCATAAGCGGCTGGCAGAGGCAAGGAAAGACCAGCAGAGAGANNGAGAGAGGAAAGAGACAGAGAGGAAAAGAGGCAAAGAGAGAGAGGAAGAGACAGAGAGGAAGAGACAGAGAGACAAAGAGGGAGTCAAGGAGAGAGAAAGAGAGAGGCAGAGAGAGAGGAAGAGACAGAGGCAAAAGGAAAGTCAAAGAGAGANAGAAAGTCAAAGAGAGAAAGAAAGAGAGAGATATACAAGTAGTTAAGAAAAAAAACAGTGTACCCTATTCCTTTAAAAGCCAAGGTAAATTTAAAACCTATAATTGATAATTGAAGGTATTCTCCGTAACCCTATAACACTCCAATACCACTTTGTTGTCAGTGTAAACAAGGGCGTATCCCGAAAGCACTGAGGCCTTCCTATCAAAAATCCTTAACCCAGTAACCCGCGGATGGCCCAAATGCATTCAATCTGTAGCGGCAACTGCTTTGCTAACAGAAAAAAGTAAAAAAAATAACTTTTAGAGGAAACCTCATTGTGAGCACACCTCACCAGTTCAGAAGTATCCTAAGGAAAAAAAAAAAAAAAAGGATGATTTAACATTAACCACTGAAAATTCCCTTAACCCAGCAGGTTTCCTAACAGGGGATCTAAATCTTAATTACCATACAAAGGTCCGACCAGACCTAGGAGGAACTCCCTTCAGGACAGGACGATAGATGGTTCCTCCCGGGTAATTGAAGAAAGAAAAAAAGCCATCTATACCAATTCTAAGTTAATTTGGACTAAACAAGGTCTTATTAATAGCAAAGGATAATTGAAATCCCAAACTTACAAGGTTTTCAACAAAAGTAAAGTTTGCTAAAAGTTAACAGTGTAACATGTATTATAGTAACTTCTAATCTTGTGGCCTTAGACAGTCTAGTCCACAGACATAAAGGAAGTTCGCTTTGGAAAAGAATGGTTATCATCTTCGAAAAAAAAAGAGAGGAAAGGGGGGGCAGAATTTATGTAAAAAGAGTGTTATATGGTAAATTCTTGTCCTGAAATAAATTAACTGGTTGTTTAAAGAAAGAAATGTTTGTAATAAGTCAGAAAGTTGAGGCATGTCGAAGAATTGTCTGCGAAAGTCGTGAAAGAAAAAAATGTTATAAAAAAAGAATTTATGCA
AGAAATGTTGTATAATTTAAAAGTAATTAGGCCTCCTGAATGTAAAACTATTGAAAAAACAGTTTATGTGCAAGGTGTATAAGGAAAGTAAAATATACCTTTGGTAAAAGGATTATAAGGAGGCATAAGAATGTGGATTTTTACCTACATTAAAAGGTTAAAAAAATTATTGTTTTGAAGGTTTAAGCAAGTTTTAAAACGTTAATTGTAAAGAAAATTCTGTGTGTAAACATATTAGCTAAAGTTAAAGAGGTATCATCCAGTTTTTCTGTGAACTGGACATTAAAGTAAAAACGCAACGGGTTTTTCTTAAAGCACCAACCTGCTCTTTAACAAAAATTATAAAAGGTTAAAAAGAGTCTATAAAAATCTTACCTTATGGTCAAACATTAAAAATTGGATAAATATGTCTACAAGGTTTTATTAAAATTAAGTTTAACATTAATAACACACTAATATAAAGGTGAAATTTAGCTTATCTGGTATAAAAATCATACAAGAAGCATTGTTAAATATAAAATGGTGTTTGGCTTTCTTTGGTCTAAAAACTAATAAAAATAGGTGCTAAAGGAAATTTCTCAGTAAAAAGGCACCAAGGACTATAAAGTCCACTGCCGATGTCCCCACATTTAAAACAAAAGGTCAATTTCTTAGAAATTATATACTTGGTTTATCTTCCACTTTCCTTTCCCTCAAAACTAAAAGTCTTTTAGCACATGTACCACCCCTAGAATTTCCGGTAAACCAGCACCAGCCTGAAGATCACGTTCTCATCAAAGGGTGGAAAGAAGGAAAACTCGAGCCAGCCTGGGAAGGACCCTACCTTGTGCTGCTAACCACCGAGACTGCTGTTCGTACAGCGAAAAAGGGATGGACTCATCACACCCGAGTCAAGAAAGCGCCACCCCCTCCAGAGTCGTGGGCCATAGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAATTTAACTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATTGCTGACCATCTAGTTATTAACATAACCAAGTCAATTTCGCCTCAAACTATTGCATTTAATGCTTGCCTTGTTATACCCTGTGGGGACTTGCCAAGTCAAAGACAGCTCTCTACTTCAGAAAAGTACCTCTGTCCCTCCTGACTCTCCTCAGACTGGGCATTAGTAAATTAGGACCATTTAATCCGGGGAGATTTCGATAAAGACCCCAGTGTCAACCAGGAGTCTTGCCCCCCGATGTAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACTAAAGAGCAAGGATGGACTGCCCCAACCGGTTTTTGTAATTTCCTAAAACCATACATTCATTTTACTAGAGGATCATAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTAAGACAGGATACCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCATCCGCACGTTAAACAAAAGCAATTGTTATGCTTGTGCACATGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAAGGTGGTCCTCCAGTCGACCAGGCGTGGGCTGCATGGTAGCTCTTTTCCAGGATTCTACAGCCTGGAGTAATAAGTCGTGCCAAGCTCTCTCTGCTATATCCCGAAGTCCGGCACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCAACACTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTGAAGGGATGCAGTGAGCTTAAGAATTTTCAAGAGCTTATCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGGCACTCTGCCGAATAACTGGAGTGGCACTTGTGCTTTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGGGAGGAAAAATAAGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTC
TCACGTCTATTTAGACGCAATTGGAGTCCCACGGGGAATACCAGATCAATTTAAAGCTTGAAATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAGTTAATAAAAATGTAGATTGGATAAACTACATCTATTACAACCAACAGCAACGAGCTTTTCATGAGTTAAAAGAAAAACTCATGTCGGCCCCAGCCCTGGGGCTACCTGACCTGACAAAACCCTTTACACTCTATGTGTCAGAAAGAGAAAAAATGGCAGTTGGAGTTTTAACCCAGACTGTGGGGCCCTGGCCAAGGCCAGTGGCCTATCTCTCAAAACAACTAGACGGGGTTTCCAAAGGCTGGCCCCCATGTCTAAGGGCCCTGGCAGCAACGGCCCTGTTAGCACAAGAAGCAGATAAACTAACCCTTGGGCAAAACCTGAATATAAAGGCCCCCCATGCTGTGGTAACTTTAATGAATACCAAAGGACATCATTGGCTAACGAATGCTAGATTAACCAAGTACCAAAGCTTGCTATGTGAAAATCCCCGCATAACCATTGAAGTTTGCAACACCCTAAACCCCGCCACCTTGCTCCCGGTATCAGAGAGCCCAGTTGAACATAACTGTGTAGAGGTGTTGGACTCAGTTTATTCTAGCGGGCCCAACCTCCGAGACCATCCTTGAACATCAGTAGACTGTGAGCTGTACGTGGACGGGAGCAGCTTCGCCAACCCCTGCAAAGTGACTCTGAAGAAGACGACAAGCCCTGCTCCAGTCACACCCGGAAGCTGACTGGTCCACGCACGGCCGAAGCATGAGAAAACTCATCGCGGGACTCATTTTCCTTAAAATTTGGACTTGTACAGTAAGGACTTCAACTGACCTTCCTCAGACTGAGGACTGTTCCCAGTGTATACATCAAGTCACTGAGGTAGGACAAAAGGTTGCTACGGTCCTATTATTTTATGGTTATTATAAGTGTACCGGAACTCTAAAAAGAACTTGTTTGTATAATGTTATTCTATACAAGGTATGTAGCCCAGGAAATGACCAACCTGATGTGTGTTATGACCCATCTGAGCCTCCCATGACCACAGTTTTTAAAATAAGATTAAGGACTGAGGACTGGTGGGGGCTCATAAACGATACGAGTAAAGTGTTAGCCAAAACAGAAGAAAAAGGGGTGCCCAAACAAGTCACCTTGAAATTTGATGCCTGTGCTGTCATTAATAGTAATAAGTNTAGGAATAGGATGTGGTTCTCTTAATTNNAGGAAAGAGGCTATATGGCAGAAAATAAGTACATTTGTCATGAATTAGGACTGTGTGGAAATAAATGTGGATACTGGTCTTGTGTCATTTAGGCTACTTGGATAAAAAATGAAAAGGATCCTGTCCACCTTCAGAAAGGGAAAAGTGGCCCTTCCTGTACCAGTGGTCAGTGTAACCCCTTAGAACTAGTAATAACCAACCCCCTTGATCCTCGCTGGAAAAAAGGGGAGCGTGTAACCCTAGGAATCGATGGGGCCGGACTGGATCCTCGAGTAAATATCGTGGTTCGAGGAGAAGTTTATAAACGCTCTCCTGAGCCAGTATTTCAAACCTTCTATGATGAACTGAATGTGCCAGTACCAGAAATTCCAGGAAAAACAAGAAATTTGTTTTTGCAATTAGCCGAGCATGTAGCCCAGTCTCTCAATGTCACTTCATGTTATGTATGTGGAGGAACTGTAATGGGAGATCAATGGCCATGGGAAGCCCGAGAATTAGTACCTACAGACCCAGTTCCTGATGAATTCCCGGCTCAAAAGAATCACCCTGATAACTTCTGGGTCCTAAAAGCCTCAATCATTAGACAATACTGTATAGCAAGAGTGGGGAAGGACTTCACCCTTCCCGTGGGAAGACTCAGCTGCCTTGGGCAAAAACTGTATAATAGTACTACAAAAACAGCCACCTGGTGGAGTTCAAACCACACTAAGAAAAATCCATTTAGTAAATTCCCAAAG
TTGCAAACCGTGTGGACCCACCCGGAGTCCCACCGGGACTGGACAGCCCCCACTGGATTATACTGGATATGTGGGCATAGAGCTTACGCCAAATTACCCGACCAGTGGGCAGGTAGTTGTGTTATTGGCACTATTAAACCATCTTTCTTCCTACTGCCCATAAAGACAGGCGAACTCCTGGGCTTCCCTGTCTATGCTTCCCGCGAAAAGAGAAGCATAGCTATAGGAAATTGGAAAGATGATAAATGGCCCCCTGAGAGAATCATACAATATTATGGGCCTGCTACTTGGGCACAAGACGGCTCGTGGGGATACCGGACCCCCATTTACATGCTCAACCGAATCATACGGTTACAAGCTGTCTTAGAAATAATCACTAATAAGACCGGCAGAGCCTTGACTATTCTGGCCCGGCAAGAAACTCAGATGAGAAATGCTATCTATCAAAATAGATTGGCTCTCGACTACTTGCTAGCAGCTGAAGGAGGGGTCTGTAGGAAATTTAACCTTACTAATTGCTGCCTACACATAGATGATCAAGGGCAAGTAGTTGAAGACATAGTTAGAGATATGACAAAACTGGCACATGTGCCCGTGCAAGTGTGGCATGGATTTGATCCTGGGGCCATGTTTGGAAAATGGTTCCCAGCGCTAGGAGGATTTAAAACTCTTATAATAGGAGTTATAATAGTAATAGGAACCTGCTTACTGCTCCCTTGTTTGCTACCTGTACTTCTTCAAATGATAAAAAGCTTCATCGCTACCTTAGTTCACCAAAATGCTTCAGCACAAGTGTACTATATGAATCACTATCGATCTGTCTTGCAAGAAGACATGGGTAGTGAGAATGAAAGTGAGAACTCCCACTANTGAGTGAGATTCTCAAAGGGGGGGAA
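The consensus above is wrapped over several lines and contains ambiguous N positions, so it can be useful to join the pieces and check the result against the listed Length field. The following is a minimal sketch in Python; the function name `summarize_consensus` and the toy input are illustrative assumptions, not part of the database.

```python
from collections import Counter


def summarize_consensus(lines):
    """Join wrapped sequence lines and report basic composition.

    `lines` is any iterable of strings holding pieces of one sequence.
    """
    seq = "".join(part.strip() for part in lines).upper()
    counts = Counter(seq)
    # GC fraction is computed over unambiguous bases only, so N is ignored.
    unambiguous = sum(counts[base] for base in "ACGT")
    gc = counts["G"] + counts["C"]
    return {
        "length": len(seq),
        "n_count": counts["N"],
        "gc_fraction": gc / unambiguous if unambiguous else 0.0,
    }


# Toy input standing in for the wrapped consensus lines (not the real sequence).
example = ["ACGTN", "GGCC"]
print(summarize_consensus(example))
```

Passing the four wrapped sequence lines of this entry instead of the toy input would give a length that can be compared with the Length field.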



TF motifs of the consensus sequence

Transcription factor motifs detected by FIMO in the consensus sequence of the TE family.

TE_family TFBS Start End Strand Score Matched sequence
Harlequin BPC6 991 1011 - 20.87 CTCTCTCTGCCTCTCTCTTTC
Harlequin BPC5 996 1025 + 20.80 AGAGAGGCAGAGAGAGAGGAAGAGACAGAG
Harlequin BPC1 945 968 + 20.73 GACAGAGAGGAAGAGACAGAGAGA
Harlequin BPC1 925 948 + 20.64 GGCAAAGAGAGAGAGGAAGAGACA
Harlequin BPC1 1053 1076 + 20.53 AAGTCAAAGAGAGAAAGAAAGAGA
Harlequin DOF5.1 1330 1348 + 20.41 AAAAAGTAAAAAAAATAAC
Harlequin BPC6 999 1019 - 20.18 CTCTTCCTCTCTCTCTGCCTC
Harlequin DOF3.6 1404 1424 - 20.15 CATCCTTTTTTTTTTTTTTTT
Harlequin DOF5.8 1330 1348 - 19.86 GTTATTTTTTTTACTTTTT
Harlequin BPC5 940 969 + 19.82 GAAGAGACAGAGAGGAAGAGACAGAGAGAC
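In the rows above, minus-strand matches appear to be reported as the reverse complement of the consensus: the DOF5.1 (+) and DOF5.8 (-) hits share the coordinates 1330-1348, and their matched sequences are reverse complements of each other. Coordinates also look 1-based and inclusive, since End - Start + 1 equals the matched length. The sketch below, in Python, parses a few rows copied from the table and converts every hit to plus-strand orientation; the field names and helper functions are illustrative assumptions, not FIMO's own output schema.

```python
# A few rows copied verbatim from the table above.
ROWS = """\
Harlequin BPC6 991 1011 - 20.87 CTCTCTCTGCCTCTCTCTTTC
Harlequin BPC5 996 1025 + 20.80 AGAGAGGCAGAGAGAGAGGAAGAGACAGAG
Harlequin DOF5.1 1330 1348 + 20.41 AAAAAGTAAAAAAAATAAC
Harlequin DOF5.8 1330 1348 - 19.86 GTTATTTTTTTTACTTTTT
"""

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_hits(text):
    """Parse whitespace-separated hit rows into dictionaries."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        family, tf, start, end, strand, score, matched = line.split()
        start, end = int(start), int(end)
        # Express every match in plus-strand (consensus) orientation.
        plus = matched if strand == "+" else reverse_complement(matched)
        hits.append({
            "family": family,
            "tf": tf,
            "start": start,
            "end": end,
            "strand": strand,
            "score": float(score),
            "matched": matched,
            "plus_strand_seq": plus,
            # True when coordinates are 1-based inclusive for this row.
            "width_ok": end - start + 1 == len(matched),
        })
    return hits


hits = parse_hits(ROWS)
for hit in hits:
    print(hit["tf"], hit["start"], hit["end"], hit["plus_strand_seq"], hit["width_ok"])
```

With every hit in the same orientation, overlapping hits such as BPC6 (991-1011) and BPC5 (996-1025) can be compared base by base over their shared positions.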


TFBS enrichment in GRCh38

Fisher's exact test is used to test for enrichment of transcription factor binding sites in copies of this TE family in GRCh38.
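The page does not state how the contingency table is built or whether the test is one- or two-sided. As a rough illustration only, the sketch below computes a one-sided (enrichment) Fisher's exact p-value in pure Python for a 2x2 table, assuming the rows are "inside the TE family" versus "elsewhere" and the columns are "overlaps a binding site" versus "does not". The function name and the counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` hits in the first row given fixed margins.
    """
    row1 = a + b          # e.g. regions inside the TE family
    row2 = c + d          # e.g. background regions
    col1 = a + c          # e.g. regions overlapping a binding site
    total = row1 + row2
    denom = comb(total, col1)
    upper = min(row1, col1)
    # math.comb returns 0 when k > n, so impossible tables contribute nothing.
    tail = sum(comb(row1, k) * comb(row2, col1 - k) for k in range(a, upper + 1))
    return tail / denom


# Hypothetical counts, for illustration only.
p_value = fisher_exact_greater(3, 1, 1, 3)
print(f"one-sided p = {p_value:.6f}")
```

For genome-scale counts an established statistics library would normally be preferred; the pure-Python version is kept here only to make the tail sum explicit.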




GTEx

Promoter activity across 46 body sites from the Genotype-Tissue Expression (GTEx) project.
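The Tau index listed in the basic information (0.9729) is commonly used as a tissue-specificity score, where values near 1 indicate activity concentrated in a single tissue and values near 0 indicate uniform activity. The sketch below, in Python, uses the widely cited definition tau = sum(1 - x_i / x_max) / (n - 1). Whether this page computes its Tau index from these GTEx promoter-activity values, and with which normalization, is an assumption; the input values are made up.

```python
def tau_index(values):
    """Tissue-specificity index over non-negative per-tissue activities.

    tau = sum(1 - x_i / x_max) / (n - 1); returns 0.0 when all values are 0.
    """
    n = len(values)
    if n < 2:
        raise ValueError("tau needs at least two tissues")
    if any(v < 0 for v in values):
        raise ValueError("activities must be non-negative")
    x_max = max(values)
    if x_max == 0:
        return 0.0
    return sum(1 - v / x_max for v in values) / (n - 1)


# Made-up activity profiles, for illustration only.
print(tau_index([0.0, 0.0, 0.0, 10.0]))  # activity confined to one tissue
print(tau_index([5.0, 5.0, 5.0, 5.0]))   # uniform activity
```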




TCGA

Promoter activity across 33 cancer types from The Cancer Genome Atlas (TCGA).