Harlequin

Basic information

DF ID DF0000017
TE superfamily ERV1
TE class LTR
Species Catarrhini
Length 6896
Kimura value 6.23
Tau index 0.9729
Description Internal region of ERV1 endogenous retrovirus, Harlequin subfamily
Comment Harlequin is the internal part of an endogenous retrovirus-like element flanked by LTR2s. Its consensus sequence was reconstructed from eight full-length copies, which are ~92% similar to the consensus sequence. One copy is inserted into an Alu-Y repeat. Harlequin was built up from several different retroviruses, including HERVE, HERVI, HERV17 and MER4I-group sequences. We suggest that shuffling of non-homologous endogenous retroviral sequences may generate mosaic retrovirus-like elements. Presumably, the shuffling occurs in virions, where RNA genomes of different retroviruses could be packaged together. Self-catalytic RNA recombination or jumping of reverse transcriptase between non-homologous RNA molecules may induce the shuffling.
Sequence
TTTCTTGGTTCCCTGACCGGGAAGCGAGGTGATTAACGGACGGTCGAGGCAGCCCCTTAGGCGGCTTAGGCCTGCCCTGTGGAGCATCCCTGCGGGGGACTCCGGCCAGCCTGAGCGACGCGGATCCTGAGAGCGCTCCCGGGTAGGCAATTGCCCCGGTGGAACGCCTCGCCAGAGCAGCGCGTGGCAGGCCCCCGCGGAGGATCAACGCAGTGGCTGAACACCGGGAAGGAACTGGCACTTGGAGTCCGGACATCTGAAACTTGGTAAGACTGGTCTTTGGAACTTGCCCCACTCCATTTGAGTGGAAGCGTGGCCTGATCACCCACGGCGTGCCTGTACCGGCACTTTGGTTTTTGTTTTTGACTTGACTTGGATTGCTTGATACTTTGGTTTTGGTTTTGACCTGGCTTGGATTTCTGGATACTCTGATTTTGGTTTTGATTTTGGTTTGGTGTAAACTGCAAAAGTGTGTGTGTGCCCTTTTTACCCGTTCTTTGTTTTGTGGTGTGCGTGTGGTGTGAGCGTGGTGTTTTGTCTCGAAGAAGCATGGGTCAGGCACAAAGTAAGCCCACCCCACTAGGAACTATGTTGAAAAATTTCAAGAAAGGATTTAAGGGAGACTACGGNGTTACTATGACACCAGGAAAACTTAGAACTTTGTGTGAAATAGACTGGCCAGCATTAGAGGTGGGTTGGCCATCAGAAGGAAGCCTGGACAGGTCCCTTGTTTCAAAGGTATGGCACAAGGTAACCTGTAAGCCAGGGCACCCAGACCAGTTCCCGTACATAGACACTTGGTTACAGCTGGTTTTAGACCCCCCGCCCCCACAGTGGTTGAGAGAACAGCAGCATAAGCGGCTGGCAGAGGCAAGGAAAGACCAGCAGAGAGANNGAGAGAGGAAAGAGACAGAGAGGAAAAGAGGCAAAGAGAGAGAGGAAGAGACAGAGAGGAAGAGACAGAGAGACAAAGAGGGAGTCAAGGAGAGAGAAAGAGAGAGGCAGAGAGAGAGGAAGAGACAGAGGCAAAAGGAAAGTCAAAGAGAGANAGAAAGTCAAAGAGAGAAAGAAAGAGAGAGATATACAAGTAGTTAAGAAAAAAAACAGTGTACCCTATTCCTTTAAAAGCCAAGGTAAATTTAAAACCTATAATTGATAATTGAAGGTATTCTCCGTAACCCTATAACACTCCAATACCACTTTGTTGTCAGTGTAAACAAGGGCGTATCCCGAAAGCACTGAGGCCTTCCTATCAAAAATCCTTAACCCAGTAACCCGCGGATGGCCCAAATGCATTCAATCTGTAGCGGCAACTGCTTTGCTAACAGAAAAAAGTAAAAAAAATAACTTTTAGAGGAAACCTCATTGTGAGCACACCTCACCAGTTCAGAAGTATCCTAAGGAAAAAAAAAAAAAAAAGGATGATTTAACATTAACCACTGAAAATTCCCTTAACCCAGCAGGTTTCCTAACAGGGGATCTAAATCTTAATTACCATACAAAGGTCCGACCAGACCTAGGAGGAACTCCCTTCAGGACAGGACGATAGATGGTTCCTCCCGGGTAATTGAAGAAAGAAAAAAAGCCATCTATACCAATTCTAAGTTAATTTGGACTAAACAAGGTCTTATTAATAGCAAAGGATAATTGAAATCCCAAACTTACAAGGTTTTCAACAAAAGTAAAGTTTGCTAAAAGTTAACAGTGTAACATGTATTATAGTAACTTCTAATCTTGTGGCCTTAGACAGTCTAGTCCACAGACATAAAGGAAGTTCGCTTTGGAAAAGAATGGTTATCATCTTCGAAAAAAAAAGAGAGGAAAGGGGGGGCAGAATTTATGTAAAAAGAGTGTTATATGGTAAATTCTTGTCCTGAAATAAATTAACTGGTTGTTTAAAGAAAGAAATGTTTGTAATAAGTCAGAAAGTTGAGGCATGTCGAAGAATTGTCTGCGAAAGTCGTGAAAGAAAAAAATGTTATAAAAAAAGAATTTATGCA
AGAAATGTTGTATAATTTAAAAGTAATTAGGCCTCCTGAATGTAAAACTATTGAAAAAACAGTTTATGTGCAAGGTGTATAAGGAAAGTAAAATATACCTTTGGTAAAAGGATTATAAGGAGGCATAAGAATGTGGATTTTTACCTACATTAAAAGGTTAAAAAAATTATTGTTTTGAAGGTTTAAGCAAGTTTTAAAACGTTAATTGTAAAGAAAATTCTGTGTGTAAACATATTAGCTAAAGTTAAAGAGGTATCATCCAGTTTTTCTGTGAACTGGACATTAAAGTAAAAACGCAACGGGTTTTTCTTAAAGCACCAACCTGCTCTTTAACAAAAATTATAAAAGGTTAAAAAGAGTCTATAAAAATCTTACCTTATGGTCAAACATTAAAAATTGGATAAATATGTCTACAAGGTTTTATTAAAATTAAGTTTAACATTAATAACACACTAATATAAAGGTGAAATTTAGCTTATCTGGTATAAAAATCATACAAGAAGCATTGTTAAATATAAAATGGTGTTTGGCTTTCTTTGGTCTAAAAACTAATAAAAATAGGTGCTAAAGGAAATTTCTCAGTAAAAAGGCACCAAGGACTATAAAGTCCACTGCCGATGTCCCCACATTTAAAACAAAAGGTCAATTTCTTAGAAATTATATACTTGGTTTATCTTCCACTTTCCTTTCCCTCAAAACTAAAAGTCTTTTAGCACATGTACCACCCCTAGAATTTCCGGTAAACCAGCACCAGCCTGAAGATCACGTTCTCATCAAAGGGTGGAAAGAAGGAAAACTCGAGCCAGCCTGGGAAGGACCCTACCTTGTGCTGCTAACCACCGAGACTGCTGTTCGTACAGCGAAAAAGGGATGGACTCATCACACCCGAGTCAAGAAAGCGCCACCCCCTCCAGAGTCGTGGGCCATAGTCCCAGGGGAAAACCCTACCAAACTAAAGCTAAGAAAAATTTAACTCTTTCATCTATTCTATTACTCTTTCTTCTTTCCTCGCTCTATTGCTGACCATCTAGTTATTAACATAACCAAGTCAATTTCGCCTCAAACTATTGCATTTAATGCTTGCCTTGTTATACCCTGTGGGGACTTGCCAAGTCAAAGACAGCTCTCTACTTCAGAAAAGTACCTCTGTCCCTCCTGACTCTCCTCAGACTGGGCATTAGTAAATTAGGACCATTTAATCCGGGGAGATTTCGATAAAGACCCCAGTGTCAACCAGGAGTCTTGCCCCCCGATGTAGAGCTTTTATGCCGTAGTTGGTCCAACGTTCTGTGGACCACTAAAGAGCAAGGATGGACTGCCCCAACCGGTTTTTGTAATTTCCTAAAACCATACATTCATTTTACTAGAGGATCATAGAAGTTAAAGACTTAAAACAAACTTTGGCAATTAAGACAGGATACCAAGATGCAAATGCCTGGTTGGAATGGATCAAATATTCCATCCGCACGTTAAACAAAAGCAATTGTTATGCTTGTGCACATGGCAGGCCAGAGGCCCAGATTGTCCCCTTTCCACTAAGGTGGTCCTCCAGTCGACCAGGCGTGGGCTGCATGGTAGCTCTTTTCCAGGATTCTACAGCCTGGAGTAATAAGTCGTGCCAAGCTCTCTCTGCTATATCCCGAAGTCCGGCACCCTGCGGGTCAGCCCCCGAGGGCCATCCAGCTTCCGTCTCCCAACACTAAGTTCACTTCGTGTCTCTCACGACAGGGAGGAAACTTAGCGTTCCTTGGAGACCTGAAGGGATGCAGTGAGCTTAAGAATTTTCAAGAGCTTATCAATCAGTCAGCCCTTGTTCATCCCCGAGCGGATGTGTGGTGGTATTGTGGTGGACCTTTACTGGGCACTCTGCCGAATAACTGGAGTGGCACTTGTGCTTTAGTCCAATTGGCTATCCCTTTCACCCTGGCATTTCATCAACCAGAGGGAGGAAAAATAAGACATCGTAAAGCGAGAGAAGCCCCTTATGGGTCTTTCGACTC
TCACGTCTATTTAGACGCAATTGGAGTCCCACGGGGAATACCAGATCAATTTAAAGCTTGAAATCAAATAGCTGCAGGATTTGAGTCAATATTTTGGTGGGTGACAGTTAATAAAAATGTAGATTGGATAAACTACATCTATTACAACCAACAGCAACGAGCTTTTCATGAGTTAAAAGAAAAACTCATGTCGGCCCCAGCCCTGGGGCTACCTGACCTGACAAAACCCTTTACACTCTATGTGTCAGAAAGAGAAAAAATGGCAGTTGGAGTTTTAACCCAGACTGTGGGGCCCTGGCCAAGGCCAGTGGCCTATCTCTCAAAACAACTAGACGGGGTTTCCAAAGGCTGGCCCCCATGTCTAAGGGCCCTGGCAGCAACGGCCCTGTTAGCACAAGAAGCAGATAAACTAACCCTTGGGCAAAACCTGAATATAAAGGCCCCCCATGCTGTGGTAACTTTAATGAATACCAAAGGACATCATTGGCTAACGAATGCTAGATTAACCAAGTACCAAAGCTTGCTATGTGAAAATCCCCGCATAACCATTGAAGTTTGCAACACCCTAAACCCCGCCACCTTGCTCCCGGTATCAGAGAGCCCAGTTGAACATAACTGTGTAGAGGTGTTGGACTCAGTTTATTCTAGCGGGCCCAACCTCCGAGACCATCCTTGAACATCAGTAGACTGTGAGCTGTACGTGGACGGGAGCAGCTTCGCCAACCCCTGCAAAGTGACTCTGAAGAAGACGACAAGCCCTGCTCCAGTCACACCCGGAAGCTGACTGGTCCACGCACGGCCGAAGCATGAGAAAACTCATCGCGGGACTCATTTTCCTTAAAATTTGGACTTGTACAGTAAGGACTTCAACTGACCTTCCTCAGACTGAGGACTGTTCCCAGTGTATACATCAAGTCACTGAGGTAGGACAAAAGGTTGCTACGGTCCTATTATTTTATGGTTATTATAAGTGTACCGGAACTCTAAAAAGAACTTGTTTGTATAATGTTATTCTATACAAGGTATGTAGCCCAGGAAATGACCAACCTGATGTGTGTTATGACCCATCTGAGCCTCCCATGACCACAGTTTTTAAAATAAGATTAAGGACTGAGGACTGGTGGGGGCTCATAAACGATACGAGTAAAGTGTTAGCCAAAACAGAAGAAAAAGGGGTGCCCAAACAAGTCACCTTGAAATTTGATGCCTGTGCTGTCATTAATAGTAATAAGTNTAGGAATAGGATGTGGTTCTCTTAATTNNAGGAAAGAGGCTATATGGCAGAAAATAAGTACATTTGTCATGAATTAGGACTGTGTGGAAATAAATGTGGATACTGGTCTTGTGTCATTTAGGCTACTTGGATAAAAAATGAAAAGGATCCTGTCCACCTTCAGAAAGGGAAAAGTGGCCCTTCCTGTACCAGTGGTCAGTGTAACCCCTTAGAACTAGTAATAACCAACCCCCTTGATCCTCGCTGGAAAAAAGGGGAGCGTGTAACCCTAGGAATCGATGGGGCCGGACTGGATCCTCGAGTAAATATCGTGGTTCGAGGAGAAGTTTATAAACGCTCTCCTGAGCCAGTATTTCAAACCTTCTATGATGAACTGAATGTGCCAGTACCAGAAATTCCAGGAAAAACAAGAAATTTGTTTTTGCAATTAGCCGAGCATGTAGCCCAGTCTCTCAATGTCACTTCATGTTATGTATGTGGAGGAACTGTAATGGGAGATCAATGGCCATGGGAAGCCCGAGAATTAGTACCTACAGACCCAGTTCCTGATGAATTCCCGGCTCAAAAGAATCACCCTGATAACTTCTGGGTCCTAAAAGCCTCAATCATTAGACAATACTGTATAGCAAGAGTGGGGAAGGACTTCACCCTTCCCGTGGGAAGACTCAGCTGCCTTGGGCAAAAACTGTATAATAGTACTACAAAAACAGCCACCTGGTGGAGTTCAAACCACACTAAGAAAAATCCATTTAGTAAATTCCCAAAG
TTGCAAACCGTGTGGACCCACCCGGAGTCCCACCGGGACTGGACAGCCCCCACTGGATTATACTGGATATGTGGGCATAGAGCTTACGCCAAATTACCCGACCAGTGGGCAGGTAGTTGTGTTATTGGCACTATTAAACCATCTTTCTTCCTACTGCCCATAAAGACAGGCGAACTCCTGGGCTTCCCTGTCTATGCTTCCCGCGAAAAGAGAAGCATAGCTATAGGAAATTGGAAAGATGATAAATGGCCCCCTGAGAGAATCATACAATATTATGGGCCTGCTACTTGGGCACAAGACGGCTCGTGGGGATACCGGACCCCCATTTACATGCTCAACCGAATCATACGGTTACAAGCTGTCTTAGAAATAATCACTAATAAGACCGGCAGAGCCTTGACTATTCTGGCCCGGCAAGAAACTCAGATGAGAAATGCTATCTATCAAAATAGATTGGCTCTCGACTACTTGCTAGCAGCTGAAGGAGGGGTCTGTAGGAAATTTAACCTTACTAATTGCTGCCTACACATAGATGATCAAGGGCAAGTAGTTGAAGACATAGTTAGAGATATGACAAAACTGGCACATGTGCCCGTGCAAGTGTGGCATGGATTTGATCCTGGGGCCATGTTTGGAAAATGGTTCCCAGCGCTAGGAGGATTTAAAACTCTTATAATAGGAGTTATAATAGTAATAGGAACCTGCTTACTGCTCCCTTGTTTGCTACCTGTACTTCTTCAAATGATAAAAAGCTTCATCGCTACCTTAGTTCACCAAAATGCTTCAGCACAAGTGTACTATATGAATCACTATCGATCTGTCTTGCAAGAAGACATGGGTAGTGAGAATGAAAGTGAGAACTCCCACTANTGAGTGAGATTCTCAAAGGGGGGGAA



TF motifs of the consensus sequence

Use FIMO to detect transcription factor motifs in the consensus sequence of the TE family.

TE_family  TFBS  Start  End   Strand  Score  Matched sequence
Harlequin  BPC1  935    958   +       24.39  GAGAGGAAGAGACAGAGAGGAAGA
Harlequin  BPC1  1055   1078  +       24.17  GTCAAAGAGAGAAAGAAAGAGAGA
Harlequin  BPC1  1001   1024  +       24.06  GGCAGAGAGAGAGGAAGAGACAGA
Harlequin  BPC5  994    1023  +       23.99  AGAGAGAGGCAGAGAGAGAGGAAGAGACAG
Harlequin  BPC1  989    1012  +       23.42  GAGAAAGAGAGAGGCAGAGAGAGA
Harlequin  BPC6  987    1007  -       23.28  CTCTGCCTCTCTCTTTCTCTC
Harlequin  BPC1  929    952   +       22.76  AAGAGAGAGAGGAAGAGACAGAGA
Harlequin  BPC1  949    972   +       22.74  GAGAGGAAGAGACAGAGAGACAAA
Harlequin  BPC1  931    954   +       22.67  GAGAGAGAGGAAGAGACAGAGAGG
Harlequin  BPC1  977    1000  +       22.58  GAGTCAAGGAGAGAGAAAGAGAGA
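
The hits above overlap heavily, so the table can be sanity-checked against itself. The sketch below assumes the usual FIMO conventions: 1-based inclusive coordinates, and a minus-strand match reported as the reverse complement of the plus-strand bases. It stitches three plus-strand BPC1 hits into one segment and checks the BPC5 and BPC6 rows against it. The helper names (`revcomp`, `stitch`, `slice_1based`) are illustrative and not part of FIMO.

```python
# Rows copied from the table: (TFBS, start, end, strand, matched sequence).
PLUS_HITS = [
    ("BPC1", 977, 1000, "+", "GAGTCAAGGAGAGAGAAAGAGAGA"),
    ("BPC1", 989, 1012, "+", "GAGAAAGAGAGAGGCAGAGAGAGA"),
    ("BPC1", 1001, 1024, "+", "GGCAGAGAGAGAGGAAGAGACAGA"),
]
BPC5 = ("BPC5", 994, 1023, "+", "AGAGAGAGGCAGAGAGAGAGGAAGAGACAG")
BPC6 = ("BPC6", 987, 1007, "-", "CTCTGCCTCTCTCTTTCTCTC")


def revcomp(seq):
    """Reverse complement of a DNA string (N stays N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def stitch(hits):
    """Merge overlapping plus-strand hits into {position: base}.

    Raises ValueError if two hits disagree about the base at a position.
    """
    bases = {}
    for _, start, end, _, seq in hits:
        # Assumed convention: coordinates are 1-based and inclusive.
        assert end - start + 1 == len(seq)
        for offset, base in enumerate(seq):
            pos = start + offset
            if bases.setdefault(pos, base) != base:
                raise ValueError(f"conflict at position {pos}")
    return bases


def slice_1based(bases, start, end):
    """Plus-strand bases for the 1-based inclusive interval [start, end]."""
    return "".join(bases[p] for p in range(start, end + 1))


segment = stitch(PLUS_HITS)

# Plus-strand hit: the matched sequence should equal the stitched bases.
bpc5_ok = slice_1based(segment, BPC5[1], BPC5[2]) == BPC5[4]

# Minus-strand hit: the matched sequence should equal the reverse complement.
bpc6_ok = revcomp(slice_1based(segment, BPC6[1], BPC6[2])) == BPC6[4]

print(bpc5_ok, bpc6_ok)
```

If both checks pass, the coordinates and strand column are being read consistently with the assumptions stated above.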


TFBS enrichment in GRCh38

Use Fisher's exact test to perform enrichment analysis of transcription factor binding sites within GRCh38 copies of the TE family.
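
As a rough illustration of this kind of test, the sketch below computes a one-sided Fisher's exact p-value for a 2x2 table using only the Python standard library. The layout of the table (TE copies versus background regions, with or without a binding site), the function name `fisher_enrichment_p` and the counts are assumptions made for the example; the page does not state how the background set is defined or which alternative hypothesis is used.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test for enrichment on the 2x2 table

        [[a, b],
         [c, d]]

    a: TE copies with a binding site     b: TE copies without
    c: background regions with a site    d: background regions without

    Returns P(X >= a) under the hypergeometric null (fixed margins).
    """
    row1 = a + b            # number of TE copies
    col1 = a + c            # number of regions with a binding site
    n = a + b + c + d       # all regions
    total = comb(n, row1)
    upper = min(row1, col1)
    tail = sum(
        comb(col1, x) * comb(n - col1, row1 - x)
        for x in range(a, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only (not taken from the database).
p_value = fisher_enrichment_p(12, 38, 40, 910)
print(f"one-sided p = {p_value:.3g}")
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for moderately sized tables; for very large counts a statistics library would be the more practical choice.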




GTEx

The promoter activity across 46 body sites from The Genotype-Tissue Expression (GTEx) project.
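
The Tau index listed under the basic information (0.9729) is a common way to summarise how tissue-specific such a profile is. The page does not say how or from which values it was computed; the sketch below assumes the widely used definition, tau = sum(1 - x_i / x_max) / (n - 1), applied to non-negative per-tissue activity values. The function name `tau_index` and the example values are hypothetical.

```python
def tau_index(values):
    """Tau tissue-specificity index under the assumed standard definition.

    tau = sum(1 - x_i / x_max) / (n - 1)
    0 means equal activity in all tissues; 1 means activity in one tissue only.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two tissues")
    if min(values) < 0:
        raise ValueError("values must be non-negative")
    x_max = max(values)
    if x_max == 0:
        raise ValueError("at least one value must be positive")
    return sum(1 - x / x_max for x in values) / (n - 1)


# Hypothetical activity values for four tissues (not GTEx data).
example = [0.1, 0.0, 8.5, 0.2]
print(round(tau_index(example), 4))
```

Under this definition a value close to 1, such as the one reported here, would indicate activity concentrated in very few tissues.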




TCGA

The promoter activity across 33 cancer types from The Cancer Genome Atlas (TCGA).