HERVH

Basic information

DF ID: DF0000183
TE superfamily: ERV1
TE class: LTR
Species: Catarrhini
Length: 7713 bp
Kimura value: 5.61
Tau index: 0.8930
Description: Internal region of ERV1 endogenous retrovirus, HERVH subfamily
Comment: The long terminal repeat associated with HERVH is LTR7. The putative env gene, consisting of about 1800 base pairs, has two open reading frames interrupted by a termination codon. The amino acid sequence of this region shows significant homology to those of other retroviral envelope proteins and contains eight potential glycosylation sites. It is estimated that there are about 100 copies of RTVL-H (an earlier name for HERVH) elements containing the env gene per haploid human genome. Note that this estimate is distinct from simply tallying all hits (long and short) from this model.
Sequence
TTTGGTGCCGTGACTCGGATCGGGGGACCTCCCTTGGGAGATCAATCCCCTGTCCTCCTGCTCTTTGCTCCGTGAGAAAGATCCACCTACGACCTCAGGTCCTCAGACCGACCAGCCCAAGGAACATCTCACCAATTTCAAATCCGGTAAGCGGCCTCTTTTTACTCTCTTCTCCAACCTCCCTCACTATCCCTCAACCTCTTTCTCCTTTCAATCTTGGCGCCACACTTCAATCTCTCCCTTCTCTTAATTTCAATTCCTTTCATTTTCTGGTAGAGACAAAGGAGACACGTTTTATCCGTGGACCCAAAACTCCGGCGCCGGTCACGGACTGGGAAGGCAGCCTTCCCTTGGTGTTTAATCATTGCAGGGACGCCTCTCTGATTATTCACCCACGTTTCAGAGGTGTCAGACCACGCAGGGACGCCTGCCTTGGTCCTTCACCCTTAGCGGCAAGTCCCGCTTTTCTGGGGGAGGGGCAAGTACCCCAACCCCTTCTCTCCGTGTCTCTACCCCTTCTCCGCTTTTCTGGGGNAGGGGCAAGNACCCCTCAACCCCTTCTCCTTCACCCTTAGCGGCAAGTCCCGCTTTTCTAGGGGGGCAAGAACCCCCAACCCCTTATTTCCGCGCCCCGACCTCTTATCTCTGCGCCCCAATCCCTTATTTCCGCGCCCCGACCTCTTATCTCTGCGCCCCGATCCCTTATTTCCGCGCCCCGACCTCTTATCTCTGCGCCCCAACCCCTTATTTCCGTGCCCCGACCCCTTTCCCGCTTTTCTGGAGGGCAAGAACCCCCGAACCCCTTCCCTCCGTGTCTCTACTCTCTCTTTTCTCTGGGCTTGCCTCCTTCACTATGGGCAACCTTCCACCCTCCATTCCTCCTTCTCCCTTAGCCTGTGTTCTCAAGAACTTAAAACCTCTTCAACTCACACCTGACCTAAAACCTAAACGCCTTATTTTCTTCTGCAATGCCGCTTGACCCCAATACAAACTCGACAGTAGTTCCAAATAGCCAGAAAACGGCACTTTCAATTTTTCCATCCTACAAGATCTAAATAATTCTTGTCGTAAAATGGGCAAACGGTCTGAGGTGCCTGACGTCCAGGCATTCTTTTACACATCGGTCCCTCCCTAGTCTCTGTNCCCAGTGCAACTCGTCCCAAATCTTCCTTCTTTCCCTCCCGCCTGTCCCCTCAGTCCCAACCCCAAGCGTCGCTGAGTCTTTCTAATCTTCCTTTTCTACAGACCCATCTGACCTCTCCCCTCCTCGCCAGGCCGAGCTAGGTCCCAATTCTTCCTCAGCCTCCGCTCCTCCACCCTATAATCCTTTTATCACCTCCCCTCCTCACACCCGGTCCGGCTTACAGTTTCGTTCCGTGACTAGCCCTCCCCCACCTGCCCAGCAATTTACTCTTAAAAAGGTGGCTGGAGCTAAAGGCATAGTCAAGGTTAATGCTCCTTTTTCTTTATCCGNNNNNTCCCAAATCAGATAGCGTTTAGGCTCTTTTTCATCAAATATAAAAANCCCAGCCCAGTTCATGGCTCGTTTGGCAGCAACCCTGAGACGCTTTACAGCCCTAGACCCTAAAAGGTCAAAAGGCCGTCTTATTCTCAATATACATTTTATTACCCAATCTGCTCCCGACATTAAATAAAACTCCAAAAATTAAATTCCGGCCCTCAAACCCCACAACAGGACTTAATTAACCTCGCCTTCAAGGTGTACAATAATAGAAAAAAGTTGCAATTCCTTGCCTCCACTGTGAGACAAACCCCAGCCACATCTCCAGCACACAAGAACTTCCAAACGCCTGAACCGCAGCGGCCAGGCGTTCCTCCAGAACCTCCTCCCCCAGGAGCTTGCTACAAGTGCCGGAAATCTGGCCACCGGGCCAAGGAATGCCCGCAGCCCGGGATTCCTCCTAAGCCGCGTCCCATCTGTGCGGGACCCCACTGGAAATCGGACTGTTCAACTCACCTGGCAGCCACTCCCAGAGCCC
CTGGAACTCTGGCCCAAGGCTCTCTGACTGACTCCTTCCCAGATCTTCTCGGCTTAGCGGCTGAAGACTGACGCTGCCCGATCGCCTCGGAAGCCCCCTGGACCATCACGGACGCTTCGGGTAACTCTCACAGTGGAGGGTAAGTCCGTCCCCTTCTTAATCAATACGGAGGCTACCCACTCCACATTACCTTCTTTTCAAGGGCCTGTTTCCCTTGCCTCCATAACTGTTGTGGGTATTGACGGCCAGGCTTCTAAACCTCTTAAAACTCCCCAACTCTGGTGCCAACTTAGACAATACTCTTTTAAGCACTCCTTTTTAGTTATCCCCACCTGCCCAGTTCCCTTATTAGGCCGAGACACTTTAACTAAATTATCTGCTTCCCTGACTATTCCTGGACTACAGCCACATCTCATTGCCGCCCTTCTCCCCAATCCAAAGCCTCCTTCGCGTCCTCTTGTATCCCCCCACCTTAACCCACAAGTATAAGATACCTCTACTCCCTCCTTGGCGACCGATCATGCACCCCTTACCATCTCATTAAAACCTAATCACCCTTACCCCGCTCAACGCCAATATCCCATCCCACAGCACGCTTTAAAAGGATTAAAGCCTGTTATCACTCGCCTGCTACAGCATGGCCTTTTAAAGCCTATAAACTCTCCTTACAATTCCCCCATTTTACCTGTCCTAAAACCAGACAAGCCTTACAGGTTCAGGATCTGCGCCTTATCAACCAAATTGTTTTGCCTATCCACCCCGTGGTGCCAAACCCATATACTCTCCTATCCTCAATACCTCCCTCCACAACCCATTATTCTGTTCTGGATCTCAAACATGCTTTCTTTACTATTCCTTTGCACCCTTCATCCCAGCCTCTCTTCGCTTTCACTTGGACTGACCCTGACACCCATCAGGCTCAGCAAATTACCTGGGCTGTACTGCCGCAAGGCTTCACAGACAGCCCCCATTACTTCAGTCAAGCCCTTCCTCATGATTTACTTTCTTTCCACCCCTCCGCTTCTCACCTTATTCAATATATTGATGACCTTCTNCTTTGTAGCCCCTCCTTTGAATCTTCTCAACAAGACACNCTNCTGCTCCTTCANCATTTATTCTCCAAAGGATATCCCCCTCCAAAGCCCAAATTTCTTCCTCATCTGTTACCTATCTCGGCATAATTCTTCATAAAAACACACGTGCTCTCCCTGCCGATCGTGTCCGACTGATCTCTCAAACCCCAACCCCTTCTACAAAACAACAACTCCTTTCCTTCCTAGGCATGGTTGGATACTTTCGCCTTTGGATACCTGGTTTTGCCATCCTAACAAAACCATTATATAAACTCACAAAAGGAAACCTAGCTGACCCCATAGATCCTAAATCCTTTCCCCACTCCTCTTTCCGTTCCTTGAAGACAGCTTTAGAGACTGCCCCCACCCTAGCTCTCCCTGACTCATCCCAACCCTTTTCATTACACACAGCCGAAGTGCAGGGCTGTGCAGTCGGAATTCTTACACAAGGACCGGGACCGCGCCCTGTAGCCTTTTTGTCCAAACAACTTGACCTTACTGTTTTAGGCTGGCCATCATGTCTCCGTGCGGCGGCTGCCGCCGCCCTAATACTTTTAGAGGCCCTCAAAATCACAAACTATGCTCAACTCACTCTCTACAGTTCTCATAACTTCCAAAATCTATTTTCTTCCTCACACCTGACGCATATACTTTCTGCTCCCCGGCTCCTTCAGCTGTACTCACTCTGTTGAGTCTCCCACAATTACCATTGTTCCTGGCCCGGACTTCAATCCGGCCTCCCACATTATTCCTGATACCACACCTGACCCCCATGACTGTATCTCTCTGATCCACCTGACATTCACCCCATTTCCCCATATTTCCTTCTTTCCTGTTCCTCACCCTGATCACACTTGGTTTATTGATGGCAGTTCCACCAGGCCTAATCGCCACACACCAGCAAAGGCAGGCTATGCTATGAACTCG
TTGCCTTAACTCGAGCCCTCACTCTTGCAAAGGGACTACGCGTCAATATTTATACTGACTCTAAATATGCCTTCCATATCCTGCACCACCATGCTGTTATATGGGCTGAAAGAGGTTTCCTCACTACGCAAGGGTCCTCCATCATTAATGCCTCTTTAAAAAAACTCTTCTCAAGGCCGCTTTACTTCCAAAGGAAGCTGGAGTCATTCACTGCAAGGGCCATCAAAAGGCATCAGATCCCATCGCTCAGGGCAACGCTTATGCTGATAAGGTAGCTAAAGAAGCAGCTAGCGTTCCAACTTCTGTCCCTCACGGCCAGTTTTTCTCCTTCTCATCGGTCACTCCCACCTACTCCCCCGCTGAAACTTCCACCTATCAATCTCTTCCCACACAAGGCAAATGGTTCTTGGACCAAGGAAAATATCTCCTTCCAGCCTCACAGGCCCATTCTATTCTGTCGTCATTTCATAACCTCTTCCATGTAGGTTACAAGCCGCTAGCCCGCCTCTTAGAACCTCTCATTTCCTTTCCATCGTGGAAATCTATCCTCAAGGAAATCACTTCTCAGTGTTCCATCTGCTATTCTACTACTCCTCAGGGATTGTTCAGGCCCCCTCCCTTCCCTACACATCAAGCTCGGGGATTTGCCCCCGCCCAGGACTGGCAAATTGACTTTACTCACATGCCCCGAGTCAGGAAACTAAAATACCTCTTGGTCTGGGTAGACACTTTCACTGGATGGGTAGAGGCCTTTCCCACAGGGTCTGAGAAGGCCACCGCGGTCATTTCTTCCCTTCTGTCAGACATAATTCCTCGGTTTGGCCTTCCCACCTCTATACAGTCCGATAACGGACCGGCCTTTATTAGTCAAATCAGCCAAGCAGTTTCTCAGGCTCTTGGTATTCAGTGAAACCTTTATATCCCTTACGGTCCTCAGTCTTCAGGAAAGGTAGAACGGACTAATGGTCTTTTAAAAACACACCTCACCAAGCTCAGCCACCAACTTAAAAAGGACTGGACAATACTTTTACCACTTTCCCTTCTCAGAATTCGGGCCTGTCCTCGGAATGCTACAGGGTACAGCCCATTTGAGCTCCTGTATGGACGCTCCTTTTTATTAGGCCCCAGTCTCATTCCAGACACCAGACCTCTAGGCGACTATCTTCCAGTCCTCCAGCAGGCTAGACAGGAAATTCGCCAGGCTGCTAATCTTCTCTTGCCTACTCCAGATCCCCAGCCATATGAAGACACCCTAGCTGGACGATCAGTTCTTGTTAAGAATCTGACCCCTCAAACTCTACAACCTCGATGGACCGGACCCTACTTAGTCATCTATAGTACCCCGACTGCCGTCCGCCTGCAGGATCCTCCCCACTGGGTTCACCGTTCCAGAATAAAGCTGTGTCCGTCGGACAGCCAGCCTAATCCCTCCTCTTCCTCCTGGAAGTCGCAAGTACTCTCCCCTACTTCCCTTAAACTCACTCGCATTTCTGAAGAACAGTAATAACCCTTATGAGCCTAATACATCCCTTCATTCTATTAGGTCTGTTCGTCCTTACCCTACTTTTTGCAACAGGGCTTTACGNAGTCACCCCCACCACTTGGACCGAGCCCCAAAAAACTTGTCATCCCTACTATCTTCTGTCTAGTCATACTCCTATTCNCCGTTCTCAACTACTCATAAATGCCCTACTCTTGTTTACACTGCCGGTTTACACTGTTTCTCCAAGCCATCACAGCTGATATCTCCTGGTGCTATCCCCAAACCGCCACTCTTAACTCCCTCTTAGAGTGGATAGATGATCTTTGCTGGCAGGGCACCCTCCAATACTTTCACCCTGATGAAGTTCTATTCTTTACTTTTATACTCACTCTTATTCTCATTCCCATTCTTATGCCACCCTCTACCTCTCCCCAGCTATCTCCACCACACTATCAACCTTACCCATTCTCTCCTAGCCGTTTCTAATCCCTCCTTAGCGAACAACTGCTGGCTTTGC
ATTTCCCTTTCTTCCAGCGCCTACACAGCTGTCCCCGCCTTACANACAGACTGGGCAACATCTCCTGTCTCCCTACACCTCCGAACTTCCTTTAACAGCCCTCACCTTTACCCTCCTGAAGAACTCATTTACTTTCTAGACAGGTCCAGCAAGACCTCCCCAGACATTTCACATCAGCAAGCTGCCGCCCTCCTCCGCACTTACTTAAAAAACCTTTCTCCTTATATCAACTCTACTCCCCCCATATTTGGACCTCTCACAACACAAACTACTATTCCTGTGGCCGCTCCTTTATGTATCTCTCGGCAAAGACCCACTGGAATTCCCCTAGGTAACCTTTCACCTTCTCGATGTTCCTTTACTCTTCATCTCCGAAGCCCAACTACACACATCACTGAAACAATTGGAGCCTTCCAGCTCCATATTACAGACAAGCCCTCTATCAATACTGGCAAACTTAAAAACATTAGCAGTAATTATTGCTTAGGAAGACACTTACCCTGTATTTCACTCCATCCTTGGCTACCTTCCCCTTGCTCGTCAGACTCTCCTCCCAGGCCCTCTTCTTGTTTACTTATACCCAGCCCCGAAAATAACAGTGAAAGGTTGCTCGTAGATACTCAACGTTTTCTCATACACCATGAAAATCGAACCTCCCCCTCTACGCAGTTACCCCATCAGTCCCCATTACAACCTCTGACGGCTGCCGCCCTAGCTGGATCCCTAGGAGTCTGGGTACAAGACACCCCTTTCAGCACTCCTTCTCATCTTTTTACTTTGCATCTCCAGTTTTGCCTCGCACAAGGTCTCTTCTTCCTCTGTGGATCCTCTACCTACATGTGTCTACCTGCTAATTGGACAGGCACATGCACACTAGTTTTCCTTACCCCCAAAATTCAATTTGCAAATGGGACCGAAGAGCTCCCTGTTCCCCTCATGACACCGACACGACAAAAAAGAGTTATTCCACTAATTCCCTTGCTNGTCGGTTTAGGACTTTCTGCCTCCACTATTGCTCTCGGTACTGGAATAGCAGGCATTTCAACCTCTGTCACGACCTTCCGTAGCCTCTCTAATGACTTCTCTGCTAGCATCACAGACATATCACAAACTTTATCAGTCCTCCAGGCCCAAGTTGACTCTTTAGCTGCAGTTGTCCTCCAAAACCGCCGAGGCCTCGACTTACTCACTGCTGAAAAAGGAGGACTCTGTATATTCTTAAATGAAGAGTGTTGTTTTTACCTAAATCAATCTGGCCTGGTGTATGACAACATAAAAAAACTCAAGGATAGAGCCCAAAAACTCGCCAACCAAGCAAGTAATTACGCTGAACCCCCTTGGGCACTCTCTAATTGGATGTCCTGGGTCCTCCCAATTCTTAGTCCTTTAATACCCGTTTTTCTCCTTCTCTTATTCGGACCTTGTGTCTTCCGTTTAGTTTCTCAATTCATNCAAAACCGTATCCAGGCCATCACCAATCATTCTATACGACAAATGCTCCTTCTAACAACCCCACAATATCACCCCTTACCACAAAATCTTCCTTCAGCTTAATCTCTCCCACTCTAGGTTCCCACGCCGCCCCTAATCCCGCTCGAAGCAGCCCTGAGAAACATCGCCCATTATCTCTCNNCATACCACCCCCCAAAAATTTTCGCCGCCCCAACACTTCANCACTATTTTATTTTTCTTATTAATATAAGAAGACAGGAA



TF motifs of the consensus sequence

Transcription factor motifs detected with FIMO in the consensus sequence of the TE family.

TE_family  TFBS    Start  End   Strand  Score  Matched sequence
HERVH      KLF15   4649   4656  +       16.27  CCCCGCCC
HERVH      ZNF816  1180   1194  -       16.25  AGGGGACAGGCGGGA
HERVH      RVE6    4418   4426  -       16.23  AGATATTTT
HERVH      RVE5    4418   4426  -       16.21  AGATATTTT
HERVH      DREB2F  3608   3618  +       16.17  CTGCCGCCGCC
HERVH      Sox7    3780   3789  -       16.17  GGAACAATGG
HERVH      ZFP14   5427   5441  -       16.16  GGAGGAAGAGGAGGG
HERVH      KLF14   4649   4657  -       16.15  TGGGCGGGG
HERVH      KLF5    471    480   -       16.10  GCCCCTCCCC
HERVH      MAZ     4612   4619  +       16.08  CCCCTCCC
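
The hits listed above can be loaded and cross-checked with a few lines of Python. The following is a minimal sketch, not the database's own pipeline. It assumes the columns are whitespace-separated, that Start and End are 1-based inclusive coordinates on the consensus, and that minus-strand matches are reported as read on the minus strand (the reverse complement of the consensus). The names `Hit`, `parse_hits`, `plus_strand_sequence`, and `overlapping_pairs` are hypothetical helpers introduced for illustration.

```python
from collections import namedtuple

# One record per table row; field names are illustrative, not from FIMO.
Hit = namedtuple("Hit", "family tf start end strand score seq")

# Rows copied from the table above.
ROWS = """\
HERVH KLF15 4649 4656 + 16.27 CCCCGCCC
HERVH ZNF816 1180 1194 - 16.25 AGGGGACAGGCGGGA
HERVH RVE6 4418 4426 - 16.23 AGATATTTT
HERVH RVE5 4418 4426 - 16.21 AGATATTTT
HERVH DREB2F 3608 3618 + 16.17 CTGCCGCCGCC
HERVH Sox7 3780 3789 - 16.17 GGAACAATGG
HERVH ZFP14 5427 5441 - 16.16 GGAGGAAGAGGAGGG
HERVH KLF14 4649 4657 - 16.15 TGGGCGGGG
HERVH KLF5 471 480 - 16.10 GCCCCTCCCC
HERVH MAZ 4612 4619 + 16.08 CCCCTCCC
"""

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def parse_hits(text):
    """Parse whitespace-separated rows into Hit records."""
    hits = []
    for line in text.splitlines():
        if not line.strip():
            continue
        family, tf, start, end, strand, score, seq = line.split()
        hits.append(Hit(family, tf, int(start), int(end), strand, float(score), seq))
    return hits


def plus_strand_sequence(hit):
    """Return the matched sequence as it reads on the plus strand.

    Assumption: minus-strand matches are given as the reverse complement.
    """
    if hit.strand == "-":
        return hit.seq.translate(COMPLEMENT)[::-1]
    return hit.seq


def overlapping_pairs(hits):
    """Yield pairs of hits whose 1-based inclusive intervals overlap."""
    ordered = sorted(hits, key=lambda h: h.start)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start > a.end:
                break  # sorted by start, so no later hit can overlap a
            yield a, b


hits = parse_hits(ROWS)
for a, b in overlapping_pairs(hits):
    print(a.tf, b.tf, plus_strand_sequence(a), plus_strand_sequence(b))
```

On the ten rows above this reports two overlapping pairs: RVE6/RVE5, which share the same interval, and KLF15/KLF14, whose matches cover the same GC-rich site on opposite strands.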


TFBS enrichment in GRCh38

Enrichment of transcription factor binding sites in copies of the TE family in GRCh38, assessed with Fisher's exact test.
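
As an illustration of how such an enrichment test can be set up, here is a minimal sketch of a one-sided Fisher's exact test using only the Python standard library. The page does not show the contingency table it uses, so the layout below (TE-family copies versus background regions, with or without a binding site) is an assumption, and the counts in the usage lines are hypothetical placeholders, not results from the database. The function names are illustrative.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact test p-value for enrichment.

    Assumed 2x2 layout (not taken from the source page):
        a = TE-family copies with a binding site
        b = TE-family copies without a binding site
        c = background regions with a binding site
        d = background regions without a binding site

    Returns the hypergeometric probability of observing a or more
    TE-family copies with a site, given the table margins.
    """
    row1 = a + b          # all TE-family copies
    col1 = a + c          # all regions carrying a binding site
    n = a + b + c + d     # all regions
    total = comb(n, col1)
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / total


def odds_ratio(a, b, c, d):
    """Sample odds ratio; infinite when b * c is zero."""
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only.
p_value = fisher_enrichment_p(30, 70, 200, 9700)
print(odds_ratio(30, 70, 200, 9700), p_value)
```

The one-sided form is used here because enrichment asks only whether binding sites are over-represented in the TE family. When many transcription factors are tested, the resulting p-values would normally also need a multiple-testing correction.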




GTEx

Promoter activity across 46 body sites from the Genotype-Tissue Expression (GTEx) project.
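
The Tau index listed under the basic information is not defined on this page. If it follows the commonly used tissue-specificity definition, tau = sum(1 - x_i / x_max) / (n - 1) over n tissues, it could be computed from a per-tissue activity profile such as the one summarized here. The sketch below works under that assumption. The function name `tau_index` and the example profiles are hypothetical and are not GTEx data.

```python
def tau_index(values):
    """Tissue-specificity index in [0, 1].

    0 means equal activity in every tissue; 1 means activity in a single
    tissue. Assumes non-negative activity values, one per tissue.
    """
    n = len(values)
    if n < 2:
        raise ValueError("tau needs at least two tissues")
    if any(v < 0 for v in values):
        raise ValueError("activity values must be non-negative")
    peak = max(values)
    if peak == 0:
        return 0.0  # no activity anywhere; treated as non-specific here
    return sum(1 - v / peak for v in values) / (n - 1)


# Hypothetical profiles, for illustration only.
print(tau_index([5.0, 5.0, 5.0, 5.0]))   # uniform activity
print(tau_index([0.0, 0.0, 0.0, 8.0]))   # activity in one tissue only
```

Under this definition a value close to 1, such as the 0.8930 reported above, would indicate activity concentrated in a small number of tissues.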




TCGA

Promoter activity across 33 cancer types from The Cancer Genome Atlas (TCGA).