HERVH


DF ID: DF0000183
TE superfamily: ERV1
TE class: LTR
Species: Catarrhini
Length: 7713
Kimura value: 5.61
Tau index: 0.8930
Description: Internal region of ERV1 endogenous retrovirus, HERVH subfamily
Comment: The long terminal repeat associated with HERVH is LTR7. The putative env gene, consisting of about 1800 base pairs, has two open reading frames interrupted by a termination codon. The amino acid sequence of this region shows significant homology to those of other retroviral envelope proteins and contains eight potential glycosylation sites. It is estimated that there are about 100 copies of RTVL-H elements containing the env gene per haploid human genome. Note that this estimate is distinct from simply tallying all hits (long and short) from this model.
Sequence
TTTGGTGCCGTGACTCGGATCGGGGGACCTCCCTTGGGAGATCAATCCCCTGTCCTCCTGCTCTTTGCTCCGTGAGAAAGATCCACCTACGACCTCAGGTCCTCAGACCGACCAGCCCAAGGAACATCTCACCAATTTCAAATCCGGTAAGCGGCCTCTTTTTACTCTCTTCTCCAACCTCCCTCACTATCCCTCAACCTCTTTCTCCTTTCAATCTTGGCGCCACACTTCAATCTCTCCCTTCTCTTAATTTCAATTCCTTTCATTTTCTGGTAGAGACAAAGGAGACACGTTTTATCCGTGGACCCAAAACTCCGGCGCCGGTCACGGACTGGGAAGGCAGCCTTCCCTTGGTGTTTAATCATTGCAGGGACGCCTCTCTGATTATTCACCCACGTTTCAGAGGTGTCAGACCACGCAGGGACGCCTGCCTTGGTCCTTCACCCTTAGCGGCAAGTCCCGCTTTTCTGGGGGAGGGGCAAGTACCCCAACCCCTTCTCTCCGTGTCTCTACCCCTTCTCCGCTTTTCTGGGGNAGGGGCAAGNACCCCTCAACCCCTTCTCCTTCACCCTTAGCGGCAAGTCCCGCTTTTCTAGGGGGGCAAGAACCCCCAACCCCTTATTTCCGCGCCCCGACCTCTTATCTCTGCGCCCCAATCCCTTATTTCCGCGCCCCGACCTCTTATCTCTGCGCCCCGATCCCTTATTTCCGCGCCCCGACCTCTTATCTCTGCGCCCCAACCCCTTATTTCCGTGCCCCGACCCCTTTCCCGCTTTTCTGGAGGGCAAGAACCCCCGAACCCCTTCCCTCCGTGTCTCTACTCTCTCTTTTCTCTGGGCTTGCCTCCTTCACTATGGGCAACCTTCCACCCTCCATTCCTCCTTCTCCCTTAGCCTGTGTTCTCAAGAACTTAAAACCTCTTCAACTCACACCTGACCTAAAACCTAAACGCCTTATTTTCTTCTGCAATGCCGCTTGACCCCAATACAAACTCGACAGTAGTTCCAAATAGCCAGAAAACGGCACTTTCAATTTTTCCATCCTACAAGATCTAAATAATTCTTGTCGTAAAATGGGCAAACGGTCTGAGGTGCCTGACGTCCAGGCATTCTTTTACACATCGGTCCCTCCCTAGTCTCTGTNCCCAGTGCAACTCGTCCCAAATCTTCCTTCTTTCCCTCCCGCCTGTCCCCTCAGTCCCAACCCCAAGCGTCGCTGAGTCTTTCTAATCTTCCTTTTCTACAGACCCATCTGACCTCTCCCCTCCTCGCCAGGCCGAGCTAGGTCCCAATTCTTCCTCAGCCTCCGCTCCTCCACCCTATAATCCTTTTATCACCTCCCCTCCTCACACCCGGTCCGGCTTACAGTTTCGTTCCGTGACTAGCCCTCCCCCACCTGCCCAGCAATTTACTCTTAAAAAGGTGGCTGGAGCTAAAGGCATAGTCAAGGTTAATGCTCCTTTTTCTTTATCCGNNNNNTCCCAAATCAGATAGCGTTTAGGCTCTTTTTCATCAAATATAAAAANCCCAGCCCAGTTCATGGCTCGTTTGGCAGCAACCCTGAGACGCTTTACAGCCCTAGACCCTAAAAGGTCAAAAGGCCGTCTTATTCTCAATATACATTTTATTACCCAATCTGCTCCCGACATTAAATAAAACTCCAAAAATTAAATTCCGGCCCTCAAACCCCACAACAGGACTTAATTAACCTCGCCTTCAAGGTGTACAATAATAGAAAAAAGTTGCAATTCCTTGCCTCCACTGTGAGACAAACCCCAGCCACATCTCCAGCACACAAGAACTTCCAAACGCCTGAACCGCAGCGGCCAGGCGTTCCTCCAGAACCTCCTCCCCCAGGAGCTTGCTACAAGTGCCGGAAATCTGGCCACCGGGCCAAGGAATGCCCGCAGCCCGGGATTCCTCCTAAGCCGCGTCCCATCTGTGCGGGACCCCACTGGAAATCGGACTGTTCAACTCACCTGGCAGCCACTCCCAGAGCCC
CTGGAACTCTGGCCCAAGGCTCTCTGACTGACTCCTTCCCAGATCTTCTCGGCTTAGCGGCTGAAGACTGACGCTGCCCGATCGCCTCGGAAGCCCCCTGGACCATCACGGACGCTTCGGGTAACTCTCACAGTGGAGGGTAAGTCCGTCCCCTTCTTAATCAATACGGAGGCTACCCACTCCACATTACCTTCTTTTCAAGGGCCTGTTTCCCTTGCCTCCATAACTGTTGTGGGTATTGACGGCCAGGCTTCTAAACCTCTTAAAACTCCCCAACTCTGGTGCCAACTTAGACAATACTCTTTTAAGCACTCCTTTTTAGTTATCCCCACCTGCCCAGTTCCCTTATTAGGCCGAGACACTTTAACTAAATTATCTGCTTCCCTGACTATTCCTGGACTACAGCCACATCTCATTGCCGCCCTTCTCCCCAATCCAAAGCCTCCTTCGCGTCCTCTTGTATCCCCCCACCTTAACCCACAAGTATAAGATACCTCTACTCCCTCCTTGGCGACCGATCATGCACCCCTTACCATCTCATTAAAACCTAATCACCCTTACCCCGCTCAACGCCAATATCCCATCCCACAGCACGCTTTAAAAGGATTAAAGCCTGTTATCACTCGCCTGCTACAGCATGGCCTTTTAAAGCCTATAAACTCTCCTTACAATTCCCCCATTTTACCTGTCCTAAAACCAGACAAGCCTTACAGGTTCAGGATCTGCGCCTTATCAACCAAATTGTTTTGCCTATCCACCCCGTGGTGCCAAACCCATATACTCTCCTATCCTCAATACCTCCCTCCACAACCCATTATTCTGTTCTGGATCTCAAACATGCTTTCTTTACTATTCCTTTGCACCCTTCATCCCAGCCTCTCTTCGCTTTCACTTGGACTGACCCTGACACCCATCAGGCTCAGCAAATTACCTGGGCTGTACTGCCGCAAGGCTTCACAGACAGCCCCCATTACTTCAGTCAAGCCCTTCCTCATGATTTACTTTCTTTCCACCCCTCCGCTTCTCACCTTATTCAATATATTGATGACCTTCTNCTTTGTAGCCCCTCCTTTGAATCTTCTCAACAAGACACNCTNCTGCTCCTTCANCATTTATTCTCCAAAGGATATCCCCCTCCAAAGCCCAAATTTCTTCCTCATCTGTTACCTATCTCGGCATAATTCTTCATAAAAACACACGTGCTCTCCCTGCCGATCGTGTCCGACTGATCTCTCAAACCCCAACCCCTTCTACAAAACAACAACTCCTTTCCTTCCTAGGCATGGTTGGATACTTTCGCCTTTGGATACCTGGTTTTGCCATCCTAACAAAACCATTATATAAACTCACAAAAGGAAACCTAGCTGACCCCATAGATCCTAAATCCTTTCCCCACTCCTCTTTCCGTTCCTTGAAGACAGCTTTAGAGACTGCCCCCACCCTAGCTCTCCCTGACTCATCCCAACCCTTTTCATTACACACAGCCGAAGTGCAGGGCTGTGCAGTCGGAATTCTTACACAAGGACCGGGACCGCGCCCTGTAGCCTTTTTGTCCAAACAACTTGACCTTACTGTTTTAGGCTGGCCATCATGTCTCCGTGCGGCGGCTGCCGCCGCCCTAATACTTTTAGAGGCCCTCAAAATCACAAACTATGCTCAACTCACTCTCTACAGTTCTCATAACTTCCAAAATCTATTTTCTTCCTCACACCTGACGCATATACTTTCTGCTCCCCGGCTCCTTCAGCTGTACTCACTCTGTTGAGTCTCCCACAATTACCATTGTTCCTGGCCCGGACTTCAATCCGGCCTCCCACATTATTCCTGATACCACACCTGACCCCCATGACTGTATCTCTCTGATCCACCTGACATTCACCCCATTTCCCCATATTTCCTTCTTTCCTGTTCCTCACCCTGATCACACTTGGTTTATTGATGGCAGTTCCACCAGGCCTAATCGCCACACACCAGCAAAGGCAGGCTATGCTATGAACTCG
TTGCCTTAACTCGAGCCCTCACTCTTGCAAAGGGACTACGCGTCAATATTTATACTGACTCTAAATATGCCTTCCATATCCTGCACCACCATGCTGTTATATGGGCTGAAAGAGGTTTCCTCACTACGCAAGGGTCCTCCATCATTAATGCCTCTTTAAAAAAACTCTTCTCAAGGCCGCTTTACTTCCAAAGGAAGCTGGAGTCATTCACTGCAAGGGCCATCAAAAGGCATCAGATCCCATCGCTCAGGGCAACGCTTATGCTGATAAGGTAGCTAAAGAAGCAGCTAGCGTTCCAACTTCTGTCCCTCACGGCCAGTTTTTCTCCTTCTCATCGGTCACTCCCACCTACTCCCCCGCTGAAACTTCCACCTATCAATCTCTTCCCACACAAGGCAAATGGTTCTTGGACCAAGGAAAATATCTCCTTCCAGCCTCACAGGCCCATTCTATTCTGTCGTCATTTCATAACCTCTTCCATGTAGGTTACAAGCCGCTAGCCCGCCTCTTAGAACCTCTCATTTCCTTTCCATCGTGGAAATCTATCCTCAAGGAAATCACTTCTCAGTGTTCCATCTGCTATTCTACTACTCCTCAGGGATTGTTCAGGCCCCCTCCCTTCCCTACACATCAAGCTCGGGGATTTGCCCCCGCCCAGGACTGGCAAATTGACTTTACTCACATGCCCCGAGTCAGGAAACTAAAATACCTCTTGGTCTGGGTAGACACTTTCACTGGATGGGTAGAGGCCTTTCCCACAGGGTCTGAGAAGGCCACCGCGGTCATTTCTTCCCTTCTGTCAGACATAATTCCTCGGTTTGGCCTTCCCACCTCTATACAGTCCGATAACGGACCGGCCTTTATTAGTCAAATCAGCCAAGCAGTTTCTCAGGCTCTTGGTATTCAGTGAAACCTTTATATCCCTTACGGTCCTCAGTCTTCAGGAAAGGTAGAACGGACTAATGGTCTTTTAAAAACACACCTCACCAAGCTCAGCCACCAACTTAAAAAGGACTGGACAATACTTTTACCACTTTCCCTTCTCAGAATTCGGGCCTGTCCTCGGAATGCTACAGGGTACAGCCCATTTGAGCTCCTGTATGGACGCTCCTTTTTATTAGGCCCCAGTCTCATTCCAGACACCAGACCTCTAGGCGACTATCTTCCAGTCCTCCAGCAGGCTAGACAGGAAATTCGCCAGGCTGCTAATCTTCTCTTGCCTACTCCAGATCCCCAGCCATATGAAGACACCCTAGCTGGACGATCAGTTCTTGTTAAGAATCTGACCCCTCAAACTCTACAACCTCGATGGACCGGACCCTACTTAGTCATCTATAGTACCCCGACTGCCGTCCGCCTGCAGGATCCTCCCCACTGGGTTCACCGTTCCAGAATAAAGCTGTGTCCGTCGGACAGCCAGCCTAATCCCTCCTCTTCCTCCTGGAAGTCGCAAGTACTCTCCCCTACTTCCCTTAAACTCACTCGCATTTCTGAAGAACAGTAATAACCCTTATGAGCCTAATACATCCCTTCATTCTATTAGGTCTGTTCGTCCTTACCCTACTTTTTGCAACAGGGCTTTACGNAGTCACCCCCACCACTTGGACCGAGCCCCAAAAAACTTGTCATCCCTACTATCTTCTGTCTAGTCATACTCCTATTCNCCGTTCTCAACTACTCATAAATGCCCTACTCTTGTTTACACTGCCGGTTTACACTGTTTCTCCAAGCCATCACAGCTGATATCTCCTGGTGCTATCCCCAAACCGCCACTCTTAACTCCCTCTTAGAGTGGATAGATGATCTTTGCTGGCAGGGCACCCTCCAATACTTTCACCCTGATGAAGTTCTATTCTTTACTTTTATACTCACTCTTATTCTCATTCCCATTCTTATGCCACCCTCTACCTCTCCCCAGCTATCTCCACCACACTATCAACCTTACCCATTCTCTCCTAGCCGTTTCTAATCCCTCCTTAGCGAACAACTGCTGGCTTTGC
ATTTCCCTTTCTTCCAGCGCCTACACAGCTGTCCCCGCCTTACANACAGACTGGGCAACATCTCCTGTCTCCCTACACCTCCGAACTTCCTTTAACAGCCCTCACCTTTACCCTCCTGAAGAACTCATTTACTTTCTAGACAGGTCCAGCAAGACCTCCCCAGACATTTCACATCAGCAAGCTGCCGCCCTCCTCCGCACTTACTTAAAAAACCTTTCTCCTTATATCAACTCTACTCCCCCCATATTTGGACCTCTCACAACACAAACTACTATTCCTGTGGCCGCTCCTTTATGTATCTCTCGGCAAAGACCCACTGGAATTCCCCTAGGTAACCTTTCACCTTCTCGATGTTCCTTTACTCTTCATCTCCGAAGCCCAACTACACACATCACTGAAACAATTGGAGCCTTCCAGCTCCATATTACAGACAAGCCCTCTATCAATACTGGCAAACTTAAAAACATTAGCAGTAATTATTGCTTAGGAAGACACTTACCCTGTATTTCACTCCATCCTTGGCTACCTTCCCCTTGCTCGTCAGACTCTCCTCCCAGGCCCTCTTCTTGTTTACTTATACCCAGCCCCGAAAATAACAGTGAAAGGTTGCTCGTAGATACTCAACGTTTTCTCATACACCATGAAAATCGAACCTCCCCCTCTACGCAGTTACCCCATCAGTCCCCATTACAACCTCTGACGGCTGCCGCCCTAGCTGGATCCCTAGGAGTCTGGGTACAAGACACCCCTTTCAGCACTCCTTCTCATCTTTTTACTTTGCATCTCCAGTTTTGCCTCGCACAAGGTCTCTTCTTCCTCTGTGGATCCTCTACCTACATGTGTCTACCTGCTAATTGGACAGGCACATGCACACTAGTTTTCCTTACCCCCAAAATTCAATTTGCAAATGGGACCGAAGAGCTCCCTGTTCCCCTCATGACACCGACACGACAAAAAAGAGTTATTCCACTAATTCCCTTGCTNGTCGGTTTAGGACTTTCTGCCTCCACTATTGCTCTCGGTACTGGAATAGCAGGCATTTCAACCTCTGTCACGACCTTCCGTAGCCTCTCTAATGACTTCTCTGCTAGCATCACAGACATATCACAAACTTTATCAGTCCTCCAGGCCCAAGTTGACTCTTTAGCTGCAGTTGTCCTCCAAAACCGCCGAGGCCTCGACTTACTCACTGCTGAAAAAGGAGGACTCTGTATATTCTTAAATGAAGAGTGTTGTTTTTACCTAAATCAATCTGGCCTGGTGTATGACAACATAAAAAAACTCAAGGATAGAGCCCAAAAACTCGCCAACCAAGCAAGTAATTACGCTGAACCCCCTTGGGCACTCTCTAATTGGATGTCCTGGGTCCTCCCAATTCTTAGTCCTTTAATACCCGTTTTTCTCCTTCTCTTATTCGGACCTTGTGTCTTCCGTTTAGTTTCTCAATTCATNCAAAACCGTATCCAGGCCATCACCAATCATTCTATACGACAAATGCTCCTTCTAACAACCCCACAATATCACCCCTTACCACAAAATCTTCCTTCAGCTTAATCTCTCCCACTCTAGGTTCCCACGCCGCCCCTAATCCCGCTCGAAGCAGCCCTGAGAAACATCGCCCATTATCTCTCNNCATACCACCCCCCAAAAATTTTCGCCGCCCCAACACTTCANCACTATTTTATTTTTCTTATTAATATAAGAAGACAGGAA



TF motifs of the consensus sequence

Use FIMO to detect transcription factor motifs in the consensus sequence of the TE family.

TE_family TFBS Start End Strand Score Matched sequence
HERVH ZNF281 470 479 + 20.07 GGGGGAGGGG
HERVH ZNF93 3605 3618 - 19.17 GGCGGCGGCAGCCG
HERVH EREB29 3610 3619 + 19.06 GCCGCCGCCC
HERVH ZNF148 470 479 - 18.97 CCCCTCCCCC
HERVH Zm00001d049364 3608 3618 + 18.57 CTGCCGCCGCC
HERVH Wt1 1386 1395 + 18.54 CCTCCCCCAC
HERVH PATZ1 470 480 + 18.42 GGGGGAGGGGC
HERVH NR2C2 1253 1266 - 18.39 GAGGGGAGAGGTCA
HERVH ZNF75A 4749 4760 + 17.93 GCCTTTCCCACA
HERVH KLF3 3530 3539 + 17.74 GACCGCGCCC
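
The rows above can be sanity-checked with a few lines of Python. The sketch below is illustrative only and is not part of the FIMO pipeline: it assumes that Start and End are 1-based inclusive coordinates and that the matched sequence of a minus-strand hit is reported as the reverse complement of the plus-strand consensus. The helper names (reverse_complement, plus_strand_sequence, coordinates_consistent) are hypothetical.

```python
# Rows copied from the FIMO table: (TFBS, start, end, strand, score, matched sequence).
FIMO_HITS = [
    ("ZNF281", 470, 479, "+", 20.07, "GGGGGAGGGG"),
    ("ZNF93", 3605, 3618, "-", 19.17, "GGCGGCGGCAGCCG"),
    ("EREB29", 3610, 3619, "+", 19.06, "GCCGCCGCCC"),
    ("ZNF148", 470, 479, "-", 18.97, "CCCCTCCCCC"),
    ("Zm00001d049364", 3608, 3618, "+", 18.57, "CTGCCGCCGCC"),
    ("Wt1", 1386, 1395, "+", 18.54, "CCTCCCCCAC"),
    ("PATZ1", 470, 480, "+", 18.42, "GGGGGAGGGGC"),
    ("NR2C2", 1253, 1266, "-", 18.39, "GAGGGGAGAGGTCA"),
    ("ZNF75A", 4749, 4760, "+", 17.93, "GCCTTTCCCACA"),
    ("KLF3", 3530, 3539, "+", 17.74, "GACCGCGCCC"),
]

# Translation table for complementing DNA; N stays N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def plus_strand_sequence(strand, matched):
    """Express a hit on the plus strand (assumes '-' hits are reverse-complemented)."""
    return matched if strand == "+" else reverse_complement(matched)


def coordinates_consistent(start, end, matched):
    """Check that a 1-based inclusive interval has the same length as the match."""
    return end - start + 1 == len(matched)


if __name__ == "__main__":
    for tf, start, end, strand, score, matched in FIMO_HITS:
        ok = coordinates_consistent(start, end, matched)
        print(tf, start, end, plus_strand_sequence(strand, matched), ok)
```

For instance, plus_strand_sequence("-", "CCCCTCCCCC") gives GGGGGAGGGG, the same 10-mer reported for ZNF281 at 470-479, so the ZNF281 and ZNF148 hits describe the same site read from opposite strands.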


TFBS enrichment in GRCh38

Use Fisher's exact test to assess whether transcription factor binding sites are enriched in copies of this TE family in GRCh38.
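
As a rough illustration of how such an enrichment test can be set up, the sketch below computes a one-sided Fisher's exact p-value from a 2x2 contingency table using only the Python standard library. The page does not state how the table is built, so the layout (binding sites inside versus outside the TE family, for one transcription factor versus all others), the counts in the usage example, and the function name fisher_exact_greater are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test p-value for enrichment.

    Assumed 2x2 layout (hypothetical, not taken from the database):

                          in TE family   outside TE family
        sites of this TF        a                b
        sites of other TFs      c                d

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1 = a + b            # all sites of this TF
    col1 = a + c            # all sites falling in the TE family
    n = a + b + c + d       # all sites
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for tables at least as extreme as observed.
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call.
    p = fisher_exact_greater(30, 970, 200, 98800)
    print(f"one-sided p-value: {p:.3e}")
```

The same calculation is available as scipy.stats.fisher_exact(table, alternative="greater") when SciPy is installed; the hand-rolled version is used here only to keep the example dependency-free.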




GTEx

The promoter activity across 46 body sites from The Genotype-Tissue Expression (GTEx) project.




TCGA

The promoter activity across 33 cancer types from The Cancer Genome Atlas (TCGA).