8556 Manual annotation (determined on a case-by-case basis) from the Havana project. Havana 1 {'multi_name' => 'GENCODE 24 Comprehensive gene set','colour_key' => '[biotype]','caption' => 'Genes (Comprehensive set from GENCODE 24)','name' => 'Comprehensive Gene Annotations from GENCODE 24','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8557 Sequences from various databases are matched to Ensembl transcripts using Exonerate. These are external references, or 'Xrefs'. DNA match 0 \N
8558 Proteins from the UniProtKB Swiss-Prot database, aligned to the genome by Havana. UniProt proteins 0 \N
8559 Xref mapping based on checksum equivalency Xref checksum 0 \N
8560 Alignment of human ESTs (expressed sequence tags) to the genome using the program Est2genome. ESTs are from dbEST Human EST (EST2genome) 0 {'type' => 'est'}
8561 Non-coding RNAs (ncRNAs) predicted using sequences from RFAM and miRBase. See article. ncRNAs 1 {'multi_name' => 'GENCODE 24 Comprehensive gene set','colour_key' => '[biotype]','caption' => 'Genes (Comprehensive set from GENCODE 24)','name' => 'Comprehensive Gene Annotations from GENCODE 24','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8562 Positions of ncRNAs (non-coding RNAs) from the Rfam database are shown. Initial BLASTN hits of genomic sequence to RFAM ncRNAs are clustered and filtered by E value. These hits are supporting evidence for ncRNA genes. RFAM ncRNAs 0 \N
8563 Positions of miRNAs along the genome are shown. A BLASTN of genomic sequence regions against miRBase sequences is performed, and hits are clustered and filtered by E value. Aligned genomic sequence is then checked for possible secondary structure using RNAFold. If evidence is found that the genomic sequence could form a stable hairpin structure the locus is used to create a miRNA gene model. The resulting BLAST hit is used as supporting evidence for the miRNA gene. See article. miRNAs (miRBase) 0 \N
8564 Proteins from the UniProtKB TrEMBL database, aligned to the genome by Havana. TrEMBL proteins 0 \N
8565 Positions of vertebrate mRNAs along the genome. mRNAs are from the European Nucleotide Archive database. Initial alignments are performed using TBLASTN of Genscan-predicted peptides against the European Nucleotide Archive mRNAs. Vertebrate cDNAs (ENA) 0 {'type' => 'cdna','default' => {'contigviewbottom' => 'stack'}}
8566 Annotation for this gene includes both automatic annotation from Ensembl and Havana manual curation, see article. Ensembl/Havana merge 1 {'multi_name' => 'GENCODE 24 Comprehensive gene set','colour_key' => '[biotype]','caption' => 'Genes (Comprehensive set from GENCODE 24)','name' => 'Comprehensive Gene Annotations from GENCODE 24','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8567 Transcript where the Ensembl genebuild transcript and the Vega manual annotation have the same sequence, for every base pair. See article. Ensembl/Havana merge 1 {'multi_name' => 'GENCODE 24 Comprehensive gene set','colour_key' => '[biotype]','caption' => 'Genes (Comprehensive set from GENCODE 24)','name' => 'Comprehensive Gene Annotations from GENCODE 24','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8568 Homo sapiens cDNAs from NCBI RefSeq and EMBL are aligned to the genome using Exonerate cdna2genome model. Human cDNAs (cdna2genome) 0 {'type' => 'cdna'}
8569 match Protein 0 \N
8570 Gene3D analysis as of interpro_scan.pl Gene3D 1 {'type' => 'domain'}
8571 Protein domains and motifs in the SUPERFAMILY database. Superfamily domains 1 {'type' => 'domain'}
8572 HMM-Panther families hmmpanther 1 {'type' => 'domain'}
8573 Protein coding sequences agreed upon by the Consensus Coding Sequence project, or CCDS. CCDS set 0 {'dna_align_feature' => {'do_not_display' => '1'},'type' => 'cdna','default' => {'contigviewbottom' => 'normal'}}
8574 Identification of peptide low complexity sequences by Seg. Low complexity (Seg) 1 \N
8575 Prediction of signal peptide cleavage sites by SignalP. Cleavage site (Signalp) 1 \N
8576 Protein domains and motifs in the Pfam database. Pfam domain 1 {'type' => 'domain'}
8577 Prediction of transmembrane helices in proteins by TMHMM. Transmembrane helices 1 \N
8578 Protein domains and motifs in the SMART database. SMART domains 1 {'type' => 'domain'}
8579 Protein domains and motifs from the PROSITE profiles database are aligned to the genome. PROSITE profiles 1 {'type' => 'domain'}
8580 Annotation produced by the Ensembl genebuild. Ensembl 1 {'multi_name' => 'GENCODE 24 Comprehensive gene set','colour_key' => '[biotype]','caption' => 'Genes (Comprehensive set from GENCODE 24)','name' => 'Comprehensive Gene Annotations from GENCODE 24','label_key' => '[biotype]','default' => {'MultiTop' => 'gene_label','contigviewbottom' => 'transcript_label','MultiBottom' => 'collapsed_label','contigviewtop' => 'gene_label','cytoview' => 'gene_label','alignsliceviewbottom' => 'as_collapsed_label'},'key' => 'ensembl'}
8581 Human cDNAs from NCBI RefSeq and ENA are aligned to the genome using Exonerate. Human cDNAs 0 {'type' => 'cdna'}
8582 Human protein sequences from UniProtKB and NCBI RefSeq are aligned to the genome using GeneWise or Exonerate. Human proteins 0 \N
8583 Protein domains and motifs from the PIR (Protein Information Resource) Superfamily database. PIRSF domain 1 {'type' => 'domain'}
8584 Protein domains and motifs from the PROSITE profiles database are aligned to the genome. PROSITE patterns 1 {'type' => 'domain'}
8585 Protein fingerprints (groups of conserved motifs) are aligned to the genome. These motifs come from the PRINTS database. Prints domain 1 {'type' => 'domain'}
8586 Prediction of coiled-coil regions in proteins by Ncoils. Coiled-coils (Ncoils) 1 \N