Allen Brain Atlas | Allen Institute for Brain Science, Seattle, WA, USA | The Allen Brain Atlas (ABA) is an interactive, genome-wide image database of gene expression in the mouse brain. A combination of RNA in situ hybridization data, detailed Reference Atlases and informatics analysis tools are integrated to provide a searchable digital atlas of gene expression.
EBI-ArrayExpress Database | European Bioinformatics Institute (EBI) | ArrayExpress is a public repository for microarray data, which is aimed at storing well annotated data in accordance with The Microarray Gene Expression Data (MGED) Society recommendations.
Biomolecular Interaction Network Database | Blueprint and Mt. Sinai Hospital, Toronto | BIND is a collection of records documenting molecular interactions. The contents of BIND include high-throughput data submissions and hand-curated information gathered from the scientific literature.
Basic Local Alignment Search Tool | National Center for Biotechnology Information, Bethesda, MD, USA | BLAST finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.
NCBI CancerChromosomes | National Center for Biotechnology Information, Bethesda, MD, USA | Three databases, the NCI/NCBI SKY/M-FISH & CGH Database, the NCI Mitelman Database of Chromosome Aberrations in Cancer, and the NCI Recurrent Aberrations in Cancer, are integrated into NCBI's Entrez system as Cancer Chromosomes.
Cancer Genome Anatomy Project | National Cancer Institute, Bethesda, MD, USA | The NCI's Cancer Genome Anatomy Project seeks to determine the gene expression profiles of normal, precancer, and cancer cells, leading eventually to improved detection, diagnosis, and treatment for the patient. The website's interconnected modules provide access to all CGAP data, bioinformatic analysis tools, and biological resources, allowing the user to rapidly find "in silico" answers to biological questions.
ChemSpider | Royal Society of Chemistry | ChemSpider is a free service providing access to millions of chemical structures and integration with a multitude of other online services.
Chemical Entities of Biological Interest Database | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | ChEBI is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds. The term ‘molecular entity’ refers to any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity. The molecular entities in question are either products of nature or synthetic products used to intervene in the processes of living organisms.
ClustalW | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | Clustal W is a general purpose multiple sequence alignment program for DNA or proteins. It produces biologically meaningful multiple sequence alignments of divergent sequences. It calculates the best match for the selected sequences, and lines them up so that the identities, similarities and differences can be seen. Evolutionary relationships can be seen by viewing Cladograms or Phylograms.
Compute pI/Mw Tool | Swiss Institute of Bioinformatics, Geneva, Switzerland | Compute pI/Mw is a tool which allows the computation of the theoretical pI (isoelectric point) and Mw (molecular weight) for a list of Swiss-Prot and/or TrEMBL entries or for a user-entered sequence.
Enzyme Catalytic Site Atlas | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | The Catalytic Site Atlas (CSA) is a resource of catalytic sites and residues identified in enzymes using structural data.
The Dali Server | EMBL EBI and Institute of Biotechnology, University of Helsinki, Finland | The Dali server is a network service for comparing protein structures in 3D.
Human Single Nucleotide Polymorphism Database | National Center for Biotechnology Information, Bethesda, MD, USA | The NCBI website features a listing of known single nucleotide polymorphisms in human genes.
Emap Edinburgh Mouse Atlas Project | United Kingdom MRC Human Genetics Unit in Edinburgh, Scotland | The emap Atlas is a digital Atlas of mouse embryonic development. It features a series of interactive three-dimensional computer models of mouse embryos at successive stages of development with defined anatomical domains linked to a stage-by-stage ontology of anatomical names.
ENSEMBL | European Molecular Biology Lab - European Bioinformatics Institute and the Sanger Institute | Ensembl is a joint project between EMBL - European Bioinformatics Institute (EBI) and the Wellcome Trust Sanger Institute (WTSI) to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes.
NCBI Entrez | National Center for Biotechnology Information, Bethesda, MD, USA | This web portal provides an extensive list of websites that offer bioinformatics tools and databases.
NCBI Entrez Conserved Domain Database | National Center for Biotechnology Information, Bethesda, MD, USA | The Entrez CDD can be searched to identify proteins that share a conserved interaction domain. It features a collection of multiple sequence alignments for ancient domains and full-length proteins.
NCBI Entrez Gene Database | National Center for Biotechnology Information, Bethesda, MD, USA | Entrez Gene is a searchable database of genes, from RefSeq genomes, and defined by sequence and/or located in the NCBI Map Viewer. It does not include all known or predicted genes; instead Entrez Gene focuses on the genomes that have been completely sequenced, that have an active research community to contribute gene-specific information, or that are scheduled for intense sequence analysis. The content (nomenclature, map location, gene products and their attributes, markers, phenotypes, and links to citatio
NCBI Entrez Nucleotides Database | National Center for Biotechnology Information, Bethesda, MD, USA | The Entrez Nucleotides database is a collection of sequences from several sources, including GenBank, RefSeq, and PDB. The number of bases in these databases (well over 130 billion) continues to grow at an exponential rate.
NCBI Entrez Single Nucleotide Polymorphism Database | | Entrez SNP permits searches for SNPs in genes.
NCBI Entrez Protein Database | National Center for Biotechnology Information, Bethesda, MD, USA | The protein entries in the Entrez search and retrieval system have been compiled from a variety of sources, including SwissProt, PIR, PRF, PDB, and translations from annotated coding regions in GenBank and RefSeq.
Exact Antigen Antibody Resource | | A curated database of 22,000 monoclonal antibody products, hundreds of thousands of product information pages submitted by reagent providers, millions of webpages selected from all 700 reagent suppliers and over 200,000 bioscience-related websites. Antibodies are organized according to genes, species, reagent types (antibodies, phospho-specific antibodies, recombinant proteins, ELISA, siRNA, etc.), patents, and researchers.
GENATLAS | Paris René Descartes University Centre of Bioinformatics, Paris, France | GENATLAS contains relevant information with respect to gene mapping and genetic diseases.
GeneCards | Weizmann Institute of Science and Xennex, Inc. | GeneCards is a database of human genes, their products and their involvement in diseases. It offers concise information about the functions of human genes.
Gene Tests | National Institutes of Health, Bethesda, MD, USA | The GeneTests Web site is a publicly funded medical genetics information resource developed for physicians, other healthcare providers, and researchers. It contains online publication of expert-authored disease reviews and other educational resources such as a glossary.
Gene Expression Omnibus | National Center for Biotechnology Information, Bethesda, MD, USA | Gene Expression Omnibus is a gene expression/molecular abundance repository supporting MIAME-compliant data submissions, and a curated online resource for gene expression data browsing, query and retrieval.
Gene Ontology | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | The Gene Ontology (GO) Consortium is an international collaboration among scientists at various biological databases, with an Editorial Office based at the EBI. GOA is a project that aims to provide assignments of gene products to the Gene Ontology (GO) resource. The objective of GO is to provide controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products. These terms are to be used as attributes of gene products by collaborating database
Gene Ontology Annotation (GOA) Database | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | The GOA project aims to provide high-quality Gene Ontology (GO) annotations to proteins in the UniProt Knowledgebase (UniProtKB) and International Protein Index (IPI) and is a central dataset for other major multi-species databases, such as Ensembl and NCBI.
NCBI Homologene | National Center for Biotechnology Information, Bethesda, MD, USA | HomoloGene is a system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes.
Human Protein Atlas | Albanova University Center at the Royal Institute of Technology (KTH, Stockholm) and the Rudbeck Laboratories (Uppsala University), Sweden | The HPR atlas has been created to show the expression and localization of proteins in a large variety of normal human tissues and cancer cells. The data is presented as high resolution images representing immunohistochemically stained tissue sections. Available proteins (genes) can be reached through a specific search (by gene/protein name/id or classification, such as kinase or protease) or by browsing the individual chromosomes.
Human Protein Reference Database | Pandey Lab and Institute of Bioinformatics at Johns Hopkins University, Baltimore, MD, USA | HPRD represents a centralized platform to visually depict and integrate information pertaining to domain architecture, post-translational modifications, interaction networks and disease association for each protein in the human proteome.
Human Unidentified Gene-Encoded Large Proteins Database | Kazusa DNA Research Institute, Chiba, Japan | The HUGE protein database focuses on the analysis of cDNA clones encoding particularly large proteins (>50 kDa). The database contains various types of information derived from the predicted primary structure data of newly identified human proteins.
HUPO Human Protein Atlas | Human Proteome Organization Initiative Based in Stockholm and Uppsala, Sweden | The human protein atlas displays expression and localization of proteins in a large variety of normal human tissues and cancer cells. The data is publicly available and presented as high resolution images of immunohistochemically stained tissues and cell lines with over 1500 antibodies and over 1.2 million images.
Human Prenylated Proteins Database | The Bioinformatics Group at the IMP Vienna, Austria | HumanPRENbase is a derivative of PRENbase with a focus on human prenylated proteins. In PRENbase, paralogous proteins are clustered together in their respective larger family when they are highly similar to each other. A scheme of reciprocal BLASTs was employed in order to identify the true orthologues of a set of prenylated human proteins. 238 individual human proteins and their orthologues form the clusters in HumanPRENbase.
InterAction | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | IntAct is a protein interaction database and analysis system. It provides a query interface and modules to analyse interaction data.
Integrated Relational Enzyme Database | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK and Swiss Institute of Bioinformatics (SIB) | The Integrated relational Enzyme database (IntEnz) contains enzyme data approved by the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB) on the nomenclature and classification of enzyme-catalysed reactions.
Integrated Resource of Protein Domains and Functional Sites | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | InterPro is a database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
Integrated Protein Classification Database | Protein Information Resource located at Georgetown University Medical Center | iProClass is an integrated resource that provides comprehensive family relationships and structural/functional features of proteins.
Jena Centre of Bioinformatics Protein-Protein Interaction Website | Jena Centre for Bioinformatics, Jena, Germany | N/A
NIH Mammalian Gene Collection | National Institutes of Health, Bethesda, MD, USA | The Mammalian Gene Collection (MGC) provides full-length open reading frame (FL-ORF) clones for human, mouse, rat, and cow genes.
Mouse Gene Expression Information Resource (MGEIR) | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | N/A
Molecular Interactions Database | Molecular Genetics Group at the University of Rome Tor Vergata, Rome, Italy | MINT is a relational database designed to store interactions between biological molecules. MINT focuses on experimentally verified protein interactions with special emphasis on proteomes from mammalian organisms.
Mouse Atlas of Gene Expression | British Columbia Cancer Agency in Vancouver, Canada | The Atlas shows the normal state for many tissues by determining, in a comprehensive and quantitative fashion, the number and identity of genes expressed throughout development. Its scope encompasses multiple stages of development, from the single-cell zygote to the adult, and includes an extensive initial collection of 200 tissues.
Ontology Lookup Service | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | The Ontology Lookup Service is a spin-off of the PRIDE project, which required a centralized query interface for ontology and controlled vocabulary lookup. While many of the ontologies queryable by the OLS are available online, each has its own query interface and output format.
Online Mendelian Inheritance in Man | NCBI - Johns Hopkins University, Baltimore, MD, USA | OMIM is a catalogue of human genes and genetic disorders, authored and edited by staff and scientists at Johns Hopkins University. The database is richly populated with links to PubMed citations, sequence data, and genetic information connected with the gene in question. Information on knockout organisms, overexpression studies and expression in normal and disease states can also be found here, depending on the gene.
Protein and Associated Nucleotide Domains with Inferred Trees | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | PANDIT is a collection of multiple sequence alignments and phylogenetic trees covering many common protein domains.
Protein Data Bank | Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, Piscataway, NJ, USA | PDB is the single worldwide repository for the processing and distribution of 3-D biological macromolecular structure data.
PDBsum | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | PDBsum provides an at-a-glance overview of every macromolecular structure deposited in the Protein Data Bank (PDB), giving schematic diagrams of the molecules in each structure and of the interactions between them.
Protein families database of alignments and HMMs | Wellcome Trust Sanger Institute, Hinxton, UK | Pfam is a large collection of over 8000 multiple sequence alignments and hidden Markov models covering many common protein domains and families. Each family in Pfam can be examined for multiple alignments, protein domain architectures and structures and species distribution.
Phosphorylation Site Database | Dept. of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany | PHOSIDA (PHOsphorylation SIte Database) allows retrieval of phosphorylation data of any protein of interest. It lists phosphorylation sites associated with particular projects and proteomes or, alternatively, displays phosphorylation sites found for any protein or protein group of interest.
Phospho.ELM | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK, Cellzome and others | The Phospho.ELM database contains a collection of experimentally verified serine, threonine and tyrosine phosphorylation sites in eukaryotic proteins. The entries, manually annotated and based on scientific literature, provide information about the phosphorylated proteins and the exact position of known phosphorylated instances.
PhosphoSite | Cell Signaling Technology Company, Beverly, MA, USA | PhosphoSite contains a comprehensive list of known human and mouse protein phosphorylation sites with extensive supporting information.
Protein Information Resource | Georgetown University Medical Center, Washington, DC, USA | The Protein Information Resource (PIR) is an integrated public bioinformatics resource to support genomic and proteomic research, and scientific studies. PIR has made many protein databases and analysis tools freely accessible to the scientific community.
Protein Mutant Database | Center for Information Biology and DNA Data Bank of Japan National Institute of Genetics, Yata, Japan | The PMD covers over 81,000 natural as well as artificial mutants of proteins, including random and site-directed ones, for all proteins except members of the globin and immunoglobulin families. The PMD is based on literature, not on proteins. That is, each entry in the database corresponds to one article which may describe one, several or a number of protein mutants.
Prenylated Proteins Database | The Bioinformatics Group at the IMP Vienna, Austria | PRENbase is an annotated database of known and predicted prenylated proteins. Homologous proteins are merged into clusters. This search interface is designed to allow sophisticated queries for the experimental status of the modification (known/predicted...), exclusive or shared types of modifying enzymes (FT, GGT1, GGT2) as well as for evolutionary conservation by constraining the taxonomic distribution within these clusters or for single sequences.
Proteomics IDEntifications Database | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK and Ghent University in Belgium | The PRIDE (PRoteomics IDEntifications) database is a centralized, standards-compliant, public data repository for proteomics data. It has been developed to provide the proteomics community with a public repository for protein and peptide identifications together with the evidence supporting these identifications. PRIDE is able to capture details of post-translational modifications coordinated relative to the peptides in which they have been found.
Protein Function Database | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | Identifies the likely biochemical function of a protein from its three-dimensional structure.
Database of Protein Families and Domains | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | Prosite is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family (if any) a new sequence belongs. Users can enter a protein sequence or find out the characteristic motifs of domains.
NCBI Reference Sequence Project | National Center for Biotechnology Information, Bethesda, MD, USA | RefSeq is a subset of GenBank records that represent reference sequence standards. RefSeq standards are a curated reference data set that avoids the redundancy found in GenBank and provides a stable reference point for mutation studies, expression analysis and homology investigations. RefSeq standards are found for over 3 million proteins, translated mRNAs and genomic data from over 4000 organisms.
ScanSite | Massachusetts Institute of Technology, Beth Israel Deaconess Medical Center, St. Jude Children's Research Hospital, MA, USA | Scansite searches for motifs within proteins that are likely to be phosphorylated by specific protein kinases or bind to domains such as SH2 domains, 14-3-3 domains or PDZ domains.
Simple Modular Architecture Research Tool | European Molecular Biology Lab - Heidelberg, Germany | SMART (a Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. More than 500 domain families found in signalling, extracellular and chromatin-associated proteins are detectable. These domains are extensively annotated with respect to phyletic distributions, functional class, tertiary structures and functionally important residues. Each domain found in a non-redundant protein database as well as search param
SWISS-2DPAGE Two-dimensional polyacrylamide gel electrophoresis database | Swiss Institute of Bioinformatics, Geneva, Switzerland | Swiss 2D-PAGE contains data on proteins identified on various 2-D PAGE and SDS-PAGE reference maps. Users can locate proteins on the 2-D PAGE maps or display the region of a 2-D PAGE map where one might expect to find a protein from Swiss-Prot.
Swiss Protein Database | Swiss Institute of Bioinformatics, Geneva, Switzerland | SwissProt is a curated protein sequence database provided by the ExPASy (Expert Protein Analysis System) proteomics server of the Swiss Institute of Bioinformatics (SIB). It strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and a high level of integration with other databases.
The Human Gene Index Project | The Institute for Genomic Research, Rockville, MD, USA | The Gene Index Project uses the available EST and gene sequences, along with the reference genomes wherever available, to provide an inventory of likely genes and their variants and annotates these with information regarding the functional roles played by these genes and their products.
TIGR Protein Families | The Institute for Genomic Research, Rockville, MD, USA | TIGRFAMs are protein families based on Hidden Markov Models (HMMs). Use this page to see the curated seed alignment for each TIGRFAM, the full alignment of all family members and the cutoff scores for inclusion in each of the TIGRFAMs. Also use this page to search through the TIGRFAMs and HMMs for text in the TIGRFAMs Text Search or search for specific sequences in the TIGRFAMs Sequence Search.
Transcriptions - The Music of Protein Sequences | Texas Wesleyan University, Fort Worth, TX, USA | “A Protein Primer” makes music from protein sequences by assigning increasing pitch to amino acids in order of increasing hydrophobicity, with the duration of each note set by the number of codons that code for the amino acid.
Universal Protein Resource | European Molecular Biology Lab - European Bioinformatics Institute, Hinxton, UK | UniProt (Universal Protein Resource) is the central hub for the collection of protein sequences and functional information on proteins, with accurate, consistent, and rich annotation, including the amino acid sequence, protein name or description, taxonomic data and citation information. It is a central repository of protein sequence and function created by joining the information contained in UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, and PIR.
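
The Compute pI/Mw entry in the list above describes calculating a theoretical molecular weight from a user-entered sequence. The Python sketch below is a minimal illustration of that kind of calculation, not the tool's actual implementation. It assumes standard average residue masses, adds one water molecule for the free termini, handles only the 20 standard amino acids, and does not attempt the pI calculation. The function name `molecular_weight` and the table name `RESIDUE_MASS` are hypothetical.

```python
# Average residue masses in daltons for the 20 standard amino acids
# (mass of each amino acid minus one water, as found in a peptide chain).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}

# One water is added back for the free N- and C-termini of the chain.
WATER_MASS = 18.01528


def molecular_weight(sequence: str) -> float:
    """Return the theoretical average molecular weight (Da) of a protein sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    total = WATER_MASS
    for residue in seq:
        if residue not in RESIDUE_MASS:
            # Non-standard or ambiguous codes (B, Z, X, ...) are out of scope here.
            raise ValueError(f"unsupported residue: {residue!r}")
        total += RESIDUE_MASS[residue]
    return total


if __name__ == "__main__":
    # Example: a short hypothetical peptide.
    print(f"{molecular_weight('GASPVT'):.2f} Da")
```

Because the result is a plain sum over residues, the same function can be mapped over a list of sequences retrieved from Swiss-Prot or TrEMBL; fetching those entries is left out of this sketch.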