ERGO(TM) ERGO FAQs Integrated Genomics
 
 
 


What is ERGO™?
Where was ERGO™ developed?
How is ERGO™ different from other systems?
What can a user do with ERGO™?
How can ERGO™ be used by pharmaceutical, agricultural and other industrial companies?
What type of data can be found in ERGO™?
Can ERGO™ be used to predict functions in eukaryotes?
What is metabolic and functional reconstruction?
What is meant by a chromosomal neighborhood?
What are functional pathways and subsystems?
What is meant by genomic clustering profiles?
How does ERGO™ identify proteins missing from various biochemical pathways?
How do I access an overview of a specific metabolic process?
Can I use ERGO™ if I don't have the complete sequence of my organism?
Can I integrate data from high throughput analysis such as microarrays, proteomics or genome-wide knockouts into ERGO™?
Who should I contact if I have questions concerning ERGO™?


 
What is ERGO™?

The ERGO™ bioinformatics suite is a tool designed for comprehensive genome analysis. ERGO™ integrates data from every level, including genomic data, biochemical data, literature, and high-throughput analysis, into a comprehensive, user-friendly network of metabolic and nonmetabolic pathways. This network is then fed back to improve our functional assignments. Using a cross-genome approach, ERGO™ extracts information from the latest sequence data found in over 600 genomes from both eukaryotes and prokaryotes. In contrast to conventional systems, ERGO™ takes into account similarity, genomic clustering profiles, chromosomal neighborhoods, expression data, and functional subsystems prior to assigning function to an ORF. Examining individual ORFs as well as entire genomes from multiple perspectives in this way allows investigators to see 'connections' and valuable information that are often missed when analyzing data from only one perspective. The cyclical integration of this new information continually advances our knowledge and understanding of the complex dynamics of living organisms.


 
Where was ERGO™ developed?

Emerging from PUMA and WIT, which were developed at Argonne National Laboratory, ERGO™ is a third-generation bioinformatics suite offered exclusively by Integrated Genomics (IG).


 
How is ERGO™ different from other systems?

While conventional systems annotate genes based principally on the results of similarity searches:

  • ERGO™ moves functional annotation to a higher level of accuracy by integrating multiple levels of information.
  • Through the use of both proprietary tools and publicly available resources, a researcher is able to analyze genes (or proteins) in a variety of contexts including but not limited to genomic clustering, chromosomal neighborhoods, expression profiles and functional subsystems.
  • Due to its diversity of data, pathways and analytical tools, ERGO™ is able to predict proteins that are functionally linked in a variety of situations such as metabolic pathways, signaling pathways, protein trafficking, and/or structural complexes.
  • Information is consolidated into an organism-specific profile.
  • ERGO™ is supported by a dynamic, ever-changing database that strengthens with the introduction of every new genome.
  • Each new genome increases the knowledge in three dimensions: through the introduction of new proteins, new pathways and new patterns. This new knowledge is then propagated throughout the database, thereby strengthening the entire system.


 
What can a user do with ERGO™?

  • Determine the function of a given sequence, EST or gene and integrate it into appropriate pathways
  • Investigate DNA sequences, proteins and/or pathways that are common to groups of user-defined organisms
  • Characterize ORFs and ESTs using proprietary tools such as functional coupling, pinned regions and preserved operons
  • Characterize proteins using public tools such as Pfam, COGs, BLAST, PSI-BLAST
  • Compare individual annotations with public databases such as EcoGene, SubtiList, GeneQuiz, DAtA (Database of Arabidopsis thaliana Annotation), TIGR, TrEMBL, Swiss-Prot
  • Update functional assignments
  • Refine nomenclature of genes, proteins and pathways
  • Analyze metabolic and nonmetabolic models of various organisms
  • Identify unique proteins relative to over 600 other genomes
  • Compare whole genomes across user-defined subsets of genomes from over 600 organisms
  • Analyze microarray data in the context of a gene, pathway, subsystem, functional network and the whole genome
  • Analyze genome-wide knockout data in the context of a gene, pathway and functional network


 
How can ERGO™ be used by pharmaceutical, agricultural and other industrial companies?

ERGO™ can be used to:

  • Develop prioritized lists of metabolic and nonmetabolic drug targets through effective data mining
  • Comprehensively evaluate potential drug targets through comparative analysis of functional networks before further evaluation
  • Decrease laboratory costs and increase productivity through use of integrated data analysis prior to bench analysis
  • Reduce time-consuming analytical effort through an integrated, easy-to-use workbench of tools
  • Increase IP potential through the discovery of novel enzymes, proteins and pathways
  • Increase production yields through knowledge-driven metabolic engineering
  • Increase understanding and interpretation of high throughput analysis by integrating its data into functional networks


 
What type of data can be found in ERGO™?

Currently, ERGO™ contains the following types of data:

  • Over 600 Genomes (full and partial)
  • Graphical visualization of Open Reading Frames (ORFs) on a contig
  • Functional assignments of proteins
  • RNA assignments
  • Identification and localization of insertion elements
  • Similarity scores and alignments of individual proteins relative to the rest of the database
  • Incorporation of assigned proteins into metabolic and nonmetabolic pathways
  • Metabolic and nonmetabolic pathways arranged as a hierarchy
  • Metabolic and nonmetabolic pathways integrated into a functional network
  • Operons
  • Ortholog clusters
  • Functional clusters
  • Chemical structures
  • Enzyme records
  • Links to public databases (both organism specific and general)
  • Links to commonly used analytical tools such as Pfam, COGs, BLAST, etc.
  • Integration of microarray data
  • Graphical visualization of whole genome comparisons
  • Summary information and statistics on individual genomes


 
Can ERGO™ be used to predict functions in eukaryotes?

Yes!

One of the strongest aspects of ERGO™ is its diverse database encompassing genomes from all domains of life (eukaryotes, archaea and bacteria). Due to its integration of biochemical pathways with genomic, microarray and wet lab data, ERGO™ visualizes chromosomal, biochemical, structural, or regulatory patterns not seen when viewed from only one perspective. These patterns help identify the 'missing pieces' that cannot be identified with conventional similarity searches. Quite often the patterns are not present in the organism of most interest to the investigator but in a distant relative, either in the same domain or across domains. ERGO™ takes advantage of this information and extrapolates it to genomes that do not exhibit the pattern, most notably the eukaryotes. This approach has been very effective in 'filling in' missing pieces of metabolism in some eukaryotes, since many metabolic pathways are shared between prokaryotes and eukaryotes.


 
What is metabolic and functional reconstruction?

Metabolic and/or functional reconstruction provides a 'blueprint' of both metabolic and nonmetabolic pathways that occur in an organism. This information is presented in a user-friendly, multilayered format with each layer representing a new level of information about the desired organism. At its most detailed level, the pathway is integrated with information about individual proteins participating in the pathways, their genes, and their presence or absence across other organisms in the database. Reconstructions are the foundation for future studies involving fluxes of compounds and energy, enzyme kinetics, regulatory mechanisms and spatial distributions of compounds within the cell. Currently they are used for metabolic engineering and for the discovery of novel enzymes, pathways, and drug targets. Derived from a combination of sequence, biochemical and phenotypic data, metabolic reconstruction includes such things as primary metabolism, secondary metabolism, components of DNA replication, repair, transcription, translation, signal transduction, protein secretion, membrane transport systems, protein processing and trafficking. A graphical presentation of this data can be accessed through the view models page of ERGO™. The ability to create a metabolic reconstruction of an organism in silico can save the researcher months, if not years, of wet-lab experimentation.


 
What is meant by a chromosomal neighborhood?

The chromosomal neighborhood is the region around the gene of interest. ERGO™ is designed to analyze the chromosomal neighborhood in two ways.

  • First, the researcher can examine operons that have been preserved between pairs of genomes. In order for an operon to be 'preserved', the candidates must be bidirectional best hits with the corresponding orthologs in the second genome. In other words, in the operon XY, protein X of genome 1 must be a bidirectional best hit with its ortholog (protein X*) of genome 2. Protein Y must be a bidirectional best hit with its corresponding ortholog in genome 2 (Y*). In addition, X* and Y* must be in close proximity to each other on the same strand of DNA. When genes meet these criteria we consider them to be preserved operons.
  • Second, through the pinned regions tool a researcher can look at orthologs within a 2 to 20 kb region that are common between genomes in the database, irrespective of gene orientation. This graphic representation highlights proteins that are 'tagging along together' within a desired region. Potential orthologs are given the same color, so investigators look for common colors shared across organisms and sometimes domains. With prokaryotes, it is often the case that 'clustering proteins' are functionally linked (or appear in the same biochemical pathway). These tools have proven invaluable when annotating hypothetical proteins and finding missing genes in a pathway.
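
The preserved-operon criteria described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not ERGO™'s proprietary implementation. The function names, the input dictionaries (best hits in each direction, gene locations in genome 2) and the `max_gap` threshold of 300 bp used for 'close proximity' are all assumptions made for the example.

```python
from typing import Dict, Tuple

# Hypothetical gene location record for genome 2: (contig, strand, start, end).
Location = Tuple[str, str, int, int]


def is_bidirectional_best_hit(a: str, b: str,
                              best_1_to_2: Dict[str, str],
                              best_2_to_1: Dict[str, str]) -> bool:
    """True if a's best hit in genome 2 is b and b's best hit in genome 1 is a."""
    return best_1_to_2.get(a) == b and best_2_to_1.get(b) == a


def is_preserved_pair(x: str, y: str,
                      best_1_to_2: Dict[str, str],
                      best_2_to_1: Dict[str, str],
                      locations_2: Dict[str, Location],
                      max_gap: int = 300) -> bool:
    """Check the operon pair XY of genome 1 against genome 2.

    max_gap (in bp) is an assumed stand-in for 'close proximity'.
    """
    x_star = best_1_to_2.get(x)
    y_star = best_1_to_2.get(y)
    if x_star is None or y_star is None:
        return False
    # Both X/X* and Y/Y* must be bidirectional best hits.
    if not is_bidirectional_best_hit(x, x_star, best_1_to_2, best_2_to_1):
        return False
    if not is_bidirectional_best_hit(y, y_star, best_1_to_2, best_2_to_1):
        return False
    if x_star not in locations_2 or y_star not in locations_2:
        return False
    contig_x, strand_x, start_x, end_x = locations_2[x_star]
    contig_y, strand_y, start_y, end_y = locations_2[y_star]
    # X* and Y* must lie on the same contig and the same strand.
    if contig_x != contig_y or strand_x != strand_y:
        return False
    # Distance between the two genes: start of the later one minus end of the earlier one.
    gap = max(start_x, start_y) - min(end_x, end_y)
    return gap <= max_gap


# Toy data: X and Y in genome 1, X* and Y* in genome 2.
best_1_to_2 = {"X": "X*", "Y": "Y*"}
best_2_to_1 = {"X*": "X", "Y*": "Y"}
locations_2 = {"X*": ("contig1", "+", 100, 500), "Y*": ("contig1", "+", 550, 900)}
print(is_preserved_pair("X", "Y", best_1_to_2, best_2_to_1, locations_2))  # True
```

In practice the best-hit tables would come from an all-against-all similarity search between the two genomes; here they are hard-coded so that the logic of the criteria is easy to follow.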


 
What are functional pathways and subsystems?

Just as the use of genome clustering and gene proximity can improve the prediction of protein function, so does the application of functional pathways and subsystems. At Integrated Genomics, a pathway is defined as a set of proteins involved in successive or related biochemical, metabolic, structural or functional reactions. Pathways that have related substrates and/or intermediates are grouped together into a functional subsystem (examples of subsystems include amino acid metabolism, protein secretion, DNA replication and repair). Clustering proteins based on their pathways allows investigators to see complete as well as incomplete pathways. Since these pathways comprise functionally linked proteins, a cluster of proteins based on their subsystem markedly improves the accuracy of our functional assignments.
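
As a rough illustration of how grouping proteins by pathway exposes complete and incomplete pathways, the following Python sketch compares the functional roles each pathway requires with the roles already assigned in a genome. The data layout, the function name `pathway_status` and the role names are hypothetical and chosen only for this example; they do not describe ERGO™'s internal data model.

```python
from typing import Dict, List, Set


def pathway_status(pathways: Dict[str, Set[str]],
                   assigned_roles: Set[str]) -> Dict[str, Dict[str, object]]:
    """For each pathway, report whether all required roles are assigned.

    pathways maps a pathway name to the set of functional roles it requires;
    assigned_roles is the set of roles already assigned to proteins in one genome.
    """
    report: Dict[str, Dict[str, object]] = {}
    for name, required in pathways.items():
        missing: List[str] = sorted(required - assigned_roles)
        report[name] = {"complete": not missing, "missing": missing}
    return report


# Toy example with made-up role names.
pathways = {
    "pathway_A": {"role_1", "role_2", "role_3"},
    "pathway_B": {"role_4", "role_5"},
}
assigned_roles = {"role_1", "role_2", "role_3", "role_4"}

for name, status in pathway_status(pathways, assigned_roles).items():
    print(name, status)
```

A pathway reported as incomplete points to specific roles for which no protein has been assigned yet, which is where the other contextual evidence (clustering, neighborhoods) can be brought in.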


 
What is meant by genomic clustering profiles?

Using ERGO™, a researcher can cluster proteins common to a user-defined subset of genomes from the database, irrespective of their position on the phylogenetic tree. For example, you can find proteins that are common to pathogens but not found in nonpathogenic organisms, or proteins specific to humans, Drosophila, C. elegans, other eukaryotes, archaea or prokaryotes. You can also create a Venn diagram for any subgroup of organisms within the database. For example, a researcher may be interested in proteins that are found in a group of closely related organisms (e.g., the Staphylococci) as well as those proteins that are found in one member but not in the remaining members of the group. This analysis is invaluable when examining candidates for horizontal transfer or proteins involved in pathogenesis, or when generating pools of drug targets or hypothetical proteins.
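
Conceptually, a clustering profile of this kind is a set operation over the protein families present in each genome. The Python sketch below shows the idea under simple assumptions: each genome is represented as a set of protein family identifiers, and the function name `clustering_profile`, the genome labels and the family names are invented for the example.

```python
from typing import Dict, Iterable, Set


def clustering_profile(present_in: Iterable[str],
                       absent_from: Iterable[str],
                       families: Dict[str, Set[str]]) -> Set[str]:
    """Families found in every genome of present_in and in no genome of absent_from."""
    present = list(present_in)
    if not present:
        raise ValueError("at least one genome is required in present_in")
    # Families shared by all genomes in the 'present' group.
    shared = set.intersection(*(set(families[g]) for g in present))
    # Families occurring in any genome of the 'absent' group.
    excluded = set().union(*(families[g] for g in absent_from))
    return shared - excluded


# Toy data: protein families per genome (all names are made up).
families = {
    "pathogen_1": {"fam1", "fam2", "fam3"},
    "pathogen_2": {"fam1", "fam2", "fam4"},
    "nonpathogen": {"fam1", "fam5"},
}

# Common to both pathogens but not found in the nonpathogen.
print(clustering_profile(["pathogen_1", "pathogen_2"], ["nonpathogen"], families))  # {'fam2'}

# Found in one member of a group but not in the other member.
print(clustering_profile(["pathogen_1"], ["pathogen_2"], families))  # {'fam3'}
```

Each region of a Venn diagram for a group of organisms corresponds to one such call with a different split of genomes between `present_in` and `absent_from`.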


 
How does ERGO™ identify proteins missing from various biochemical pathways?

ERGO™'s strength is its ability to identify missing proteins involved in biochemical pathways. Since ERGO™ integrates both genomic and pathway data from over 600 different organisms, investigators see patterns of proteins 'sticking together' that function in the same pathway. ERGO™'s proprietary tools visualize these patterns not only locally within a 20 kb region but also across the entire genome to enable investigators to see 'extra proteins'. Many times these 'extra proteins' are of unknown function or are even misannotated. Further analysis reveals that they have characteristics you would expect to see in the missing enzymes and/or proteins of a pathway, thereby allowing the investigator to hypothesize on the function of these proteins. Notably, the number of patterns increases significantly with the introduction of each new genome into the database.
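
The local part of this idea, looking for unassigned 'extra proteins' that sit within a 20 kb region of genes already assigned to a pathway, can be approximated with the Python sketch below. It is a simplified illustration and not ERGO™'s proprietary method: the `Gene` record, the function name `candidates_near_pathway` and the toy coordinates are assumptions, and a real analysis would also weigh how often the pattern recurs across genomes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Gene:
    gene_id: str
    contig: str
    start: int
    end: int
    pathway: Optional[str] = None  # None means no pathway assignment yet


def candidates_near_pathway(genes: List[Gene], pathway: str,
                            window: int = 20_000) -> List[str]:
    """Unassigned genes that fit in a `window`-bp region together with a pathway gene."""
    anchors = [g for g in genes if g.pathway == pathway]
    hits: List[str] = []
    for g in genes:
        if g.pathway is not None:
            continue  # only genes without an assignment are candidates
        for a in anchors:
            # Both genes must be on the same contig and span at most `window` bp.
            span = max(g.end, a.end) - min(g.start, a.start)
            if g.contig == a.contig and span <= window:
                hits.append(g.gene_id)
                break
    return hits


# Toy genome: one pathway gene and three genes of unknown function.
genes = [
    Gene("known_1", "contig1", 1_000, 2_000, pathway="pathway_A"),
    Gene("hypo_1", "contig1", 2_500, 3_500),    # close to known_1
    Gene("hypo_2", "contig1", 50_000, 51_000),  # too far away
    Gene("hypo_3", "contig2", 1_500, 2_500),    # different contig
]
print(candidates_near_pathway(genes, "pathway_A"))  # ['hypo_1']
```

The genes returned are only candidates; as the text notes, further analysis is needed before proposing that one of them fills a missing step in the pathway.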


 
How do I access an overview of a specific metabolic process?

If you are looking for an overview of the metabolism of a specific organism, go to the view models page. From there, you can access this information in two ways:

  • graphically, by clicking on the designated square next to the organism's name or designation
  • in a tabular format by clicking on the appropriate square next to the organism of interest.

If you want to find information about the presence or absence of enzymes of a specific pathway in multiple organisms, then you need to go through the specific pathway page via either general search or through an overview. From here you can see organisms in which the pathway has been asserted (or suggested to occur in) by the plus sign next to the specific organisms. Specific information about the pathway in all organisms is found under the see assertions button on the top right hand portion of the pathway page.


 
Can I use ERGO™ if I don't have the complete sequence of my organism?

Yes!
Since ERGO™ consolidates information from many organisms and applies it across genomes, it can be used to annotate incomplete as well as complete genomes. Most of the proprietary and public tools can be used on any data set entered into ERGO™. Typically we require a minimum of 100 kb of total sequence prior to incorporation into ERGO™. The completeness of the reconstruction naturally depends on the amount of data available for the given organism.


 
Can I integrate data from high throughput analysis such as microarrays, proteomics or genome-wide knockouts into ERGO™?

Yes!
We have written algorithms that allow users to transfer data from common microarray analysis packages into ERGO™ and vice versa. We are currently working on algorithms to integrate proteomics data as well.


 
Who should I contact if I have questions concerning ERGO™?

  • For questions concerning pricing and availability, contact:
    Headquarters
    Chicago, IL USA
    Phone: 1-312-491-0846
  • For questions about using ERGO™ in general, please contact Dr. Anamitra Bhattacharyya


 
Last update:  June 25, 2007
 
ERGO family:
 
 ERGO™ bioinformatics suite is the property of Integrated Genomics Inc.

IG provides access to ERGO™ through a fee-based subscription.
 
  Publicly available version of ERGO™

The server and the associated data are free of charge for academic and non-commercial use only.
 
  Publicly available version of ERGO™ by University of Minnesota with Integrated Genomics