What is Athena?
Athena is a web-based application that warehouses disparate datatypes related to the control of gene expression.
Accompanying this warehouse is a large set of data visualization, mining, and analysis tools.
These tools are further supplemented through a collaboration with the BBC at the University of Toronto, which supplies gene expression data mining and motif finding.

What can Athena do?
Athena provides several features to enable exploration of the regulatory mechanisms of Arabidopsis gene control.
  • Visualization: The first main tool we provide is visualization of the promoter domains of selected genes. These images indicate each gene's transcript start and translated region, and mark all known putative transcription factor binding sites. Database cross-references for these transcription factors are provided, as well as a statistical test for enrichment of binding activity within the set of selected promoters.

    Genes selected in this tool are exportable to the analysis suite as a genome subset to investigate.
  • Data Mining: The data mining tools in Athena allow for selection of sets of genes based on two different factors.
    • Genes can be selected by specifying a set of binding factors whose putative sites must all be present within the promoter region of each selected gene.
    • Alternatively, genes can be selected using Gene Ontology annotations. Both GO (Gene Ontology) Slim terms and full Gene Ontology terms are available. One can select a set of genes by choosing the union of the genes annotated by a selected set of GO Slim terms or Gene Ontology terms.
    The selected genes' putative binding factors are listed, including enrichment data. Furthermore, enriched presence of Gene Ontology terms is given.
    Genes selected in this tool are exportable to the analysis suite as a genome subset to investigate.
  • Analysis Suite: The analysis suite provides both enhanced data mining tools for selecting genes and several data displays.
    • Genes are selectable based upon genome subsets, shared sets of binding factor sites, Gene Ontology annotations, or any combination of the three. Furthermore, Expression Angler from the BBC at the University of Toronto can select sets of coregulated genes, which can be exported to this tool with one click.
    Several data analysis displays are currently available, with several more forthcoming or in testing.
    • Histograms are available that plot the distribution of binding factor sites for a set of binding factors and selected promoters. This distribution can also be tested for bias versus a random background model.
    • Distributions of individual binding factor sites across a set of selected promoters are also available. These distributions can also be tested for enrichment of binding activity over the background activity observed in the whole genome.
    • Pattern discovery is in the testing phase and will enable determination of enriched pairwise distances between two selected binding factors in a set of selected promoters. Such an enriched pairwise distance is subsequently a basis for putative binding factor structural elements, such as patterns and motifs of binding factor binding sites.
    • Promomer motif discovery is available through an external link that exports either the selected promoters or all found promoters to the Promomer tool provided by the BBC at Toronto. Promomer indicates enriched n-mers identified within a set of promoters.
    • MEME motif finding is in the testing phase. This motif finding will enable a secondary approach to identifying novel binding factor motifs.
    • Motif listing is available, which provides a textual output listing all putative binding sites of all selected binding factors within a set of selected promoters.
    • Frequency data is also available, listing each selected binding factor's frequency and enrichment within the set of selected promoters as well as the background frequency observed genome-wide.
    • Chromosomal plots are forthcoming and will plot the distribution of the selected genes across the 5 Arabidopsis chromosomes and test for statistical enrichment within specific chromosomes as well as positional bias of locations.
    • CpG enrichment plots are also forthcoming and will plot both the expected and observed distributions of transcription factor binding sites within and outside of CpG island regions.
    In addition to the analyses available, subselection of promoter sets is available based upon the presence of a specific transcription factor binding site or Gene Ontology annotation.
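
As an illustration of the selection and enrichment ideas described above, the following is a minimal Python sketch. It is not Athena's code. The gene IDs, the binding factor names, the PROMOTER_SITES table, and the function names are all hypothetical, and the one-sided hypergeometric test is an assumption chosen for illustration, since this page does not state which statistical test Athena uses.

```python
from math import comb

# Hypothetical toy data: gene ID -> binding factors with a putative site in
# that gene's promoter. Illustrative only; this is not Athena's data model.
PROMOTER_SITES = {
    "AT1G01010": {"ABRE", "G-box", "W-box"},
    "AT1G01020": {"ABRE", "G-box"},
    "AT1G01030": {"W-box"},
    "AT1G01040": {"ABRE", "G-box", "MYB"},
    "AT1G01050": {"MYB"},
    "AT1G01060": {"G-box"},
}


def select_genes(sites, required):
    """Return genes whose promoters carry a site for every required factor."""
    required = set(required)
    return sorted(g for g, factors in sites.items() if required <= factors)


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the factor."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def factor_enrichment(sites, selected, factor):
    """Compare a factor's frequency in the selected set with the background.

    Assumption: enrichment is scored with a one-sided hypergeometric test.
    """
    N = len(sites)                                      # background genes
    K = sum(factor in f for f in sites.values())        # background carriers
    n = len(selected)                                   # selected genes
    k = sum(factor in sites[g] for g in selected)       # selected carriers
    return {
        "selected_freq": k / n,
        "background_freq": K / N,
        "p_value": hypergeom_tail(k, n, K, N),
    }


if __name__ == "__main__":
    chosen = select_genes(PROMOTER_SITES, ["ABRE", "G-box"])
    print(chosen)
    print(factor_enrichment(PROMOTER_SITES, chosen, "MYB"))
```

Here the three genes carrying both ABRE and G-box sites are selected, and the MYB site frequency in that set is compared with its frequency across all six toy genes. A real analysis would use the whole-genome promoter set as background and would correct for testing many factors at once.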

Referencing us
The manuscript related to this website can be referenced as:
O'Connor, T.R., Dyreson, C., and Wyrick, J.J. (2005) "Athena: a resource for rapid visualization and systematic analysis of Arabidopsis promoter sequences." Bioinformatics. In press.
The paper is currently in press as an application note in Bioinformatics. When volume and issue number are known, they will be added to the page.

Contacting us
The following are the people connected to the Athena project:

Tim O'Connor: Main developer
John Wyrick: Lab PI
Monique Kohagura: Programmer
For most questions or problems, including questions about the tools and ideas or comments for improving Athena, please contact Tim O'Connor. Contact John Wyrick for questions about future directions and biology. Monique Kohagura is currently in charge of maintaining the Athena code.

Updated: 28 Feb 2005. © 2005 Timothy Ryan O'Connor. All rights reserved.
Contact: John Wyrick, 675 Fulmer Hall, Pullman, WA 99164
Recommended Browser: Mozilla Firefox