Research Areas
Computationally Uncovering Protein Roles
While the genome sequences of many organisms (including mice and humans) are now known, we still know relatively little about the roles played by the genes and proteins encoded within these sequences. Researchers have generated a great deal of data that can potentially shed light on the functions and processes performed by proteins; however, these datasets are generally noisy, heterogeneous, and very large. Our group develops machine learning and data mining techniques that overcome these challenges and applies them to these data in order to form highly confident predictions of protein roles. We then take these predictions back to the lab bench with our collaborators to confirm their validity.
Related publications:
- Hess DC, Myers CL, Huttenhower C, Hibbs MA, Hayes AP, Paw J, Clore JJ, Mendoza RM, Luis B, Nislow C, Giaever G, Costanzo M, Troyanskaya OG, Caudy AA. Computationally driven, quantitative experiments discover genes required for mitochondrial biogenesis. PLoS Genetics, 2009
- Hibbs MA, Myers CL, Huttenhower C, Hess DC, Li K, Caudy AA, Troyanskaya OG. Directing experimental biology: a case study in mitochondrial biogenesis. PLoS Computational Biology, 2009
- Myers CL, Robson D, Wible A, Hibbs MA, Chiriac C, Theesfeld CL, Dolinski K, Troyanskaya OG. Discovery of biological networks from diverse functional genomic data. Genome Biology, 2005
- Hibbs MA, Hess DC, Myers CL, Huttenhower C, Li K, Troyanskaya OG. Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics, 2007
- Guan Y, Ackert-Bicknell CL, Kell B, Troyanskaya OG, Hibbs MA. Functional genomics complements quantitative genetics in identifying disease-gene associations. PLoS Computational Biology, 2010
Search Algorithms and Data Organization
Among our efforts in data mining, our group develops similarity search algorithms in order to investigate biological hypotheses and to create community data resources that are easily searchable. With the huge expansion of data generation in recent years, it has become impossible for researchers to understand, or even examine, all of the publicly available data. Our group develops algorithms and systems designed to organize these data and to provide intuitive, useful interfaces so that researchers can find the information they need.
Related publications:
- Hibbs MA, Hess DC, Myers CL, Huttenhower C, Li K, Troyanskaya OG. Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics, 2007
- Hibbs MA. The Effects of Pre-processing and Parameter Choices on Searches Through Large Gene Expression Data Collections. IEEE Int Conf on Genomic Signal Processing and Statistics (GENSiPs), 2009
Gene Expression Analysis
Gene expression microarray technology has been responsible for much of the functional genomics data generated in recent years. These data promise to help researchers investigate the regulation and transcription of genes, which is vital for understanding the ultimate roles that proteins play within cells as well as for developing diagnostic tests and finding new drug targets. Microarray data can be particularly difficult to analyze and comprehend due to unusual noise characteristics and variation between protocols and technologies. We are developing methods to harness large collections of microarray data and make them more accessible to researchers. Our approaches are also readily adaptable to newer technologies that measure transcription, such as deep sequencing.
Related publications:
- Huttenhower C, Hibbs MA, Myers CL, Troyanskaya OG. A scalable method for integration and functional analysis of multiple microarray datasets. Bioinformatics, 2006
- Hibbs MA, Hess DC, Myers CL, Huttenhower C, Li K, Troyanskaya OG. Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics, 2007
- Huttenhower C, Flamholz AI, Landis JN, Sahi S, Myers CL, Olszewski KL, Hibbs MA, Siemers NO, Troyanskaya OG, Coller HA. Nearest Neighbor Networks: clustering expression data based on gene neighborhoods. BMC Bioinformatics, 2007
- Hibbs MA, Wallace G, Dunham M, Li K, Troyanskaya OG. Viewing the Larger Context of Genomic Data through Horizontal Integration. 11th International Conference Information Visualization (IV07), 2007
- Hibbs MA, Dirksen NC, Li K, Troyanskaya OG. Visualization methods for statistical analysis of microarray clusters. BMC Bioinformatics, 2005
Large-Scale Data Visualization
One of the best ways for researchers to understand their data is to visually inspect them for patterns. However, the scale of genome-wide datasets prevents traditional methods and devices from displaying these data in full. We are developing techniques that use large-scale display devices as well as traditional displays to show researchers the information they need to extract from their data.
Related publications:
- Wallace G, Anshus OJ, Bi P, Chen H, Chen Y, Clark D, Cook P, Finkelstein A, Funkhouser T, Gupta A, Hibbs MA, Li K, Liu Z, Samanta R, Sukthankar R, Troyanskaya OG. Tools and applications for large-scale display walls. IEEE Computer Graphics and Applications, 2005
- Hibbs MA, Wallace G, Dunham M, Li K, Troyanskaya OG. Viewing the Larger Context of Genomic Data through Horizontal Integration. 11th International Conference Information Visualization (IV07), 2007
- Hibbs MA, Dirksen NC, Li K, Troyanskaya OG. Visualization methods for statistical analysis of microarray clusters. BMC Bioinformatics, 2005
- Gehlenborg N, O'Donoghue SI, Baliga NS, Goesmann A, Hibbs MA, Kitano H, Kohlbacher O, Neuweger H, Schneider R, Tenenbaum D, Gavin A. Visualization of omics data for systems biology. Nature Methods, 2010

