NIH Stimulus money: what is in it for Bioinformatics?

Credit: n4sim on Flickr

Following Shirley Wu’s excellent post on the stimulus money at the NIH, I decided to do my bit and post some bioinformatically relevant programs from the Challenge Grants. I am defining bioinformatics rather narrowly here, excluding most biomedical informatics, imaging technologies, clinical data management, etc. Many other topics would involve bioinformatics to some extent; I picked those that seem to have bioinformatics as the core applicable field.

(03) Biomarker discovery and validation

03-AA-103 Molecular Markers of Alcohol-induced Tissue Injury. High-throughput bioinformatic investigations of alcohol’s impact on, for example, the epigenome, transcriptome, proteome, metabolome, etc. are needed to inform our understanding of the mechanisms involved in alcohol-induced injury to adult and fetal tissues. Additionally, these approaches have the potential to reveal candidate biomarkers of alcohol-induced pathology and alcohol exposure. Research is sought to develop diagnostic biomarker signatures of alcohol consumption and alcohol-induced organ damage. Contact: Dr. Dale Hereld, 301-443-0912, or Dr. Kathy Jung, 301-443-8744,

03-CA-106 Utilizing data from the TCGA and TARGET projects to support a large scale bioinformatics effort to identify biomarkers that lie within a pathway or are epi-pathway indicators of tumor formation or progression. Epi-pathway markers lie outside of typical pathways but can be identified as indicators when statistically significant numbers of tumors are characterized as is being done in these projects. Potential markers would be validated under other funding mechanisms. Contact: Dr. Joseph Vockley, 301-435-3881,

(06) Enabling Technologies

06-GM-103* Development of predictive methods for molecular structure, recognition, and ligand interaction. Studies to more precisely predict molecular structure and interactions between molecules and ligands to lay the foundation for a new generation of therapeutics and drug design. Powerful predictive methods will require the acquisition of experimentally derived constraints and breakthrough computational methods. Reliable, high-throughput predictive methods would create a more comprehensive resource for understanding molecular interaction that would eventually replace the use of slower, empirical determinations. Contacts: Dr. Peter Preusch, 301-594-0828; Dr. Warren Jones, 301-594-3827,

06-HG-101* New computational and statistical methods for the analysis of large data sets from next-generation sequencing technologies. The introduction of new methods for DNA sequencing has opened new avenues, including large-scale sequencing studies, metagenomics, transcriptomics, genetic network analysis, and determination of the relationship of sequence variation and phenotypes to disease, to address heretofore unapproachable problems in biomedical research. However, since the large amounts (terabases) of data generated overwhelm existing computational resources and analytic methods, urgent action is needed to enable the translation of this rich new source of genomic information into medical benefit. Contact: Dr. Lisa Brooks, 301-496-7531,
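To give a feel for why terabases of reads overwhelm naive analysis, here is a minimal k-mer counting sketch (my own toy illustration, not anything from the RFA) — exact in-memory counting is trivial for a handful of reads and hopeless at genome-center scale:

```python
from collections import Counter

def count_kmers(reads, k=4):
    """Count all k-mers across a set of sequencing reads.

    Exact in-memory counting like this works for toy data, but on
    terabase-scale read sets the Counter alone would exhaust memory --
    precisely the scaling problem the RFA describes.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

reads = ["ACGTACGT", "CGTACGTA"]
kmer_counts = count_kmers(reads, k=4)
```

Methods in this area replace the naive table with disk-backed, streaming, or probabilistic counting structures.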

06-GM-114 Microbial sequence annotation. Development of new approaches to the rapid and comprehensive annotation of microbial sequences resulting from metagenomics and other high-capacity outputs. Approaches may combine high-throughput experimental methods with innovative data mining algorithms and model building. Contact: Dr. James Anderson, 301-594-0943,
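As a hedged illustration of the simplest annotation step (my toy sketch, not an NIH-endorsed method): scanning a microbial sequence for open reading frames on the forward strand. Real annotation pipelines also handle the reverse strand, alternative start codons, and statistical gene models.

```python
def find_orfs(seq, min_len=6):
    """Return (start, end) coordinates of forward-strand ORFs.

    Scans each of the three reading frames for an ATG start followed
    in-frame by a stop codon (TAA/TAG/TGA). A crude first pass only.
    """
    stops = {"TAA", "TAG", "TGA"}
    orfs = []
    for frame in range(3):
        i = frame
        while i <= len(seq) - 3:
            if seq[i:i + 3] == "ATG":
                for j in range(i + 3, len(seq) - 2, 3):
                    if seq[j:j + 3] in stops:
                        if j + 3 - i >= min_len:
                            orfs.append((i, j + 3))
                        i = j  # resume scanning after this ORF
                        break
            i += 3
    return orfs
```

The data-mining and model-building approaches the program calls for would sit on top of a step like this, assigning function to the predicted genes.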

06-GM-115 High-end computing software. Upgrading of biomedical computing software to high-end computing (HEC). This developmental effort will seek to expand the domain areas to the macromolecular, cell, tissue, organ, whole-organism, and population levels. The program would support grants to upgrade and port software to run and perform experiments on new generation HEC supercomputers. Contact: Dr. Peter Lyster, 301-594-3928,

06-AG-106 New computational and statistical methods for the analysis of large data sets from genome-wide association studies (GWAS) and the use of next-generation sequencing technologies. Develop new tools to enable the translation of vast amounts of genomic information into medical benefit to address large amounts of data generated (e.g., terabases of sequence) that overwhelm existing computational resources and analytic methods. These new approaches include very large-scale genotyping and sequencing studies, metagenomics, transcriptomics, and genetic network analysis. Contact: Dr. Marilyn Miller, 301-496-9350,

06-CA-110 In Silico Cancer Drug Medicine. For years, researchers have explored the myriad wonders of the construction of virtual proteins based on gene and protein sequence alignments and the screening of virtual compounds against a database of drug targets. But as is so often the case in drug development, most of these virtual compounds fail to achieve their lofty goals when synthesized and exposed to the harshness of the real world and the complexity of the human body. This obstacle now negatively impacts translation of new chemical entities into the market. Today, an opportunity exists for the NIH to implement a concerted effort that develops transformative tools (virtual and physical) that test drugs in real-world scenarios, while still in the virtual phase of human physiology. Contact: Dr. Henry Rodriguez, 301-496-1550,

06-CA-111 Integrative analysis of genomic data sets generated by TCGA and TARGET. Methods for the unsupervised analysis of large and varied data sets that are predictive of cancer formation and can determine regulatory points in pathways and circuits. Contact: Dr. Joseph Vockley, 301-435-3881,

06-CA-112 Development of high throughput mechanisms for genomic analysis. This includes methods to improve the throughput of next-gen methods for genomic analysis. Methods could be either laboratory-based or bioinformatics-based improvements, with the goal of decreasing the amount of time it takes to analyze a sample. Contact: Dr. Joseph Vockley, 301-435-3881,

06-DA-109 Developing new computational approaches to Information Retrieval. Development of computational approaches that query multiple data sources and types relevant to basic neuroscience and behavioral addiction research, and that (1) employ or add to the Neurolex vocabulary of the NIH Blueprint Neuroscience Information Framework and (2) focus on enabling user-friendly complex queries based on concepts, anatomical coordinates, and other query parameters relevant to addiction research, returning source data elements directly within a format and context that makes them easily interpreted and accessible. Contact: Dr. Karen Skinner, 301-435-0886,

06-GM-101* Structural analysis of macromolecular complexes. Development of new approaches, technologies, and reagents that would facilitate functional and/or structural analysis of macromolecular complexes. Contacts: Dr. Ravi Basavappa, 301-594-0828; Dr. Laurie Tompkins, 301-594-0943,

06-GM-107 Metal ion binding and function. Development of high-throughput methods for the prediction of metal ion binding and function in proteins at the structural, redox, and/or catalytic levels. Contact: Dr. Vernon Anderson, 301-594-3827,
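For flavor, here is the crudest possible sequence-level pass at this problem (my own hypothetical sketch): flagging C-x-x-C motifs, which occur in some zinc- and iron-coordinating proteins. The structure-, redox-, and catalysis-aware predictors the program asks for are far beyond a regex, of course.

```python
import re

def cxxc_sites(protein_seq):
    """Flag candidate metal-coordinating C-x-x-C motifs in a protein.

    Uses a lookahead so overlapping motifs are all reported. This is a
    crude first-pass filter, not a real metal-binding predictor.
    """
    return [m.start() for m in re.finditer(r"(?=C..C)", protein_seq)]
```

A grant-worthy method would combine such sequence signals with structural context and electrostatics.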

06-HL-108 Develop new informatics techniques for integrative analysis of genomic and epigenomic data. Much of the complex interplay between genetic and environmental risk factors for disease likely occurs through the interactive regulation of gene expression by both genotype and epigenetic markings of the genome. Epigenetic tags such as cytosine methylation and histone tail modifications, which modulate chromatin structure and function thereby affecting gene expression, are associated with environmental toxicities and are well documented. An integrated analysis of gene expression regulation, with simultaneous consideration of both genetic and epigenetic characteristics and of the interactions between these factors, is essential for understanding the complex pathobiology of chronic heart, lung, and blood diseases. New computational and informatics techniques are needed to allow such analyses. Contact: Dr. Robert Smith, 301-435-0202,

(The one below is not exactly bioinformatics, but if you are wasting time on Second Life and Web 2.0 in general, why not do something for humanity and get the NIH to pay you?)

06-RR-101* Virtual environments for multidisciplinary and translational research. Virtual networking environments like Science Commons, Facebook, and Second Life create platforms that can eliminate many barriers in scientific collaborations. These environments integrate fragmented information sources, enable “one-click” access to research resources, and assist in re-use of scientific workflows. Funded projects would develop and implement virtual collaborative environments to facilitate biomedical and translational research, e.g., addressing issues of privacy, technology transfers, and sharing resources. Contact: Dr. Olga Brazhnik, 301-435-0758; NIDA Contact: Dr. David Thomas, 301-435-1313,

(08) Genomics

08-AI-101 Explore the utilization and integration of available “omic” datasets to assess pathogen-host biological networks: Challenge Grant studies in this area can facilitate alternative and innovative approaches for the development of new prevention and therapeutic options. Contact: Dr. Valentina Di Francesco, 301-496-1888,

08-CA-107 Bioinformatic pipeline for rapid genomic analysis. Development of bioinformatics tools and analytical pipelines that will significantly decrease the amount of time it takes to analyze data from TCGA, TARGET and other high throughput projects. Contact: Dr. Joseph Vockley, 301-435-3881,

08-DA-102 Improved Bioinformatics Analysis for Deep Sequencing. The current estimated cost of sequencing an entire human genome is $5,000, and it can be accomplished in a few months. However, current bioinformatic and analytic capabilities are inadequate to analyze the volumes of data that would be generated by deep sequencing many individuals. Specifically, RC1 applications are sought to (1) optimize base calls from next-generation sequencing machines, (2) develop and improve optimal alignment/mapping methods that tackle uncertainty and multiple potential placements, (3) identify methods for SNP calling from multiple reads and multiple samples, (4) identify copy-number variation calling from next-generation sequencing data, and (5) develop automated methods for searching sequence databases that could be used to give probabilities that a variant is real. Contact: Dr. Jonathan D. Pollock, 301-435-1309,
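To make challenge (3) concrete, here is a deliberately naive single-site SNP caller (my own toy, with made-up thresholds): a majority vote over the read bases piled up at one position. Real callers model base qualities, mapping uncertainty, and diploid genotypes probabilistically.

```python
from collections import Counter

def call_site(ref_base, pileup, min_frac=0.8, min_depth=4):
    """Naive SNP call at one genomic position from a pileup of read bases.

    Returns the alternate base if a non-reference allele dominates the
    pileup, else None. Thresholds here are illustrative, not tuned.
    """
    if len(pileup) < min_depth:
        return None  # too little coverage to call anything
    base, n = Counter(pileup).most_common(1)[0]
    if base != ref_base and n / len(pileup) >= min_frac:
        return base
    return None
```

The gap between this and a publishable caller — quality recalibration, multi-sample joint calling, posterior probabilities that a variant is real — is exactly what the RFA wants funded.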

08-DK-102 Beyond GWAS. Use methods such as ‘deep’ sequencing, exon sequencing, high-throughput genotyping and comparative genome hybridization to identify structural variations to pinpoint causal variants associated with NIDDK-relevant diseases or phenotypes, especially those identified in GWAS. Contact: Dr. Rebekah Rasooly, 301-594-6007. (Note: GWAS = Genome-Wide Association Studies.)

08-HL-102 Develop methods to integrate and analyze data from two or more different ‘omics approaches (e.g., GWAS, sequencing, epigenetics, metabolomics, transcriptomics) to capitalize on existing heart, lung, and blood data sets. Considerable resources have been expended in developing ‘omics technologies and applying them to heart, lung, and blood studies. However, the diverse ‘omics technologies each generate multiple data types. Limitations in our ability to combine and analyze data across various ‘omics studies have constrained their use in efforts to elucidate the molecular mechanisms underlying heart, lung, and blood disorders. To obtain full value from those data will require new and improved tools to:

  • Integrate data across two or more ‘omics data sets.
  • Analyze integrated data sets using improved statistical tools and approaches necessary to handle the challenges inherent in the complex integrated data sets.

Contact: Dr. Deborah Applebaum-Bowden, 301-435-0513,
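The first bullet, at its very simplest, is just a join on a shared gene identifier; everything hard lives in the second bullet. A hedged sketch with made-up example values (the gene symbols and numbers are mine, not from any data set):

```python
def join_omics(expression, methylation):
    """Join per-gene expression and promoter methylation by gene symbol.

    An inner join on the shared identifier is the minimal form of
    'integrating two omics data sets'; the statistical modelling the
    RFA asks for would then operate on the joined records.
    """
    return {
        gene: (expression[gene], methylation[gene])
        for gene in expression.keys() & methylation.keys()
    }

expr = {"TP53": 8.1, "MYC": 11.3, "GATA1": 5.2}
meth = {"TP53": 0.12, "MYC": 0.85}
joined = join_omics(expr, meth)  # GATA1 drops out: no methylation value
```

In practice the identifier mapping itself (probes vs. transcripts vs. genes) is a large part of the integration problem.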

08-OD-101 Computational approaches for epigenomic analysis. Technologies such as ultra-high-throughput sequencing allow one to perform epigenomic analyses that were previously impossible. However, one of the major remaining challenges is the lack of effective tools for the analysis and integration of epigenomic data. The development of computational or statistical tools to analyze epigenomic data and integrate it with other data types (multiple epigenetic marks, gene expression data, DNA sequence, comparison to diseased cell types, etc.) would allow epigeneticists to overcome this challenge and make it significantly easier for researchers to investigate the epigenomic basis of disease states. Contact: Dr. Joni Rutter (NIDA), 301-435-0298,

(10) Information Technology for Processing Health Care Data

Everything here is biomedical informatics, some of it bioinformatics; it is hard to say where one ends and the other begins. Look for yourselves. Lots of database integration and other infrastructure work.

If you are doing ontologies, this one seems like a great one to look at:

10-CA-103 Cell Behavior Ontology. Various processes and behaviors of cells are still crudely described and quantified. This makes it difficult to compare and integrate this type of research across various aspects of biological research. Approaches and nomenclatures are desperately needed to better understand, describe, and utilize the vast amount of information about these critical processes in the transforming environment. Contact: Dr. Jerry Li, 301-435-5226,
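If you have not done ontology work, the core idea is a controlled vocabulary of terms linked by typed relations (is_a, part_of, …) that software can reason over. A minimal sketch — the terms below are hypothetical examples of mine, not from any published ontology:

```python
# Toy is_a hierarchy for cell behaviors (hypothetical terms).
IS_A = {
    "apoptosis": "cell death",
    "necrosis": "cell death",
    "cell death": "cell behavior",
    "chemotaxis": "cell migration",
    "cell migration": "cell behavior",
}

def ancestors(term):
    """Walk the is_a chain up to the root, as ontology reasoners do
    when answering queries like 'find all annotations to cell death
    or any of its subtypes'."""
    chain = []
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain
```

Real ontologies (in OWL or OBO format) support multiple parents and many relation types, but the query pattern is the same.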

(15) Translational Science

15-AR-102 Link Genomics, Proteomics, Bioinformatics and Systems Biology To Clinically Relevant Outcomes in Autoimmune Diseases. The objective is to develop new, cost effective and accurate tools that will be used to predict, prevent and monitor autoimmune diseases. Define assays that are effective at monitoring disease activity and that predict the development of specific complications. Contact: Dr. Susana Serrate-Sztein, 301-594-5032,

15-OD-102 Analysis of PubChem data sets. The Molecular Libraries Probe Production Centers Network (MLPCN) implements high throughput screens for a number of biological targets and develops probe compounds from the results. The emphasis is on finding useful probes for a wide variety of targets rather than on an in-depth investigation of each target or the interactions between them. The NIH will support projects based on MLPCN data available through PubChem that combine informatics, chemical synthesis, and non-high-throughput biological testing to enable the scientific community to take full advantage of the ML resources. Contact: Dr. Ajay (NHGRI), 301-594-7108,


2 Responses to “NIH Stimulus money: what is in it for Bioinformatics?”

  1. shwu says:

    I’m sure this will be very helpful for many, thanks for trawling through all those topics, Iddo!

  2. Iddo says:

    Well, I was trawling anyhow, as I am submitting one or two proposals and wanted to see if I didn’t miss any collaborative opportunities. Halfway through cutting and pasting candidate topics from the RFA, I thought it might be a good idea to cut and paste to my blog, rather than just for myself.