Influenza Research Database (IRD)
The Influenza Research Database (IRD) is a US NIH/NIAID-funded, freely available online bioinformatics resource for influenza virus data search, analysis and visualization. IRD receives ~1484 sessions per week (Google Analytics, 2016 average) and has been cited in more than 568 scientific publications.
Recently, several new influenza virus sequence annotation tools have been added to IRD, including: (1) a sequence auto-curation pipeline that checks for potential sequencing artifacts, (2) an H1 clade classification tool based on the USDA/OFFLU swine H1 classification scheme, (3) an H5 clade classification tool based on the CDC/WHO highly pathogenic avian influenza A H5N1 classification scheme, (4) a Sequence Feature Phenotypic Variant Type annotation tool based on the CDC H5 Genetic Changes Inventory, and (5) an HA subtype numbering conversion tool based on the cross-subtype HA numbering scheme proposed by Burke & Smith (PMID: 25391151). Additionally, a JavaScript-based tree viewer, Archaeopteryx-js, and a user metadata capture utility have been developed and integrated into IRD.
Using the new annotation tools, influenza sequences in IRD have been comprehensively curated and annotated. Specifically, potential sequencing artifacts are flagged, and users can choose whether and how to include problematic sequences in their analyses. H1 and H5 sequences are annotated with their respective clade assignments. The presence or absence of Phenotypic Variant Types, in which particular sequence substitutions are predicted to give rise to phenotypic effects, has been computed for all influenza sequences in IRD. IRD users can also annotate their own sequences using any of these tools.
In addition to providing these expanded sequence annotations, IRD now also facilitates comparison of homologous residues between different HA subtypes using the new HA numbering conversion tool. This tool automatically converts the coordinates of user-provided and IRD-supported sequences into the numbering scheme defined by any selected reference strain.
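As an illustration of the general idea behind coordinate conversion, the following minimal Python sketch maps residue positions in a query sequence onto the numbering of a reference sequence through a pairwise alignment. It is not the IRD implementation: the function name `build_position_map` and the toy aligned fragments are hypothetical, and the alignment itself is assumed to have been computed beforehand.

```python
from typing import Dict


def build_position_map(aligned_query: str, aligned_ref: str, gap: str = "-") -> Dict[int, int]:
    """Map 1-based query positions to 1-based reference positions.

    Both inputs are rows of the same pairwise alignment (equal length,
    gaps marked with `gap`). Query residues aligned to a gap in the
    reference have no counterpart and are left out of the map.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")

    mapping: Dict[int, int] = {}
    q_pos = 0  # residues seen so far in the query
    r_pos = 0  # residues seen so far in the reference
    for q_res, r_res in zip(aligned_query, aligned_ref):
        if q_res != gap:
            q_pos += 1
        if r_res != gap:
            r_pos += 1
        if q_res != gap and r_res != gap:
            mapping[q_pos] = r_pos
    return mapping


# Toy aligned fragments (hypothetical, not real HA sequences).
query = "MKTI-ALSY"
ref = "MK-IIALSY"
position_map = build_position_map(query, ref)
print(position_map.get(4))  # reference position of query residue 4
print(position_map.get(3))  # None: query residue 3 is an insertion
```

Residues that fall in an insertion relative to the reference return no mapping here; a real tool would need a convention for numbering such positions.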
Finally, IRD supports metadata-based comparative genomic analysis, such as phylogenetic tree coloring based on metadata values and the metadata-driven Comparative Analysis Tool for Sequences (meta-CATS). Using the new Archaeopteryx-js tree viewer and the metadata capture utility, users can now upload their own sequence data with associated metadata to their personal Workbench space, and then analyze and visualize their sequences and metadata alongside IRD data using these analysis tools.
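To make the notion of a metadata-driven sequence comparison concrete, the sketch below groups aligned sequences by a metadata value and computes a Pearson chi-square statistic for one alignment column, so that columns whose residue distribution differs between groups stand out. This is an assumed, simplified approach for illustration only; the abstract does not state which statistic meta-CATS uses, and the function names and toy records are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def column_counts(records: Iterable[Tuple[str, str]], column: int) -> Dict[str, Counter]:
    """Count residues at one 0-based alignment column, per metadata group."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for group, aligned_seq in records:
        counts[group][aligned_seq[column]] += 1
    return counts


def chi_square_statistic(counts: Dict[str, Counter]) -> float:
    """Pearson chi-square statistic for a group-by-residue contingency table."""
    groups = list(counts)
    residues = sorted({res for c in counts.values() for res in c})
    total = sum(sum(c.values()) for c in counts.values())

    stat = 0.0
    for group in groups:
        row_total = sum(counts[group].values())
        for res in residues:
            col_total = sum(counts[g][res] for g in groups)
            expected = row_total * col_total / total
            observed = counts[group][res]  # Counter gives 0 if absent
            stat += (observed - expected) ** 2 / expected
    return stat


# Toy records: (metadata value, aligned sequence). Hypothetical data.
records = [
    ("avian", "MKA"),
    ("avian", "MKA"),
    ("human", "MKT"),
    ("human", "MKT"),
]
for col in range(3):
    print(col, chi_square_statistic(column_counts(records, col)))
```

In this toy data only the last column separates the two groups, so it is the only one with a non-zero statistic. Turning the statistic into a p-value would additionally require the chi-square distribution with the appropriate degrees of freedom.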
IRD provides comprehensive, enriched influenza virus sequence annotations and supports custom sequence annotation, analysis and visualization as part of its mission to facilitate the research and development of diagnostics, prophylactics and therapeutics for influenza viruses.
Funding
This work is funded by the National Institute of Allergy and Infectious Diseases (NIH/DHHS) under contract no. HHSN272201400028C.
