Key Biological Questions that the CMI-Flu project plans to address
The central hypothesis of the CMI-Flu (Computational Models of Immunity - Influenza Vaccination) project is that computational models can accurately predict the breadth and durability of influenza vaccine responses by integrating key immunological factors such as innate immune activation, antigen-specific B and T cell responses, prior exposure history, and intrinsic genetic factors.
Specifically, we aim to answer:
- How does pre-vaccination immune state influence the magnitude and durability of antibody responses?
- What factors determine the breadth of protection against drifted influenza strains?
- Can vaccine-specific antibody clonotypes be predicted from the baseline B cell repertoire?
- How do CD4+ T cell responses shape antibody outcomes?
The project will capture both existing influenza vaccine data and newly-generated experimental data, thereby providing a robust foundation for iterative model development. Models will be rigorously evaluated through an annual public prediction contest.
For information about influenza infection and vaccination, please visit the Learn about Flu Page.
Making influenza vaccine data accessible to all
Navigating multiple databases and harmonizing data across studies with different readouts is challenging, yet it offers benefits in terms of model prediction power and robustness. Our goal within CMI-Flu is to simplify access to standardized influenza vaccine data from numerous studies. Alongside our own data, a CMI-Flu central database will feature other public data, harmonised to allow cross-study integration and comparison.
Measuring of durability and breadth
While vaccines for yellow fever or smallpox provide long-term, durable protection from a single shot, current influenza vaccines must consistently be boosted to retain protection. Many influenza studies focus exclusively on the peak response one month post-vaccination as a readout of vaccine immunogenicity. In contrast, our data will assess vaccine-induced antibody immunity at multiple post-vaccination time points including (Figure 1): day 28 - corresponding to the peak response; day 90 - roughly the span of one influenza season, and also a time at which germinal center reactions can still be detected; and day 365 – where detected responses are expected to reflect long-term immune memory. Our primary readout defining durability will be the neutralizing titers of antibodies against the vaccine strains on day 365 - 12 months post-vaccination. As the seasonal influenza vaccine contains four strains (H1N1, H3N2, B Victoria, and B Yamagata), we will consider the durability of each vaccine strain separately. Of note, we will also have HAI titer data for a broader panel of influenza strains on day 365, which will serve as a secondary readout for durability. Given the rapid evolution of influenza viruses, the annually redesigned vaccine should ideally provide extensive cross-protection against drifted viral strains. We will assess the breadth of vaccine-induced immune responses by measuring HAI titers against panels of historical influenza A H1N1 and H3N2 viruses comprising the key lineages from the past decade.
Collecting readouts across the immune response
Influenza (vaccine) exposure history. We will use accurate vaccination records from the past 3 years, clinical questionnaires on infection and vaccination history, and participant age to infer lifetime influenza strain exposure.
Genetic determinants. All subjects will be examined for SNPs using arrays measuring ~1.8 million common human SNPs, to evaluate known SNPs that impact immune responses. We will also perform targeted sequencing to determine the HLA type of each subject, which determines the T cell epitopes available to recognize different viral strains and ultimately drives T cell immunogenicity and cross-reactivity. Finally, we will genotype the IG and TR loci that affect the immune receptor repertoires.
(Innate) Immune system state. All subjects will be characterized in terms of their immune state before and immediately after vaccination (days -14, 0, and 1) to assess the immune environment during vaccine priming and the immediate innate response. We will characterize their transcriptomic state by RNA-Seq, cellular composition by flow cytometry, and cytokine profiles using multiplex flow cytometry.
CD4+ T cell responses. Samples from all subjects will be characterized for their influenza-specific CD4 T cell responses using Activation Induced Marker (AIM) assays against pools of influenza peptides. Different peptide pools will compare responses against conserved epitopes found in recently circulating strains with those unique to the current vaccine strains. We will record the frequency of AIM+ cells along with their phenotype (assessed by surface markers and the expression of cytokines). In addition, we will perform TCR repertoire sequencing on days 0 and 28 to assess how the overall TCR landscape changes with vaccination. Finally, we will perform single-cell RNA- and TCR-Seq on AIM+ cells at days 0 and 28 to determine which TCRs are vaccine-specific, and connect them with transcriptional phenotypes.
Antibody responses that cover breadth and durability. We will perform BCR repertoire sequencing on days 0 and 7 to assess the global changes in the circulating B cell repertoire after vaccination. In combination with single-cell RNA- and BCR-Seq of plasmablasts on day 7 post-vaccination, where the vast majority of circulating cells are expected to be vaccine-specific, we will track influenza-specific clonotypes, their transcriptomic profile, and their maintenance over time. In addition, samples will be characterized for their influenza-specific antibody responses using HAI assays against a panel of ≈5 H1N1, ≈5 H3N2, and ≈5 influenza B strains representing historic viruses circulating since 2009. These HAIs will determine the breadth of the antibody response pre- and post-vaccination. We will utilize neutralization assays (that are more comprehensive but also more resource-intensive) to measure the durability of the antibody response against each vaccine strain.
Data will be collected over a four-year time period, allowing us to capture repeated exposures in the same participants over multiple flu seasons, at critical time points pre- and post-vaccination. Creating this deeply characterized cohort and complementary database of influenza studies will allow us and others to create a broad swath of computational models that will quantify the relative importance of different immune features in predicting vaccine outcomes. All computational models developed for this grant will be judged based on how accurately they can predict breadth and durability in this cohort.
Figure 2. The immune response to influenza vaccination comprises a cascade of events. Each immune process controlling the vaccine response is represented in a distinct color. Assays to measure each process are listed with bullet points.
Generating models of immunity
We will adapt and refine models developed by our center investigators to answer questions relating to influenza immunity, for instance, How does the pre-vaccination immune state drive antibody responses and their durability? What host features dictate the breadth of the antibody response measured by HAI titers? Can vaccine-specific antibody clonotypes be predicted based on the BCR repertoire before vaccination? Can T cell response predictions improve predictions of vaccine-induced antibodies? We will also implement existing published models from outside groups. By integrating models to consider multiple factors influencing the vaccine response, using both ‘black-box’ machine learning models and ‘mechanistic’ models that reflect individual steps in the vaccine response, we aim to generate more accurate and interpretable predictions of individual immune responses. All models will be rigorously tested in open annual prediction contests.
Annual prediction challenge
Computational models capable of forecasting an individual’s immune response to the influenza vaccine can provide mechanistic insights and pave the way for personalized vaccination strategies. We invite researchers to submit models built on the provided data, alongside our own models, to a prediction challenge. The CMI-Flu Prediction Challenge is designed to rigorously evaluate these computational models using independent datasets, mirroring successful strategies employed in areas like protein structure prediction.
This CMI-Flu prediction challenge follows our previous series, the CMI-PB challenge, which focused on Bordetella Pertussis vaccination. We found that the most successful models stood out in their handling of nonlinearities, reducing large feature sets to representative subsets, and advanced data preprocessing. In contrast, we found that models adopted from literature that were developed to predict vaccine antibody responses in other settings performed poorly, reinforcing the need for purpose-built models. Overall, the CMI-PB challenge series demonstrated the value of purpose-generated datasets for rigorous and open model evaluations to identify features that improve the reliability and applicability of computational models in vaccine response prediction. Both the CMI-PB and CMI-Flu challenge series are part of CMI-X.

Building a Collaborative Community for Influenza Research
Through these challenges, the CMI-X prediction series hopes to foster a collaborative research community, addressing challenges and advancing scientific knowledge more rapidly than any individual or research group could achieve alone. Throughout the prediction challenge, we invite discussion of the influenza immunity, data, and modelling processes. Participants will be invited to contribute to future CMI manuscripts, in addition to providing them the freedom to publish their models independently. Reproducible code will accompany every model to promote transparency and provide the groundwork for other researchers to test and further develop these models (complemented by a large resource of potential training data through the central CMI-Flu database) fostering a cycle of scientific advancement.
To target the next generation of biomedical and data scientists, we will also develop open-access teaching materials for reuse by the wider educator community. This will include a lecture on the historical and biological significance of the influenza virus and vaccine development, a hands-on lab to explore the data, a 3- to 4-week-long project tackling the prediction tasks, and a capstone project.
Impact, Future Directions, and Resources
Through this open, transparent, and quantitative process of generating and publishing experimental data, coupled with building and evaluating computational models of influenza vaccination-induced immunity, we aim to measure how well computational models predict vaccine breadth and durability. This work not only helps us to understand the human immune response but also to identify potential vaccine targets and promising candidates, which will further inform vaccine development.
We purposely push to make these models generalizable in the hope that these types of analyses and tools can be extended and applied to other diseases as well, ultimately driving advances that improve public health globally.
Last updated: Sept. 11, 2025, 1:11 p.m.