<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD with MathML3 v1.2 20190208//EN" "JATS-archivearticle1-mathml3.dtd"> <article xmlns:ali="http://www.niso.org/schemas/ali/1.0/" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.2"><front><journal-meta><journal-id journal-id-type="nlm-ta">elife</journal-id><journal-id journal-id-type="publisher-id">eLife</journal-id><journal-title-group><journal-title>eLife</journal-title></journal-title-group><issn publication-format="electronic" pub-type="epub">2050-084X</issn><publisher><publisher-name>eLife Sciences Publications, Ltd</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="publisher-id">79254</article-id><article-id pub-id-type="doi">10.7554/eLife.79254</article-id><article-categories><subj-group subj-group-type="display-channel"><subject>Research Article</subject></subj-group><subj-group subj-group-type="heading"><subject>Immunology and Inflammation</subject></subj-group></article-categories><title-group><article-title>Memory persistence and differentiation into antibody-secreting cells accompanied by positive selection in longitudinal BCR repertoires</article-title></title-group><contrib-group><contrib contrib-type="author" equal-contrib="yes" id="author-275640"><name><surname>Mikelov</surname><given-names>Artem</given-names></name><contrib-id authenticated="true" contrib-id-type="orcid">https://orcid.org/0000-0002-1629-2373</contrib-id><xref ref-type="aff" rid="aff1">1</xref><xref ref-type="aff" rid="aff2">2</xref><xref ref-type="aff" rid="aff3">3</xref><xref ref-type="fn" rid="equal-contrib1">†</xref><xref ref-type="fn" rid="con1"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" equal-contrib="yes" id="author-275641"><name><surname>Alekseeva</surname><given-names>Evgeniia I</given-names></name><xref ref-type="aff" 
rid="aff1">1</xref><xref ref-type="fn" rid="equal-contrib1">†</xref><xref ref-type="other" rid="fund2"/><xref ref-type="fn" rid="con2"/><xref ref-type="fn" rid="conf2"/></contrib><contrib contrib-type="author" id="author-62208"><name><surname>Komech</surname><given-names>Ekaterina A</given-names></name><xref ref-type="aff" rid="aff2">2</xref><xref ref-type="aff" rid="aff3">3</xref><xref ref-type="fn" rid="con3"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-181947"><name><surname>Staroverov</surname><given-names>Dmitry B</given-names></name><xref ref-type="aff" rid="aff2">2</xref><xref ref-type="fn" rid="con4"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-275642"><name><surname>Turchaninova</surname><given-names>Maria A</given-names></name><xref ref-type="aff" rid="aff2">2</xref><xref ref-type="fn" rid="con5"/><xref ref-type="fn" rid="conf2"/></contrib><contrib contrib-type="author" id="author-237144"><name><surname>Shugay</surname><given-names>Mikhail</given-names></name><contrib-id authenticated="true" contrib-id-type="orcid">https://orcid.org/0000-0001-7826-7942</contrib-id><xref ref-type="aff" rid="aff2">2</xref><xref ref-type="aff" rid="aff3">3</xref><xref ref-type="fn" rid="con6"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-233804"><name><surname>Chudakov</surname><given-names>Dmitriy M</given-names></name><contrib-id authenticated="true" contrib-id-type="orcid">https://orcid.org/0000-0003-0430-790X</contrib-id><xref ref-type="aff" rid="aff1">1</xref><xref ref-type="aff" rid="aff2">2</xref><xref ref-type="aff" rid="aff3">3</xref><xref ref-type="other" rid="fund1"/><xref ref-type="fn" rid="con7"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" id="author-85512"><name><surname>Bazykin</surname><given-names>Georgii A</given-names></name><contrib-id authenticated="true" 
contrib-id-type="orcid">https://orcid.org/0000-0003-2334-2751</contrib-id><xref ref-type="aff" rid="aff1">1</xref><xref ref-type="aff" rid="aff4">4</xref><xref ref-type="fn" rid="con8"/><xref ref-type="fn" rid="conf1"/></contrib><contrib contrib-type="author" corresp="yes" id="author-275643"><name><surname>Zvyagin</surname><given-names>Ivan V</given-names></name><contrib-id authenticated="true" contrib-id-type="orcid">https://orcid.org/0000-0002-1769-9116</contrib-id><email>izvyagin@gmail.com</email><xref ref-type="aff" rid="aff2">2</xref><xref ref-type="aff" rid="aff3">3</xref><xref ref-type="fn" rid="con9"/><xref ref-type="fn" rid="conf1"/></contrib><aff id="aff1"><label>1</label><institution-wrap><institution-id institution-id-type="ror">https://ror.org/03f9nc143</institution-id><institution>Skolkovo Institute of Science and Technology</institution></institution-wrap><addr-line><named-content content-type="city">Moscow</named-content></addr-line><country>Russian Federation</country></aff><aff id="aff2"><label>2</label><institution-wrap><institution-id institution-id-type="ror">https://ror.org/01dg04253</institution-id><institution>Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry</institution></institution-wrap><addr-line><named-content content-type="city">Moscow</named-content></addr-line><country>Russian Federation</country></aff><aff id="aff3"><label>3</label><institution-wrap><institution-id institution-id-type="ror">https://ror.org/018159086</institution-id><institution>Institute of Translational Medicine, Pirogov Russian National Research Medical University</institution></institution-wrap><addr-line><named-content content-type="city">Moscow</named-content></addr-line><country>Russian Federation</country></aff><aff id="aff4"><label>4</label><institution-wrap><institution-id institution-id-type="ror">https://ror.org/013w2d378</institution-id><institution>A.A. 
Kharkevich Institute for Information Transmission Problems of the Russian Academy of Sciences</institution></institution-wrap><addr-line><named-content content-type="city">Moscow</named-content></addr-line><country>Russian Federation</country></aff></contrib-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Kurosaki</surname><given-names>Tomohiro</given-names></name><role>Reviewing Editor</role><aff><institution-wrap><institution-id institution-id-type="ror">https://ror.org/035t8zc32</institution-id><institution>Osaka University</institution></institution-wrap><country>Japan</country></aff></contrib><contrib contrib-type="senior_editor"><name><surname>Diamond</surname><given-names>Betty</given-names></name><role>Senior Editor</role><aff><institution-wrap><institution-id institution-id-type="ror">https://ror.org/05dnene97</institution-id><institution>The Feinstein Institute for Medical Research</institution></institution-wrap><country>United States</country></aff></contrib></contrib-group><author-notes><fn fn-type="con" id="equal-contrib1"><label>†</label><p>These authors contributed equally to this work</p></fn></author-notes><pub-date publication-format="electronic" date-type="publication"><day>15</day><month>09</month><year>2022</year></pub-date><pub-date pub-type="collection"><year>2022</year></pub-date><volume>11</volume><elocation-id>e79254</elocation-id><history><date date-type="received" iso-8601-date="2022-04-08"><day>08</day><month>04</month><year>2022</year></date><date date-type="accepted" iso-8601-date="2022-09-11"><day>11</day><month>09</month><year>2022</year></date></history><pub-history><event><event-desc>This manuscript was published as a preprint at .</event-desc><date date-type="preprint" iso-8601-date="2022-01-01"><day>01</day><month>01</month><year>2022</year></date><self-uri content-type="preprint" 
xlink:href="https://doi.org/10.1101/2021.12.30.474135"/></event></pub-history><permissions><copyright-statement>© 2022, Mikelov, Alekseeva et al</copyright-statement><copyright-year>2022</copyright-year><copyright-holder>Mikelov, Alekseeva et al</copyright-holder><ali:free_to_read/><license xlink:href="http://creativecommons.org/licenses/by/4.0/"><ali:license_ref>http://creativecommons.org/licenses/by/4.0/</ali:license_ref><license-p>This article is distributed under the terms of the <ext-link ext-link-type="uri" xlink:href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution License</ext-link>, which permits unrestricted use and redistribution provided that the original author and source are credited.</license-p></license></permissions><self-uri content-type="pdf" xlink:href="elife-79254-v3.pdf"/><self-uri content-type="figures-pdf" xlink:href="elife-79254-figures-v3.pdf"/><abstract><p>The stability and plasticity of B cell-mediated immune memory ensure the ability to respond to repeated challenges. We have analyzed the longitudinal dynamics of immunoglobulin heavy chain repertoires from memory B cells, plasmablasts, and plasma cells from the peripheral blood of generally healthy volunteers. We reveal a high degree of clonal persistence in individual memory B cell subsets, with inter-individual convergence in memory and antibody-secreting cells (ASCs). ASC clonotypes demonstrate clonal relatedness to memory B cells, and are transient in peripheral blood. We identify two clusters of expanded clonal lineages with differing prevalence of memory B cells, isotypes, and persistence. Phylogenetic analysis revealed signs of reactivation of persisting memory B cell-enriched clonal lineages, accompanied by new rounds of affinity maturation during proliferation and differentiation into ASCs. 
Negative selection contributes to both persisting and reactivated lineages, preserving the functionality and specificity of B cell receptors (BCRs) to protect against current and future pathogens.</p></abstract><kwd-group kwd-group-type="author-keywords"><kwd>memory B cells</kwd><kwd>plasmablasts</kwd><kwd>plasma cells</kwd><kwd>BCR repertoire</kwd><kwd>somatic hypermutation</kwd><kwd>B cell somatic evolution</kwd><kwd>affinity maturation</kwd><kwd>natural selection</kwd></kwd-group><kwd-group kwd-group-type="research-organism"><title>Research organism</title><kwd>Human</kwd></kwd-group><funding-group><award-group id="fund1"><funding-source><institution-wrap><institution>Ministry of Science and Higher Education of the Russian Federation</institution></institution-wrap></funding-source><award-id>075-15-2020-807</award-id><principal-award-recipient><name><surname>Chudakov</surname><given-names>Dmitriy M</given-names></name></principal-award-recipient></award-group><award-group id="fund2"><funding-source><institution-wrap><institution-id institution-id-type="FundRef">http://dx.doi.org/10.13039/501100002261</institution-id><institution>Russian Foundation for Basic Research</institution></institution-wrap></funding-source><award-id>20-34-90153</award-id><principal-award-recipient><name><surname>Alekseeva</surname><given-names>Evgeniia I</given-names></name></principal-award-recipient></award-group><funding-statement>The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.</funding-statement></funding-group><custom-meta-group><custom-meta specific-use="meta-only"><meta-name>Author impact statement</meta-name><meta-value>High degree of clonal persistence and excess of inter-individual convergence are observed in human memory B cell repertoires, along with signatures of both negative and positive selection in most abundant clonal 
lineages.</meta-value></custom-meta></custom-meta-group></article-meta></front><body><sec id="s1" sec-type="intro"><title>Introduction</title><p>B cells play a crucial role in protection from various pathogens and cancer cells as well as regulation of the immune response (<xref ref-type="bibr" rid="bib1">Akkaya et al., 2020</xref>). The structural diversity of B cell receptors (BCRs) is responsible for the B cell-mediated immune system’s capacity to recognize a wide variety of different antigens, and every individual harbors a large pool of naive B cell clones, each with a unique BCR. Antigenic challenge triggers the proliferation and maturation of naive B cells with cognate BCRs, and the resulting progeny comprise a number of cell subsets with differing functions and lifespans. During the affinity maturation process, the initial structure of a given BCR can change at the genomic level as a result of somatic hypermutation (SHM), a process that accompanies B cell proliferation after antigen-specific activation. Cells bearing BCRs with higher affinity to the antigen are favored during the affinity maturation process, and produce signals that stimulate further differentiation and expansion (<xref ref-type="bibr" rid="bib9">De Silva and Klein, 2015</xref>). Another process called class-switch recombination further increases the dimensionality of the BCR space. The five main classes, or isotypes, of antibodies (i.e<italic>.,</italic> IgA, IgD, IgE, IgG, and IgM) have different functions in the immune response (<xref ref-type="bibr" rid="bib44">Stavnezer et al., 2008</xref>; <xref ref-type="bibr" rid="bib46">Vidarsson et al., 2014</xref>), and isotype switching during clonal proliferation can thereby change the functional capabilities of B cells and the antibodies they produce. 
As a consequence, antigen challenge yields a population of clonally related cells with different BCRs and functionalities.</p><p>Recently developed immune repertoire sequencing techniques provide valuable insights into the development and structure of B cell immunity with clonal-level resolution (<xref ref-type="bibr" rid="bib6">Chaudhary and Wesemann, 2018</xref>). For example, the clonal relatedness of B cells in a given lineage, as well as the number and dynamics of B cell groups with distinct antigen specificities, can be studied based on BCR sequence homology. Numerous studies have generated valuable data by analyzing repertoire characteristics such as clonal diversity and tissue distribution, magnitude of clonal expansion and BCR SHM, V(D)J usage frequency and distribution of CDR3 length, and the degree of repertoire convergence and individuality (<xref ref-type="bibr" rid="bib5">Briney et al., 2019</xref>; <xref ref-type="bibr" rid="bib42">Soto et al., 2019</xref>; <xref ref-type="bibr" rid="bib40">Shah et al., 2018</xref>; <xref ref-type="bibr" rid="bib23">Mandric et al., 2020</xref>; <xref ref-type="bibr" rid="bib48">Yang et al., 2021</xref>). 
Studies of BCR repertoires of patients with different diseases have made an important contribution to the understanding of mechanisms of pathology and B cell-mediated immunity (<xref ref-type="bibr" rid="bib2">Bashford-Rogers et al., 2019</xref>; <xref ref-type="bibr" rid="bib30">Nielsen et al., 2020</xref>; <xref ref-type="bibr" rid="bib12">Gaebler et al., 2021</xref>; <xref ref-type="bibr" rid="bib38">Sakharkar et al., 2021</xref>).</p><p>Longitudinal analysis of repertoires at different time points has made it possible to study the dynamics of B cell response following antigenic challenge or therapy (<xref ref-type="bibr" rid="bib22">Laserson et al., 2014</xref>; <xref ref-type="bibr" rid="bib7">Davydov et al., 2018</xref>; <xref ref-type="bibr" rid="bib21">Horns et al., 2019</xref>; <xref ref-type="bibr" rid="bib31">Nourmohammad et al., 2019</xref>; <xref ref-type="bibr" rid="bib20">Hoehn et al., 2021</xref>). Reconstruction of BCR evolution in B cell clonal lineages and phylogenetic analysis can reveal which evolutionary forces predominate at different stages of clonal lineage development. De Bourcy et al. recently reported on age-related differences in the structure of clonal lineages, somatic hypermutagenesis and affinity maturation processes, and differences in recall response of persisting lineages upon vaccination depending on CMV seropositivity status (<xref ref-type="bibr" rid="bib8">de Bourcy et al., 2017</xref>). Other studies have described in detail the action of positive selection in the evolution of clonal lineages in vaccination and chronic HIV infection (<xref ref-type="bibr" rid="bib4">Bonsignori et al., 2017</xref>; <xref ref-type="bibr" rid="bib21">Horns et al., 2019</xref>; <xref ref-type="bibr" rid="bib31">Nourmohammad et al., 2019</xref>). 
Reports have also described persisting clonal lineages which are predominantly represented by cells with IgM/IgD isotypes, and which demonstrate signs of neutral evolution (<xref ref-type="bibr" rid="bib21">Horns et al., 2019</xref>). Wu et al. observed the clonal stability of plasma cells (PL) in bone marrow (<xref ref-type="bibr" rid="bib47">Wu et al., 2010</xref>), representing the largest fraction of ASCs in the human body. Comparison of BCR repertoires between different cell subsets also makes it possible to investigate factors governing the functional assignment of B cells during proliferation, and thereby to understand fundamental aspects of B cell immunity. For example, recent studies have described differences in BCR repertoires of IgM and switched memory B cells as well as the complex interplay between CD27<sup>high</sup> and CD27<sup>low</sup> B cell memory subsets, showing the complex nature of B cell immune memory (<xref ref-type="bibr" rid="bib47">Wu et al., 2010</xref>; <xref ref-type="bibr" rid="bib18">Grimsholm et al., 2011</xref>).</p><p>BCR repertoires of antigen-experienced B cell subsets and their dynamics are usually studied in the context of pathologic conditions and vaccination, and there is little equivalent data in the absence of acute or chronic immune response. We have therefore investigated immunoglobulin heavy chain repertoires from memory B cells, plasmablasts, and plasma cells from peripheral blood collected from generally healthy volunteers at three time points over the course of a year. In order to obtain detailed and unbiased repertoire data, we used advanced IgH repertoire profiling technology that provides full-length IgH variable region sequences with isotype annotation. 
Based on comparative and phylogenetic analysis of the resulting data, we are able to describe the structure, distinctive features, clonal relations, isotype distribution and temporal dynamics of B cell subset repertoires, as well as the phylogenetic history of large clonal lineages.</p></sec><sec id="s2" sec-type="results"><title>Results</title><sec id="s2-1"><title>IGH repertoire sequencing statistics and analysis depth</title><p>We collected peripheral blood from six healthy donors at three time points, where the second sample was collected 1 month after the first, and the third was collected 11 months after that (<xref ref-type="fig" rid="fig1">Figure 1A</xref>; <xref ref-type="table" rid="table1">Table 1</xref>). These samples were subjected to fluorescence-activated cell sorting (FACS) to isolate memory B cells (Bmem: CD19<sup>+</sup> CD20<sup>+</sup> CD27<sup>+</sup>), plasmablasts (PBL: CD19<sup>low/+</sup> CD20<sup>-</sup> CD27<sup>high</sup> CD138<sup>-</sup>), and plasma cells (PL: CD19<sup>low/+</sup> CD20<sup>-</sup> CD27<sup>high</sup> CD138<sup>+</sup>; <xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1A</xref>). Most of the cell samples were collected and processed in two independent replicates (<xref ref-type="supplementary-material" rid="fig1sdata1">Figure 1—source data 1</xref>). For each cell sample, we obtained IGH clonal repertoires using a 5’-RACE-based protocol, which allows unbiased amplification of full-length IGH variable domain cDNA while preserving isotype information, with subsequent unique molecular identifier (UMI)-based sequencing data normalization and error correction (<xref ref-type="bibr" rid="bib45">Turchaninova et al., 2016</xref>; <xref ref-type="bibr" rid="bib41">Shugay et al., 2014</xref>). 
From a total of 83 cell samples, we obtained 1.06×10<sup>7</sup> unique IGH cDNA molecules, each covered by at least three sequencing reads, representing 8.4×10<sup>5</sup> unique IGH clonotypes (<xref ref-type="supplementary-material" rid="fig1sdata1">Figure 1—source data 1</xref>). An IGH clonotype was defined as a unique nucleotide sequence spanning from the beginning of IGH V gene framework 1 to the 5’ end of the C segment, sufficient to determine isotype. The number of unique clonotypes (i.e<italic>.,</italic> species richness) depended on the number of cells per sample (<xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1B</xref>), even after data normalization by sampling an equal number of unique IGH cDNA sequences. To characterize the number of distinct IGH clonotypes in each cell subset, we selected the samples with the most common number of sorted cells for each sample set. The median number of clonotypes was 20,072 (14,572–32,806, <italic>n</italic>=14) per 5×10<sup>4</sup> memory B cells, 628 (528–981, <italic>n</italic>=8) per 1×10<sup>3</sup> plasmablasts, and 800 (623–1183, <italic>n</italic>=9) per 1×10<sup>3</sup> plasma cells. Rarefaction analysis in the Bmem subpopulation (<xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1B</xref>, left) revealed an asymptotic increase of species richness that did not reach a plateau, indicating that the averaged species richness can only serve as a lower limit of sample diversity estimation. For all samples of PBL and PL subpopulations, species richness curves plateaued, meaning that we had reached sufficient sequencing depth to evaluate the clonal diversity of the sorted cell samples (<xref ref-type="fig" rid="fig1s1">Figure 1—figure supplement 1B</xref>, center and right).</p><fig-group><fig id="fig1" position="float"><label>Figure 1.</label><caption><title>General characteristics of IGH repertoires in differentiated B cell lineage subsets.</title><p>(<bold>A</bold>) Study design. 
Peripheral blood from six donors was sampled at three time points: T1 – initial time point, T2 – 1 month, and T3 – 12 months after the start of the study. At each time point, we isolated peripheral blood mononuclear cells (PBMCs) and sorted memory B cells (Bmem: CD19<sup>+</sup> CD20<sup>+</sup> CD27<sup>+</sup>), plasmablasts (PBL: CD19<sup>low/+</sup> CD20<sup>-</sup> CD27<sup>high</sup> CD138<sup>-</sup>), and plasma cells (PL: CD19<sup>low/+</sup> CD20<sup>-</sup> CD27<sup>high</sup> CD138<sup>+</sup>) in two replicates using fluorescence-activated cell sorting (FACS). For each cell sample, we obtained IGH clonal repertoires by sequencing the respective cDNA libraries covering the full-length IGH variable domain. (<bold>B</bold>) Proportion of isotypes in studied cell subsets averaged across all obtained repertoires. Left, frequency of unique IGH clonotypes with each particular isotype. Right, frequency of each isotype based on IGH cDNA molecules detected in a sample. (<bold>C</bold>) Distribution of the number of somatic hypermutations identified per 100 bp length of IGHV segment for clonotypes within each particular isotype. (<bold>D</bold>) Distribution of CDR3 length of clonotypes in each cell subset by isotype. (<bold>E</bold>) Distributions of average IGHV gene frequencies based on number of clonotypes in naive B cells (data from <xref ref-type="bibr" rid="bib16">Gidoni et al., 2019</xref>), Bmem, PBL, and PL repertoires are shown at the top. Colored squares on heatmap indicate significantly different (false discovery rate, FDR < 0.01) frequencies for IGHV gene segments in corresponding B cell subsets compared to naive B cell repertoires. Color intensity reflects the magnitude of the difference (FC = fold change). Only V genes represented by more than two clonotypes on average are shown; data normalization was performed using the trimmed mean of M values method (<xref ref-type="bibr" rid="bib37">Robinson and Oshlack, 2010</xref>). 
IGHV gene segments are clustered based on the similarity of their amino acid sequence, as indicated by the dendrogram at the bottom. In C and D, the numbers at the bottom of the plots represent the number of clonotypes in the corresponding group, pooled from all donors, and the median measurements from each cell type. Comparisons between subsets were performed with two-sided Mann-Whitney U test. *=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>, ****=p ≤ 10<sup>–4</sup>.</p><p><supplementary-material id="fig1sdata1"><label>Figure 1—source data 1.</label><caption><title>Repertoire sequencing statistics for cell samples.</title></caption><media mimetype="application" mime-subtype="xlsx" xlink:href="elife-79254-fig1-data1-v3.xlsx"/></supplementary-material></p><p><supplementary-material id="fig1sdata2"><label>Figure 1—source data 2.</label><caption><title>Isotype frequencies per cell sample.</title></caption><media mimetype="application" mime-subtype="xlsx" xlink:href="elife-79254-fig1-data2-v3.xlsx"/></supplementary-material></p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig1.jpg"/></fig><fig id="fig1s1" position="float" specific-use="child-fig"><label>Figure 1—figure supplement 1.</label><caption><title>FACS gating strategy and rarefaction analysis of clonal diversity for memory B cells, plasmablasts and plasma cells.</title><p>(<bold>A</bold>) Fluorescence-activated cell sorting (FACS) gating strategy and frequencies of the following studied cell subsets for a representative peripheral blood sample (donor IZ, time point T3): Memory B cells (Bmem: CD19<sup>+</sup> CD20<sup>+</sup> CD27<sup>+</sup>), plasmablasts (PBL: CD19<sup>low/+</sup> CD20<sup>-</sup> CD27<sup>high</sup> CD138<sup>-</sup>), and plasma cells (PL: CD19<sup>low/+</sup> CD20<sup>-</sup> CD27<sup>high</sup> CD138<sup>+</sup>). (<bold>B</bold>) Rarefaction curves for IGH cDNA molecules. 
From each repertoire, we sampled a defined number of unique IGH cDNA molecules and determined the number of unique IGH clonotypes. Each line represents a single sample. Shading of the lines indicates the number of cells sampled for each curve.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig1-figsupp1.jpg"/></fig><fig id="fig1s2" position="float" specific-use="child-fig"><label>Figure 1—figure supplement 2.</label><caption><title>Isotype frequencies in studied subsets.</title><p>(<bold>A</bold>) Isotype frequencies in studied cell subsets and bulk peripheral blood mononuclear cells (PBMCs) averaged across all samples. Frequencies were calculated as the number of IGH clonotypes (unique nucleotide sequences covering the full-length VDJ region) with a specific isotype divided by the total number of clonotypes (left), or as the number of cDNA molecules in each isotype divided by the total number of cDNA molecules (right). (<bold>B</bold>) Isotype frequencies based on unique clonotypes for different cell types in each individual donor sample. Whiskers illustrate minimal and maximal isotype frequencies for the group. Black and gray lines at the bottom of the plot indicate groups of bars corresponding to a particular donor.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig1-figsupp2.jpg"/></fig><fig id="fig1s3" position="float" specific-use="child-fig"><label>Figure 1—figure supplement 3.</label><caption><title>The level of SHM in isotypes.</title><p>Number of SHMs identified per 100 bp length of IGHV for clonotypes within each individual repertoire for particular isotype. Numbers below each box represent the number of observations (individual clonotypes) and median number of SHMs. Comparisons between isotypes were performed with two-sided Mann-Whitney U test. 
*=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>, ****=p ≤ 10<sup>–4</sup>.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig1-figsupp3.jpg"/></fig><fig id="fig1s4" position="float" specific-use="child-fig"><label>Figure 1—figure supplement 4.</label><caption><title>IGHV gene frequencies in studied cell subsets.</title><p>(<bold>A</bold>) Distributions of average IGHV gene frequencies in repertoires of total B cells, naive B cells (from <xref ref-type="bibr" rid="bib16">Gidoni et al., 2019</xref>), memory B (Bmem), plasmablasts (PBL), and plasma (PL) cells. (<bold>B</bold>) Heatmaps of IGHV frequencies for individual donors. Colored squares on heatmap indicate significantly different (false discovery rate [FDR] < 0.01) IGHV gene segment usage frequency in corresponding B cell subsets vs. publicly available naive B cell repertoires. Color intensity reflects the magnitude of the difference (FC = fold change). Only V genes represented by more than two clonotypes on average are shown. IGHV gene segments are ordered by similarity of their amino acid sequence, as indicated by the dendrogram at the bottom.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig1-figsupp4.jpg"/></fig></fig-group><table-wrap id="table1" position="float"><label>Table 1.</label><caption><title>Donor demographics and cell sample sizes.</title><p>Multiple values in a cell separated by a semicolon represent replicates collected for the corresponding donor, time point, or cellular subset. 
AR – allergic rhinitis; FA – food allergy; HD – healthy donor.</p></caption><table frame="hsides" rules="groups"><thead><tr><th align="left" valign="top" rowspan="2" colspan="4">Time point:</th><th align="left" valign="top" colspan="9">Number of cells per sample</th></tr><tr><th align="left" valign="top" colspan="3">T1</th><th align="left" valign="top" colspan="3">T2</th><th align="left" valign="top" colspan="3">T3</th></tr></thead><tbody><tr><td align="left" valign="top"><bold>Donor ID</bold></td><td align="left" valign="top"><bold>Age</bold></td><td align="left" valign="top"><bold>Sex</bold></td><td align="left" valign="top"><bold>Status</bold></td><td align="left" valign="top"><bold>Bmem</bold></td><td align="left" valign="top"><bold>PBL</bold></td><td align="left" valign="top"><bold>PL</bold></td><td align="left" valign="top"><bold>Bmem</bold></td><td align="left" valign="top"><bold>PBL</bold></td><td align="left" valign="top"><bold>PL</bold></td><td align="left" valign="top"><bold>Bmem</bold></td><td align="left" valign="top"><bold>PBL</bold></td><td align="left" valign="top"><bold>PL</bold></td></tr><tr><td align="left" valign="top">D01</td><td align="char" char="." valign="top">27</td><td align="left" valign="top">F</td><td align="left" valign="top">AR</td><td align="left" valign="top">n/a</td><td align="left" valign="top">n/a</td><td align="left" valign="top">n/a</td><td align="char" char="." valign="top">50,300;<break/>55,400</td><td align="char" char="." valign="top">2100;<break/>2100</td><td align="char" char="." valign="top">1020;<break/>1010</td><td align="char" char="." valign="top">50,000;<break/>50,000</td><td align="char" char="." valign="top">1000;<break/>1000</td><td align="char" char="." valign="top">500;<break/>500</td></tr><tr><td align="left" valign="top">IM</td><td align="char" char="." valign="top">39</td><td align="left" valign="top">M</td><td align="left" valign="top">AR,FA</td><td align="char" char="." 
valign="top">186,572</td><td align="char" char="." valign="top">2200</td><td align="char" char="." valign="top">129</td><td align="char" char="." valign="top">69,900;<break/>68,400</td><td align="char" char="." valign="top">2000;<break/>2486</td><td align="char" char="." valign="top">920</td><td align="char" char="." valign="top">50,000;<break/>50,000</td><td align="char" char="." valign="top">2000;<break/>2000</td><td align="char" char="." valign="top">1000;<break/>1000</td></tr><tr><td align="left" valign="top">MRK</td><td align="char" char="." valign="top">27</td><td align="left" valign="top">M</td><td align="left" valign="top">AR</td><td align="char" char="." valign="top">143,162</td><td align="char" char="." valign="top">5336</td><td align="char" char="." valign="top">251</td><td align="char" char="." valign="top">51,700;<break/>50,600</td><td align="char" char="." valign="top">2130;<break/>2020</td><td align="char" char="." valign="top">1000;<break/>1035</td><td align="char" char="." valign="top">50,000;<break/>50,000</td><td align="char" char="." valign="top">1000;<break/>1000</td><td align="char" char="." valign="top">400;<break/>200</td></tr><tr><td align="left" valign="top">AT</td><td align="char" char="." valign="top">23</td><td align="left" valign="top">M</td><td align="left" valign="top">AR,FA</td><td align="char" char="." valign="top">101,400</td><td align="char" char="." valign="top">7200</td><td align="char" char="." valign="top">1800</td><td align="char" char="." valign="top">50,600;<break/>57,400</td><td align="char" char="." valign="top">2520</td><td align="char" char="." valign="top">800</td><td align="char" char="." valign="top">50,000;<break/>40,800</td><td align="char" char="." valign="top">1000;<break/>1000</td><td align="char" char="." valign="top">400;<break/>200</td></tr><tr><td align="left" valign="top">IZ</td><td align="char" char="." 
valign="top">33</td><td align="left" valign="top">M</td><td align="left" valign="top">HD</td><td align="char" char="." valign="top">101,800</td><td align="char" char="." valign="top">3900</td><td align="char" char="." valign="top">850</td><td align="char" char="." valign="top">50,500;<break/>56,300</td><td align="char" char="." valign="top">1140;<break/>1840</td><td align="char" char="." valign="top">1050;<break/>625</td><td align="char" char="." valign="top">50,000;<break/>50,000</td><td align="char" char="." valign="top">2000;<break/>2000</td><td align="char" char="." valign="top">200;<break/>200</td></tr><tr><td align="left" valign="top">MT</td><td align="char" char="." valign="top">33</td><td align="left" valign="top">F</td><td align="left" valign="top">HD</td><td align="left" valign="top">n/a</td><td align="left" valign="top">n/a</td><td align="left" valign="top">n/a</td><td align="left" valign="top">n/a</td><td align="left" valign="top">n/a</td><td align="left" valign="top">n/a</td><td align="char" char="." valign="top">50,000;<break/>50,000</td><td align="char" char="." valign="top">1000;<break/>1000</td><td align="char" char="." valign="top">400</td></tr></tbody></table></table-wrap></sec><sec id="s2-2"><title>B cell subsets display both divergent and similar characteristics in their IGH repertoires</title><p>First, we aimed to characterize features of the IGH repertoires of the Bmem, PBL, and PL subsets based on several key properties: usage of germline-encoded IGHV segments, clonal distribution by isotypes, rate of SHM in CDR1-2 and FWR1-3, and features of the hypervariable CDR3 region. The proportion of overall clonal diversity occupied by the five major IGH isotypes was strikingly different between Bmem cells and antibody-secreting cells (ASCs; i.e., PBL and PL). 
IgM represented more than half of the repertoire in Bmem, while IgA was dominant in PBL and PL (<xref ref-type="fig" rid="fig1">Figure 1B</xref>, <xref ref-type="supplementary-material" rid="fig1sdata2">Figure 1—source data 2</xref>). The second most prevalent isotype in ASCs was IgG, which was also less abundant than IgA in Bmem. IgD represented a substantial part of the Bmem clonal repertoire, while fewer than 1% of ASC clonotypes expressed IgD. The proportion of each isotype varied between donors and time points, but IgM and IgA consistently remained the most abundant isotypes in Bmem cells, and IgA and IgG in ASCs (<xref ref-type="fig" rid="fig1s2">Figure 1—figure supplement 2A</xref>, <xref ref-type="supplementary-material" rid="fig1sdata2">Figure 1—source data 2</xref>). In all studied subsets, the isotype distribution in terms of the number of unique clonotypes roughly mirrored the isotype distribution based on the number of IGH cDNA molecules, indicating the absence of large clonal expansions or differences in IGH expression level that would distort isotype abundance. This could not be determined by sequencing of bulk peripheral blood mononuclear cells (PBMCs), as higher levels of IGH expression by ASCs can change the isotype proportions and thereby bias the quantitation of clonotype abundance (<xref ref-type="fig" rid="fig1s2">Figure 1—figure supplement 2B</xref>). The obtained IGH isotype distributions based on unique clonotypes roughly correspond to the distribution of IGH isotypes typically detected by flow cytometry of the same subsets (<xref ref-type="bibr" rid="bib33">Perez-Andres et al., 2010</xref>).</p><p>The level of SHM was on average significantly higher in ASC subsets, reflecting that PBLs and PLs are enriched for clones that have undergone affinity maturation (<xref ref-type="fig" rid="fig1">Figure 1C</xref>). The switched isotypes (IgG, IgA) had higher average levels of SHMs in the Bmem subset compared with IgM and IgD isotypes. 
Interestingly, the SHM level of IgD clonotypes in ASC subsets was significantly higher compared with Bmem. The average number of SHMs for IgE clonotypes did not differ significantly between cell subsets, but was significantly higher compared to the level of SHM detected for IgM and IgD clonotypes in Bmem (<xref ref-type="fig" rid="fig1">Figure 1C</xref>, <xref ref-type="fig" rid="fig1s3">Figure 1—figure supplement 3</xref>). Of note, the rate of SHM in PBLs was higher than that in PLs in clonotypes from the three most abundant isotypes (i.e<italic>.,</italic> IgM, IgA, and IgG). We also compared the distributions of the lengths of the hypervariable CDR3 region between IGH clonotypes in different cell subsets. PBLs had significantly longer CDR3 regions compared to Bmem cells on average in every isotype except for IgE (<xref ref-type="fig" rid="fig1">Figure 1D</xref>). Of note, the average CDR3 length in PL clonotypes was significantly higher compared to Bmem for IgA and IgD, but not for the other isotypes.</p><p>IGHV gene segment usage was roughly similar between Bmem, PBL, and PL cells from all donors, indicating generally equal probabilities of memory-to-ASC conversion for B cells carrying BCRs encoded by distinct gene segments (<xref ref-type="fig" rid="fig1">Figure 1E</xref>, <xref ref-type="fig" rid="fig1s4">Figure 1—figure supplement 4A</xref>). This distribution differed significantly between the studied cell subsets and naive B cells (based on data from <xref ref-type="bibr" rid="bib16">Gidoni et al., 2019</xref>). The repertoire of total B cells (Btot; CD19<sup>+</sup> CD20<sup>+</sup>), which contained a large fraction of naive B cells, demonstrated similar IGHV gene segment usage to the naive B cell repertoire (<xref ref-type="fig" rid="fig1s4">Figure 1—figure supplement 4A</xref>). 
We observed statistically significant Pearson correlations in terms of IGHV gene frequencies for all pairs between Bmem, PBL, or PL (>0.95 correlation, p<0.01), and for naive vs. Btot (0.79 correlation, p<0.01). We observed high concordance in terms of under- or overrepresentation of specific IGHV gene segments in repertoires of all antigen-experienced B cell subsets compared to naive B cells; Pearson correlation coefficients for the fold-change of IGHV gene segment usage frequencies were 0.95 for Bmem and PBL, 0.96 for Bmem and PL, and 0.98 for PBL and PL (p<0.01 for all pairs). Moreover, IGHV gene segment under- or overrepresentation clearly depended on the given gene sequence. We clustered IGHV genes based on their sequence similarity, and observed that most IGHV segments in each of the four major clusters behaved concordantly with other segments in that cluster (<xref ref-type="fig" rid="fig1">Figure 1E</xref>). This effect was also observed at the level of individual repertoires (<xref ref-type="fig" rid="fig1s4">Figure 1—figure supplement 4B</xref>) with discrepancies that could probably be attributed to genetic polymorphism of the IGH loci of particular donors.</p><p>These observations highlight the differences in general characteristics of IGH repertoire between the Bmem and ASC subsets, and demonstrate similarity of IGHV gene usage that differs from that in naive B cells.</p></sec><sec id="s2-3"><title>Memory B cell repertoires are stable over time and contain a large number of public clonotypes</title><p>We further studied the similarity of IGH clonal repertoires of B cell subsets across time points and between individuals, evaluating repertoire stability (i.e<italic>.,</italic> distance between different time points) and degree of individuality (i.e<italic>.,</italic> distance between repertoires from different donors). 
We evaluated repertoire similarity at two levels of IGH sequence identity: frequency of clonotypes with identical nucleotide sequence-defined variable regions (FR1–4), and number of clonotypes with identical CDR3 amino acid sequences, IGHV gene segments, and isotypes. Both metrics showed significantly higher inter-individual differences compared to the divergence of repertoires derived from the same donor, reflecting the fact that IGH repertoires of Bmem, PBL, and PL subsets are private to a large degree (<xref ref-type="fig" rid="fig2">Figure 2A and B</xref>). We observed identical clonotypes in the repertoires of PBL and PL collected at different time points, although the repertoire similarity was much lower than that between replicate samples, reflecting the transient nature of PBL and PL populations in peripheral blood. Notably, we observed lower clonal overlap in PBL and PL for more distant time points (separated by 11 or 12 months) than for those closer together (1 month) (<xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1A</xref>). The dissimilarity between samples collected on the same day vs. 1 month or even 1 year later was much lower for Bmem, demonstrating the high stability of the clonal repertoire and long-term persistence of IGH clonotypes in these cells (<xref ref-type="fig" rid="fig2">Figure 2B</xref>, <xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1A</xref>).</p><fig-group><fig id="fig2" position="float"><label>Figure 2.</label><caption><title>Memory B cells (Bmem), plasmablasts (PBL), and plasma cells (PL) IGH repertoire stability over time and similarity between individuals.</title><p>(<bold>A</bold>) Distance between repertoires obtained at different time points from the same or different donors as calculated by the Jensen-Shannon divergence index for the IGHV gene frequency distribution. (<bold>B</bold>) Number of shared clonotypes between pairs of repertoires from the same or different donors and time points. 
For data normalization, we assessed the most abundant 14,000 Bmem, 600 PBL, and 300 PL clonotypes. (<bold>C</bold>) The average number of shared clonotypes between repertoires from pairs of unrelated donors for the most abundant Bmem clonotypes, randomly selected Bmem clonotypes, most abundant clonotypes from naive repertoires of unrelated donors (from <xref ref-type="bibr" rid="bib16">Gidoni et al., 2019</xref>), or from synthetic repertoires generated with OLGA software; each repertoire in comparison was represented by a fixed number of clonotypes (5000), either most abundant, randomly selected, or generated where indicated. (<bold>D</bold>) Inter-individual distance between distributions of V genes in repertoires, calculated as Jensen-Shannon divergence indices for the pairs of repertoires depicted in C. (<bold>E</bold>) Fraction of persistent clonotypes detected at more than one time point among clonotypes detected in repertoires from only one donor (private) or in at least two donors (public). Each dot represents the fraction of persistent clonotypes from one donor. In all plots, clonotypes are defined as having identical CDR3 amino acid sequences and the same IGHV gene segment and isotype. For A–D, each dot represents a pair of repertoires of the corresponding type; N indicates the number of pairs of repertoires in the group. Comparisons in all panels were performed with two-sided Mann-Whitney U test. *=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>, ****=p ≤ 10<sup>–4</sup>.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig2.jpg"/></fig><fig id="fig2s1" position="float" specific-use="child-fig"><label>Figure 2—figure supplement 1.</label><caption><title>IGH repertoire similarity within B cell lineage subpopulations.</title><p>(<bold>A</bold>) Number of shared clonotypes from each B cell subset between repertoires from a given donor at various time points. 
‘Same’ represents replicate samples from the same blood draw, ‘close’ samples were collected within ~1-month interval of each other, and ‘distant’ samples were separated by an ~11- to 12-month interval. (<bold>B</bold>) Number of shared clonotypes between pairs of repertoires from different donors with different clonotype definitions used to calculate overlap: aaCDR3=amino acid CDR3 sequence and V gene label; aaCDR3 not nt = amino acid CDR3 sequence and V gene label, excluding clonotypes with identical CDR3 nucleotide sequence; ntCDR3=nucleotide CDR3 sequence and V gene label; ntVDJRegion = full nucleotide sequence from the beginning of the IGH Framework 1 region to the end of the IGH Framework 4 region. (<bold>C</bold>) Distribution of the number of somatic hypermutations identified per 100 bp length of IGHV segment for clonotypes detected either in only one donor (private) or in at least two donors (public). (<bold>D</bold>) Shared clonotype frequency between pairs of repertoires, calculated as in <xref ref-type="bibr" rid="bib5">Briney et al., 2019</xref>. For normalization purposes, we considered the 14,000 most abundant Bmem clonotypes, the 600 most abundant PBL clonotypes, and the 300 most abundant PL clonotypes. Dots in each plot represent pairs of repertoires of corresponding type. <italic>N</italic>=the number of pairs of repertoires in the group, and med = the median value. Comparisons in all panels were performed with Mann-Whitney test. 
*=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>, ****=p ≤ 10<sup>–4</sup>.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig2-figsupp1.jpg"/></fig></fig-group><p>To better describe the inter-individual IGH repertoire convergence, we analyzed the number of IGH amino acid clonotypes shared between different donors (i.e., public clonotypes) among the 5000 most expanded clonotypes in each Bmem repertoire, assuming that functional convergence could be detected among the most abundant clonotypes due to clonal expansions in response to common pathogens. Indeed, the average number of shared clonotypes in Bmem was significantly higher between fractions of the most abundant clonotypes compared to randomly sampled clonotypes (<xref ref-type="fig" rid="fig2">Figure 2C</xref>), as well as when compared to the most abundant clonotypes shared by two naive repertoires (from <xref ref-type="bibr" rid="bib16">Gidoni et al., 2019</xref>) or to pre-immune IGH repertoires obtained by in silico generation using OLGA software (<xref ref-type="bibr" rid="bib39">Sethna et al., 2019</xref>; <xref ref-type="fig" rid="fig2">Figure 2C</xref>). We also noted that there were no shared clonotypes defined by their full-length nucleotide sequence (<xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1B</xref>). Public clonotypes were also hypermutated, although the rate of SHM was slightly lower compared to that in clonotypes specific to one donor (private) (<xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1C</xref>). These observations indicate functional convergence in Bmem repertoires, which is presumably driven by exposure to common pathogens. Of note, the extent of clonal overlap was significantly higher between naive repertoires than for in silico-generated repertoires, indicating functional convergence even in pre-immune repertoires. 
Furthermore, the distance between V segment usage distributions in Bmem repertoires was not significantly different from that in naive B cell repertoires. This indicates that the higher clonotype sharing seen in Bmem cannot be attributed to lower diversity in IGHV germline usage (<xref ref-type="fig" rid="fig2">Figure 2D</xref>). The same analysis in PBL and PL subpopulations for the 600 and 200 most abundant clonotypes, respectively, yielded no shared clonotypes between repertoires of different donors, demonstrating no detectable convergence at this sampling depth. Finally, we found that public clonotypes were more likely to be detected than private ones in samples collected at different time points (<xref ref-type="fig" rid="fig2">Figure 2E</xref>), again suggesting persistent memory to common antigens. Thus, the results demonstrate the high stability of memory BCR repertoires and the extent of clonal sharing between repertoires of unrelated donors, which might be attributed to exposure to common antigens.</p></sec><sec id="s2-4"><title>Temporal dynamics of clonal lineages are associated with cell subset composition</title><p>SHM during BCR affinity maturation leads to the formation of clonal lineages – that is, groups of BCR clonotypes that evolved from a single ancestor after B cell activation. To study the structure and dynamics of clonal lineages originating from a single BCR ancestor, we grouped clonotypes from each individual based on their sequence similarity (see Materials and methods for details). We focused on larger clonal lineages consisting of at least 20 unique clonotypes from the corresponding donor. 
On average, these clonal lineages covered 3.4% of a given donor’s repertoire, and we identified 190 such lineages across the four donors from whom samples were collected at each of the three time points (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1A</xref>).</p><p>First, we asked how B cell subsets and isotypes were represented in these most abundant clonal lineages. The clonal lineages were either mostly composed of Bmem clonotypes of the non-switched IgM isotype, or largely composed of ASCs and enriched in IgG and IgA clonotypes (<xref ref-type="fig" rid="fig3s1">Figure 3—figure supplement 1B</xref>). To investigate the nature of this bimodal distribution and perform a comparative analysis of these two types of clonal lineages, we divided them into two large clusters using the <italic>k</italic>-means clustering algorithm, based on the proportion of represented cell subsets and BCR isotypes (<xref ref-type="fig" rid="fig3">Figure 3A and B</xref>, <xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2A</xref>). The more abundant HBmem cluster included 138 clonal lineages, and was mostly composed of Bmem clonotypes of the non-switched IgM isotype. Conversely, the smaller LBmem cluster (52 clonal lineages) was more diverse, largely composed of ASCs, and enriched in IgG and IgA clonotypes. 
The average size of clonal lineages (i.e., the number of unique clonotypes per lineage) did not differ between the HBmem and LBmem clusters (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2B</xref>), and both clusters were present in repertoires of all donors (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2C</xref>).</p><fig-group><fig id="fig3" position="float"><label>Figure 3.</label><caption><title>Temporal dynamics and composition of clonal lineages.</title><p>(<bold>A</bold>) Principal component analysis (PCA) of clonal lineage composition: proportions of memory B cells (Bmem), plasmablasts (PBL), and plasma cells (PL) as well as proportions of isotypes. Arrows represent projections of the corresponding variables onto the two-dimensional PCA plane, with lengths reflecting how well the variable explains the variance of the data. The two principal components (PC1 and PC2) cumulatively explain 90.9% of the variance. Clonal lineages are colored according to the clusters they were assigned to by the <italic>k</italic>-means algorithm. (<bold>B</bold>) Proportion of clonotypes from the various cell subsets or isotypes in clonal lineages falling into the HBmem or LBmem clusters. (<bold>C</bold>) Dynamics of clonal lineage frequency, defined as the number of clonotypes in a lineage divided by the total number of clonotypes detected at a given time point. Each line connects points representing a unique clonal lineage (<italic>N</italic>=190). (<bold>D</bold>) Schematic representation of how we calculated clonal lineage persistence. 
<inline-formula><mml:math id="inf1"><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the maximum clonal lineage frequency among the three time points, and <inline-formula><mml:math id="inf2"><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are the frequencies at the remaining two time points. (<bold>E</bold>) Spearman’s correlation between persistence of a clonal lineage and proportions of its clonotypes associated with a given B cell subset or isotype. (<bold>F</bold>) Comparison of persistence between HBmem and LBmem. (<bold>G</bold>) Fraction of public clonal lineages in the two clusters. Statistical significance for B, F, and G is calculated by the two-sided Mann-Whitney test. *=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>, ****=p ≤ 10<sup>–4</sup>.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig3.jpg"/></fig><fig id="fig3s1" position="float" specific-use="child-fig"><label>Figure 3—figure supplement 1.</label><caption><title>Characteristics of the most abundant clonal lineages.</title><p>(<bold>A</bold>) Proportion of IGH clonotype diversity occupied by the most abundant clonal lineages (>19 unique clonotypes). 
(<bold>B</bold>) Distributions of fractions of cellular subtypes and isotypes in the most abundant clonal lineages.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig3-figsupp1.jpg"/></fig><fig id="fig3s2" position="float" specific-use="child-fig"><label>Figure 3—figure supplement 2.</label><caption><title>Size, reproducibility and dynamics of HBmem and LBmem clonal lineages.</title><p>(<bold>A</bold>) Scree plot for the principal component analysis (PCA) from <xref ref-type="fig" rid="fig3">Figure 3A</xref> of the composition of clonal lineages, where fractions of memory B cells (Bmem), plasmablasts (PBL), and plasma cells (PL), and fractions of IgM, IgG, and IgA were used as variables. (<bold>B</bold>) Distribution of the number of unique clonotypes in a lineage for HBmem and LBmem. (<bold>C</bold>) The number of clonal lineages belonging to HBmem or LBmem clusters in each donor. (<bold>D</bold>) Dynamics of clonal lineage frequency from <xref ref-type="fig" rid="fig3">Figure 3C</xref> in individual donors.</p><p>Lineage frequency is defined as the number of clonotypes in a lineage divided by the total number of clonotypes detected at a given time point. Each line connects points representing a unique clonal lineage. (<bold>E</bold>) Spearman’s correlation between frequencies of clonal lineages in two replicates of time point 3 (T3) samples. Only clonal lineages sampled in at least one replicate at this time point were included in the analysis. (<bold>F</bold>) Spearman’s correlation between the size of a clonal lineage and its persistence. (<bold>G</bold>) Fraction of clonotypes in HBmem or LBmem clonal lineages detected at two or three time points.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig3-figsupp2.jpg"/></fig></fig-group><p>Next, we tracked the abundance of each clonal lineage in the repertoire across time points. 
The two clusters of lineages demonstrated different temporal behavior; while HBmem lineages were quite stable over time, LBmem lineages had a burst of increased frequency at one of the time points (<xref ref-type="fig" rid="fig3">Figure 3C</xref>). To compare the temporal stability of clonal lineages, we defined the lineage persistence metric, which equals 1 when a clonal lineage was equally frequent at all three time points and is close to 0 when it was detected at just one time point (<xref ref-type="fig" rid="fig3">Figure 3D</xref>). Persistence of a clonal lineage was strongly associated with its composition (<xref ref-type="fig" rid="fig3">Figure 3E and F</xref>). Clonal lineages enriched with Bmem clonotypes or with the IgM isotype – including all HBmem lineages – were more likely to persist through time. Conversely, lineages with larger proportions of ASCs or IgG/IgA isotypes, including most LBmem lineages, tended to have lower persistence, with a burst of increased frequency at one particular time point. The time point of LBmem frequency burst varied between donors (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2D</xref>). The frequencies of clonal lineages were highly correlated among replicate samples, and the persistence of a clonal lineage was not associated with its size (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2E,F</xref>), indicating that differences in persistence cannot be attributed to clonotype sampling noise.</p><p>Besides their higher persistence, the HBmem lineages were enriched in clonotypes detected at multiple time points (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2G</xref>), indicating that persistent clonal lineages are supported by persistent clonotypes. Furthermore, 29.7% of the HBmem cluster was represented by public clonal lineages shared between at least two donors, compared to 3.8% for the LBmem cluster. 
The only two shared LBmem lineages had atypically high persistence, which made them more similar to HBmem (<xref ref-type="fig" rid="fig3">Figure 3G</xref>).</p><p>Thus, we observed two types of clonal lineages, representing different stages of an immune response: persisting memory with unswitched IgM isotype (HBmem) and responding lineages rapidly increasing in frequency and producing IgG or IgA antibodies (LBmem).</p></sec><sec id="s2-5"><title>LBmem clonal lineages could arise from HBmem clonal lineages</title><p>The evolutionary past of a clonal lineage can be described by inferring the history of accumulation of SHMs leading to individual clonotypes <italic>–</italic> that is, by reconstructing the phylogenetic tree of the clonal lineage. The initial germline sequence of each clonal lineage partially matches the germline VDJ segments, and can be reconstructed in a manner corresponding to the root of the phylogenetic tree of this lineage (see Materials and methods). However, the first node of the phylogenetic tree (green diamond in <xref ref-type="fig" rid="fig4">Figure 4A</xref>), the most recent common ancestor (MRCA) of the sampled part of the lineage, can be different from the inferred germline sequence. These differences, referred to as the G-MRCA distance, correspond to SHMs accumulated during the evolution of the clonal lineage prior to divergence of the observed clonotypes. The G-MRCA distance depends on how clonotypes of the tree were sampled. 
Sampling of clonotypes regardless of their position on the tree results in a low G-MRCA distance (<xref ref-type="fig" rid="fig4">Figure 4A</xref>, top panel), while sampling just those clonotypes belonging to a particular clade can conceal early stages of lineage evolution and thus result in a large G-MRCA distance (<xref ref-type="fig" rid="fig4">Figure 4A</xref>, bottom panel).</p><fig-group><fig id="fig4" position="float"><label>Figure 4.</label><caption><title>Phylogenetic history of HBmem and LBmem clonal lineages.</title><p>(<bold>A</bold>) A schematic illustration of how the distances between the germline sequence and the most recent common ancestor (MRCA) of a clonal lineage (G-MRCA distance) vary depending on which subset of clonotypes is sampled: a sample uniform with regard to the position on the tree (top panel), or only those belonging to a particular clade of the tree (bottom panel). (<bold>B</bold>) Comparison of G-MRCA p-distance (i.e., the fraction of differing nucleotides) for HBmem and LBmem lineages. (<bold>C</bold>) Mean pairwise phylogenetic distance (i.e., the distance along the tree) between clonotypes of the same lineage for HBmem and LBmem clusters. (<bold>D–F</bold>) Representative phylogenetic trees for clonal lineages belonging to HBmem (<bold>D</bold>), LBmem (<bold>E</bold>), and an example of HBmem-LBmem transition (<bold>F</bold>). The LBmem sublineage in F is nested deep in the phylogeny of the memory clonotypes, and is not characterized by a particularly long ancestral branch, indicating that it is not an artifact of clonal lineage assignment. Circles correspond to individual clonotypes, with the cellular subset indicated by color, and the isotype by label. Tables at right indicate the presence or absence of the corresponding clonotype at each time point. The G-MRCA distance is indicated with a thick line. 
(<bold>G–I</bold>) Schematic representation of the hypothetical dynamics of relative size for clonal lineages represented in D, E, and F, respectively. Significance for B and C was obtained by the two-sided Mann-Whitney test. *=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>, ****=p ≤ 10<sup>–4</sup>.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig4.jpg"/></fig><fig id="fig4s1" position="float" specific-use="child-fig"><label>Figure 4—figure supplement 1.</label><caption><title>Analysis of the sequences of clonotypes comprising the lineage with the HBmem-LBmem transition.</title><p>Phylogenetic tree and nucleotide (<bold>A</bold>) and amino acid (<bold>B</bold>) alignments of CDR regions of the clonal lineage exemplifying the HBmem-LBmem transition from <xref ref-type="fig" rid="fig4">Figure 4F</xref>. The order of rows in the alignment corresponds to the order of clonotypes on the phylogenetic tree. Rows of the alignment corresponding to the LBmem-like clade are indicated by dotted lines. In panel B, amino acid residues are colored according to their physicochemical properties. Asterisks indicate positions conserved among all clonotypes of the lineage.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig4-figsupp1.jpg"/></fig></fig-group><p>The G-MRCA distance was on average fivefold higher in LBmem clonal lineages compared to HBmem (median = 0.044 vs. 0.008, <xref ref-type="fig" rid="fig4">Figure 4B</xref>). This means that, whereas nearly all the evolution of an HBmem clonal lineage leaves a trace in the observed diversity of that lineage (<xref ref-type="fig" rid="fig4">Figure 4D and G</xref>), the sequence variants of an LBmem lineage typically result from divergence of an already-hypermutated clonotype (<xref ref-type="fig" rid="fig4">Figure 4E and H</xref>). 
In most (38 out of 52) LBmem lineages, some Bmem clonotypes were observed at the time point preceding expansion. Moreover, clonotypes of LBmem lineages are typically characterized by lower pairwise divergence compared to that in HBmem lineages (median = 0.11 vs. 0.13, <xref ref-type="fig" rid="fig4">Figure 4C</xref>). Together with the burst-like dynamics characteristic of LBmem lineages (<xref ref-type="fig" rid="fig3">Figure 3F</xref>), this implies that LBmem lineages may represent recent, rapid clonal expansion of pre-existing memory.</p><p>Based on these results and the compositional features of the two clusters, we further hypothesized that LBmem clonal lineages may arise from reactivation of pre-existing memory cells belonging to the HBmem cluster. In search of examples of such a transition, we examined all clonal lineages that were persistent but included ASC clonotypes. We found one clear example of a transition from HBmem to LBmem state in the evolutionary history of a clonal lineage (<xref ref-type="fig" rid="fig4">Figure 4F,I</xref>). While the MRCA of this lineage nearly matched the germline sequence, all ASC clonotypes were grouped in a single monophyletic clade (sublineage), such that its ancestral node was remote from the MRCA. The ASC sublineage demonstrated all features characteristic of LBmem, including predominance of IgG and IgA isotypes, low persistence, and low clonotype divergence. Conversely, the remainder of the clonal lineage had features of HBmem: predominance of IgM, high persistence, and high levels of clonotype divergence. The position of the ASC sublineage at a node distant from the root of the tree indicates gradual accumulation of SHMs, distinguishing the ASC sublineage from the remaining clonotypes. 
This fact, together with the similarity of CDR3 regions of lineage clonotypes (<xref ref-type="fig" rid="fig4s1">Figure 4—figure supplement 1</xref>), gives reason to conclude that the ASC sublineage has the same origin as the remaining part of the tree, which has features of the HBmem cluster.</p><p>To summarize, we observed that LBmem lineages had a low level of clonotype divergence and a large distance between the lineage’s ancestor and the germline sequence, suggesting their recent origin from a mature clonotype. The temporal dynamics of LBmem, detection of Bmem clonotypes at the time point prior to the LBmem lineage expansion, and the relationship between HBmem and LBmem on a clonal lineage level suggest that LBmem expansions may result from reactivation of pre-existing memory.</p></sec><sec id="s2-6"><title>Reactivation of LBmem clonal lineages is driven by positive selection</title><p>Having shown that the LBmem lineages likely originate from clonal expansion of pre-existing memory, we further compared the contribution of positive (favoring new beneficial SHMs) and negative (preserving the current variant) selection between the LBmem and HBmem clusters. Since we observed only one clear example of an HBmem-LBmem transition (<xref ref-type="fig" rid="fig4">Figure 4F</xref>, <xref ref-type="fig" rid="fig4s1">Figure 4—figure supplement 1</xref>), we could not claim with certainty that LBmem lineages always emerge from pre-existing HBmem lineages rather than from some other memory type. Still, we were able to study LBmem reactivation by comparing differences in substitution patterns at the origin of HBmem and LBmem clusters. We reasoned that the G-MRCA distance of an HBmem lineage contains mutations fixed by primary affinity maturation after the initial lineage activation. In contrast, the G-MRCA distance of an LBmem lineage contains both mutations arising during primary affinity maturation and subsequent changes occurring later in the evolution of the lineage. 
Differences in the characteristics of the G-MRCA mutations between clusters are therefore informative of the process prior to the observed expansion of LBmem lineages.</p><p>To assess selection at the origin of the HBmem and LBmem lineages, we measured the divergence of nonsynonymous sites relative to synonymous sites (i.e., the DnDs ratio). Classically, DnDs > 1 is interpreted as evidence for positive selection. However, DnDs > 1 is rare, because the signal of positive selection is usually swamped by that of negative selection. In the McDonald-Kreitman (MK) framework, positive selection is instead revealed by excessive nonsynonymous divergence relative to nonsynonymous polymorphism (i.e., DnDs > PnPs; see Materials and methods and <xref ref-type="supplementary-material" rid="fig5sdata1">Figure 5—source data 1</xref> for examples), under the logic that advantageous changes contribute more to divergence than to polymorphism (<xref ref-type="bibr" rid="bib24">McDonald and Kreitman, 1991</xref>). The fraction of adaptive nonsynonymous substitutions (<italic>α</italic>) can then be estimated from this excess. We designed an MK-like analysis, comparing the relative frequencies of nonsynonymous and synonymous SHMs at the G-MRCA branch (equivalent to divergence in the MK test) to those in subsequent evolution of clonal lineages (equivalent to polymorphism in the MK test; <xref ref-type="fig" rid="fig5">Figure 5A</xref>, see Materials and methods).</p><fig id="fig5" position="float"><label>Figure 5.</label><caption><title>Signatures of positive and negative selection in HBmem and LBmem clusters.</title><p>(<bold>A</bold>) Schematic of the McDonald-Kreitman (MK) test and site frequency spectrum (SFS) concept. (<bold>B</bold>) MK estimate of the fraction of adaptive nonsynonymous changes (<italic>α</italic>) between germline and most recent common ancestor (MRCA) in HBmem and LBmem clonal lineages. Only lineages with nonzero G-MRCA distance are included. 
<italic>N</italic>=68 for HBmem, 49 for LBmem, see <xref ref-type="supplementary-material" rid="fig5sdata2">Figure 5—source data 2</xref>. (<bold>C</bold>) Comparison of mean pairwise <italic>πNπS</italic> of HBmem and LBmem lineages. (<bold>D</bold>) Averaged SFS for HBmem and LBmem clonal lineages. The two dashed lines correspond to <italic>f</italic>(<italic>x</italic>)=<italic>x</italic><sup>–1</sup>, which is the expected neutral SFS under Kingman’s coalescent model (Kingman 1982), and <italic>f</italic>(<italic>x</italic>)=<italic>x</italic><sup>–2</sup>. (<bold>E</bold>) Comparison of normalized <italic>πNπS</italic> for HBmem and LBmem clonal lineages in various SHM frequency bins. The number of polymorphisms in each bin is normalized to the overall number of polymorphisms in a corresponding clonal lineage. (<bold>F</bold>) Scheme summarizing features of HBmem and LBmem clonal lineages. Comparisons in B, C, and E were performed by two-sided Mann-Whitney test, with Bonferroni-Holm multiple testing correction in E. 
*=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>, ****=p ≤ 10<sup>–4</sup>.</p><p><supplementary-material id="fig5sdata1"><label>Figure 5—source data 1.</label><caption><title>Examples of divergent and polymorphic sites as calculated for the McDonald-Kreitman test.</title></caption><media mimetype="application" mime-subtype="docx" xlink:href="elife-79254-fig5-data1-v3.docx"/></supplementary-material></p><p><supplementary-material id="fig5sdata2"><label>Figure 5—source data 2.</label><caption><title>MK test results under different inclusion criteria for clonal lineages from HBmem and LBmem clusters.</title></caption><media mimetype="application" mime-subtype="docx" xlink:href="elife-79254-fig5-data2-v3.docx"/></supplementary-material></p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/fig5.jpg"/></fig><p>In both the HBmem and LBmem clonal lineages, we observed a higher ratio of nonsynonymous to synonymous SHMs in the G-MRCA branches compared to subsequent tree branches, indicating that a fraction of the SHMs acquired by the MRCA was fixed by positive selection. However, this fraction was higher in LBmem lineages (Fisher’s exact test: <italic>α</italic>=0.58 and 0.65, p<10<sup>–6</sup> and <10<sup>–15</sup> in HBmem and LBmem, respectively). <italic>α</italic> of distinct clonal lineages was also generally higher in LBmem than in HBmem (median <italic>α</italic>=0.57 vs. <italic>α</italic>=0.18, <xref ref-type="fig" rid="fig5">Figure 5B</xref>), showing that positive selection more frequently preceded expansion of LBmem than HBmem lineages. The higher <italic>α</italic> in the LBmem cluster compared to HBmem was robust to the specifics of the MK analysis (<xref ref-type="supplementary-material" rid="fig5sdata2">Figure 5—source data 2</xref>). The higher <italic>α</italic> for LBmem compared to HBmem implies that a larger fraction of SHMs was positively selected in LBmem clonal lineages before their expansion. 
This excess of advantageous SHMs in ancestors of LBmem lineages, together with the earlier observation that LBmem lineages can originate from reactivated memory, suggests that reactivation was coupled with new rounds of affinity maturation.</p></sec><sec id="s2-7"><title>Subsequent evolution of LBmem clonal lineages is affected by negative and positive selection</title><p>Next, we considered the effects of selection on HBmem and LBmem clusters following their divergence from their MRCAs, that is, in the subsequent evolution of a clonal lineage leading to the diversity of the observed clonotypes. We calculated the per-site ratio of nonsynonymous to synonymous SHMs (<italic>πNπS</italic>) among those that originated after the MRCA. The <italic>πNπS</italic> of both clusters was <1 (<xref ref-type="fig" rid="fig5">Figure 5C</xref>). This deficit of nonsynonymous SHMs indicates negative selection in the observed part of clonal lineage evolution. The <italic>πNπS</italic> ratio was lower in the LBmem cluster, indicating stronger negative selection.</p><p>To examine the selection affecting these post-MRCA SHMs in more detail, we studied the frequency distribution of SHMs in individual lineages, or their site frequency spectra (SFSs) (<xref ref-type="bibr" rid="bib29">Nielsen, 2005</xref>; <xref ref-type="bibr" rid="bib26">Neher and Hallatschek, 2013</xref>; <xref ref-type="bibr" rid="bib28">Nei and Kumar, 2000</xref>; <xref ref-type="bibr" rid="bib21">Horns et al., 2019</xref>; <xref ref-type="bibr" rid="bib31">Nourmohammad et al., 2019</xref>; <xref ref-type="fig" rid="fig5">Figure 5A</xref>). The SFS reflects the effect of selection on these SHMs. Deleterious SHMs are held back by negative selection, so that their frequency in the lineage remains low. By contrast, positive selection favors the spread of adaptive SHMs, increasing their frequency. Therefore, negative selection biases the SFS toward low frequencies, and positive selection, toward high frequencies. 
For each clonal lineage, we reconstructed the SFS of the SHMs accumulated since divergence from the MRCA (<xref ref-type="fig" rid="fig5">Figure 5A</xref>), and then averaged these SFSs within the HBmem and LBmem clusters. A larger proportion of the LBmem SFS corresponds to high frequencies compared to HBmem (<xref ref-type="fig" rid="fig5">Figure 5D</xref>), indicating weaker negative and/or stronger positive selection in LBmem lineages.</p><p>To distinguish between these selection types, we calculated the proportion of the SFS distribution falling into each frequency bin for nonsynonymous SHMs, and divided it by the same value for synonymous SHMs (normalized <italic>πNπS</italic>; see Materials and methods, <xref ref-type="fig" rid="fig5">Figure 5E</xref>). The inter-cluster differences in the normalized <italic>πNπS</italic> in low-frequency bins were generally reflective of negative selection, while the differences in the high-frequency bins were reflective of positive selection. Normalized <italic>πNπS</italic> was significantly higher in the high-frequency (>60%) bins of SHMs in LBmem clonal lineages. This indicates that for LBmem, those nonsynonymous changes that were not removed by negative selection reached high frequencies more often than in HBmem. In total, these data indicate that a fraction of nonsynonymous mutations accumulated by LBmem lineages were adaptive. We thus observed that reactivation of LBmem lineages is coupled with strengthening of both types of selection: positive on the G-MRCA branch, and both positive and negative during subsequent clonal lineage expansion. This pattern is most likely evidence of new rounds of affinity maturation, which result in the acquisition of new advantageous changes and preserve the resulting BCRs from deleterious ones. 
HBmem, in contrast, evolved more neutrally under weaker negative selection, suggesting absence of antigen challenge during the observation period (<xref ref-type="fig" rid="fig5">Figure 5F</xref>).</p></sec></sec><sec id="s3" sec-type="discussion"><title>Discussion</title><p>Using advanced library preparation technology, we performed a longitudinal study of BCR repertoires of the three main antigen-experienced B cell subsets – memory B cells, plasmablasts, and plasma cells – from peripheral blood of six donors, sampled three times over the course of a year. We analyzed these repertoires from two conceptually different but complementary points of view. First, we compared various repertoire features between the cell subsets, including clonotype stability in time and convergence between individuals. Second, we tracked the most abundant B cell clonal lineages in time and analyzed their cell subset and isotype composition, phylogenetic history, and mode of selection.</p><p>Comparative analysis of the cell subsets revealed significant differences in IGH isotype distribution, rate of SHM, and CDR3 length. IgM clonotypes predominated in the Bmem subset, whereas in ASCs the switched isotypes IgA and IgG together represented >80% of repertoire diversity on average. As expected, classical switched isotypes have higher rates of SHM, and the rate of SHM in ASCs is in general higher than in Bmem. The IgD isotype in Bmem cells showed similarities to IgM, where most IgD clonotypes had low levels of SHM, although there was a fraction of heavily mutated clonotypes. On average, IgD-switched PL and PBL had a comparable level of SHM with IgG- and IgA-expressing ASC clonotypes. Notably, the level of SHM and CDR3 length in PBL on average exceeded that of PL in IgM, IgA, and IgG isotypes. 
We hypothesize that such PBLs with heavily hypermutated BCRs could be the subset of B cell progeny that continue to acquire mutations after optimal affinity has been achieved, while another part of the clonal progeny is committed to a long-lived PL fate and acquires the CD138 marker characteristic of this cell subset (<xref ref-type="bibr" rid="bib14">Garimalla et al., 2019</xref>).</p><p>While different in many aspects, immune-experienced B cell subsets are similar – and concordantly distinct from naive B cells – in terms of IGHV gene segment usage. Moreover, we observed that the correlated enrichment or depletion in V segment usage frequency generally coincides with the level of sequence similarity of the V segments. Most IGHV-3 family members were observed more frequently in antigen-experienced B cells compared to naive subsets in all donors and time points, while most of the other V genes that are well represented in the naive subset decreased in frequency. These differences in V usage frequencies between naive and antigen-experienced B cell subsets have also been reported in several previous studies, even though different FACS gating strategies were used (<xref ref-type="bibr" rid="bib25">Mitsunaga and Snyder, 2020</xref>; <xref ref-type="bibr" rid="bib15">Ghraichy et al., 2021</xref>). Our findings further support the idea that initial recruitment of B cells to the immune response is in many cases determined by the germline-encoded parts of the BCR, presumably CDR1 and CDR2. Previous studies have shown high levels of convergence in IGHV usage between B cell clonotypes specific for particular pathogens or self-antigens (<xref ref-type="bibr" rid="bib32">Peng et al., 2019</xref>; <xref ref-type="bibr" rid="bib13">Galson et al., 2015</xref>; <xref ref-type="bibr" rid="bib2">Bashford-Rogers et al., 2019</xref>).</p><p>We further analyzed the repertoire similarity of cell subsets over time and between individuals. 
As might be expected, the Bmem subset is the most stable over time, showing less repertoire divergence and a greater number of shared clonotypes between sampling time points in the same individuals. Our finding extends the recent observation of Bmem subset stability in elderly donors (<xref ref-type="bibr" rid="bib34">Phad et al., 2022</xref>) to a larger cohort of donors of younger age. Compared to intra-individual sharing, we detected only a very small number of Bmem clonotypes shared between individuals. These shared clonotypes have levels of SHM comparable to those of private ones, suggesting a germinal center-dependent origin. Two recent studies on extra-deep repertoires of bulk peripheral blood B cells reported 1–6% (<xref ref-type="bibr" rid="bib42">Soto et al., 2019</xref>) or ~1% (<xref ref-type="bibr" rid="bib5">Briney et al., 2019</xref>) shared V-CDR3aa-J clonotypes between pairs of unrelated donors, with lower repertoire convergence for class-switched clonotypes shown in the latter study. Using the same method, we similarly measured 0.06% repertoire overlap in the Bmem subset (<xref ref-type="fig" rid="fig2s1">Figure 2—figure supplement 1D</xref>). Complementing the model proposed by Briney et al. – wherein IGH repertoires are initially dissimilar and then homogenize during B cell development before finally becoming highly individualized after immunological exposure – we found a significantly higher number of shared clonotypes between IGH repertoires among the most abundant Bmem clonotypes, indicating functional convergence presumably due to exposure to common environmental antigens. This interpretation is further supported by the higher number of persisting Bmem clonotypes observed among public clonotypes compared to private ones.</p><p>Next, we focused on the most abundant B cell clonal lineages, which are large enough to study the interconnection between cell subsets and phylogenetic features of lineages. In all individuals, the observed clonal lineages clearly fell into two clusters. 
HBmem represents persistent memory with a predominant IgM isotype; such clonal lineages were equally sampled from all time points and rarely included ASC clonotypes. The MRCA of observed clonotypes in HBmem lineages almost matched the predicted germline sequence – and in 14.5% of the lineages, matched completely – indicating that the probability of observing a clonotype from these lineages has no association with the position in that lineage’s phylogeny. Horns and colleagues observed lineages with very similar features to HBmem, which also possessed persistent dynamics against a background of vaccine-responsive lineages and were predominantly composed of the IgM isotype (<xref ref-type="bibr" rid="bib21">Horns et al., 2019</xref>). However, their study was performed on bulk B cells, so there was no possibility to track their relatedness to the Bmem subset. In contrast, the LBmem cluster demonstrates completely different features, with lineages largely composed of ASC clonotypes with switched IgA or IgG isotypes, showing active involvement in ongoing immune response. The MRCA of LBmem lineages differed from the germline sequence by some number of SHMs, and only 1.9% of LBmem lineages had a complete match between the MRCA and the germline sequence. A large G-MRCA distance implies that the observed clonotypes originated from an already-hypermutated ancestor, and that we had therefore sampled clonotypes from a single clade of the lineage phylogeny. Such an effect can be caused by both rapid expansion of the clade and migration of the clade’s clonotypes, diverged in the tissue of residence (<xref ref-type="bibr" rid="bib23">Mandric et al., 2020</xref>). We also observed that most LBmem lineages expanded at T2 or T3 (38 out of 45, >80%) had at least one clonotype detected in the Bmem subset at the previous time point, leading us to conclude that LBmems represent the progeny of reactivated Bmem cells. 
We found one clear example that further supports this idea: a lineage that possesses all features of the HBmem cluster except for one monophyletic clade that is typical of an LBmem lineage. This example of an HBmem-LBmem transition is very similar to reactivated persistent memory, as observed by Hoehn et al. in response to seasonal flu vaccination (<xref ref-type="bibr" rid="bib20">Hoehn et al., 2021</xref>). In addition, Phad et al. have recently demonstrated clonal relatedness of the emerging PBL to the persistent Bmem lineages in longitudinal immune repertoire profiling of aged healthy donors (<xref ref-type="bibr" rid="bib34">Phad et al., 2022</xref>). Thus, it can be assumed that at least some of the observed LBmem lineages are the progeny of the persistent memory represented by HBmem lineages.</p><p>Our analysis of the selection mode in HBmem and LBmem lineages supported our assumptions. We showed that both lineage types experienced positive selection on the branch from the germline sequence to the MRCA of the observed clonotypes – as expected, assuming that primary B cell activation is followed by affinity maturation associated with clonal lineage expansion. However, the pressure of positive selection was stronger in LBmem lineages than in HBmem. In addition, we detected an excess of sites under positive selection in the post-MRCA evolution of LBmem lineages as well. This leads us to the hypothesis that LBmem cells underwent additional rounds of affinity maturation after reactivation. Hoehn et al. did not study the mode of selection in their reactivated lineages, but some clonotypes were sampled from germinal centers, supporting the involvement of affinity maturation in the process of memory reactivation. In subsequent evolution after the MRCA, we detected negative selection in both types of lineages – but again, stronger in LBmem. 
This excess negative selection in LBmem lineages can be considered a signature of the purging of deleterious BCR variants from the clonal lineage during affinity maturation.</p><p>In the present study, we focused on the three major antigen-experienced cell subsets as defined by a set of cell surface markers; further validation using more advanced techniques for cell phenotyping (e.g., scRNA-seq) is desirable. Using peripheral blood as the source of cell samples excludes from the analysis cells resident in tissues such as bone marrow niches, the main source of long-lived PL cells in humans. Also, we focused our analysis on the most expanded clonotypes and clonal lineages. These aspects limit our findings to some extent, meaning that they may reflect only part of the complex picture of B cell immunity in a normal state. The number of donors we studied was relatively small, and the cohort combined donors differing in allergy status. Nevertheless, the dataset was large enough for our observations to reach statistical significance and to be reproducible among donors irrespective of their health conditions. We observed no evidence that allergy status affects the structure of our data (Appendix 1), which allowed us to generalize our observations to the whole cohort. We did not observe much direct evidence of memory reactivation and new rounds of affinity maturation: the reactivation process was clearly detected in only one clonal lineage (<xref ref-type="fig" rid="fig4">Figure 4F</xref>). However, this interpretation of the data is supported by the whole set of indirect evidence, including the large G-MRCA distance and close relatedness of LBmem clonotypes, the presence of Bmem clonotypes prior to LBmem expansion, and the different modes of natural selection in the HBmem and LBmem clusters. 
Our hypothesis is also supported by recent studies (<xref ref-type="bibr" rid="bib20">Hoehn et al., 2021</xref>; <xref ref-type="bibr" rid="bib34">Phad et al., 2022</xref>).</p><p>Thus, in this work, we performed a detailed longitudinal analysis of BCR repertoires from immune-experienced B cell subsets from donors without severe pathologies, and from these data, we have produced a framework for the comprehensive analysis of selection in BCR clonal lineages. Our results demonstrate the interconnection of B cell subsets at a clonal level, B cell memory convergence in unrelated donors, and the long-term persistence of memory-enriched clonal lineages in peripheral blood. Signs of positive selection were detected in both memory- and ASC-dominated B cell lineages. Together, the results of our evolutionary analysis of B cell clonal lineages coupled with B cell subset annotation suggest that the reactivation of pre-existing memory B cells is accompanied by new rounds of affinity maturation.</p></sec><sec id="s4" sec-type="materials|methods"><title>Materials and methods</title><table-wrap id="keyresource" position="anchor"><label>Key resources table</label><table frame="hsides" rules="groups"><thead><tr><th align="left" valign="bottom">Reagent type (species) or resource</th><th align="left" valign="bottom">Designation</th><th align="left" valign="bottom">Source or reference</th><th align="left" valign="bottom">Identifiers</th><th align="left" valign="bottom">Additional information</th></tr></thead><tbody><tr><td align="left" valign="bottom">Antibody</td><td align="left" valign="bottom">Anti-CD19-APC<break/>(mouse monoclonal)</td><td align="left" valign="bottom">Miltenyi Biotec</td><td align="left" valign="bottom">clone: LT19, cat. 
#:130-091-248</td><td align="left" valign="bottom"> FACS (2 µl per test)</td></tr><tr><td align="left" valign="bottom">Antibody</td><td align="left" valign="bottom">Anti-CD20-VioBlue (mouse monoclonal)</td><td align="left" valign="bottom">Miltenyi Biotec</td><td align="left" valign="bottom">clone: LT20, cat. #:130-094-167</td><td align="left" valign="bottom"> FACS (2 µl per test)</td></tr><tr><td align="left" valign="bottom">Antibody</td><td align="left" valign="bottom">Anti-CD27-VioBright FITC (mouse monoclonal)</td><td align="left" valign="bottom">Miltenyi Biotec</td><td align="left" valign="bottom">clone: M-T271, cat. #:130-104-845</td><td align="left" valign="bottom"> FACS (2 µl per test)</td></tr><tr><td align="left" valign="bottom">Antibody</td><td align="left" valign="bottom">Anti-CD138-PE-Vio770 (mouse monoclonal)</td><td align="left" valign="bottom">Miltenyi Biotec</td><td align="left" valign="bottom">clone: 44F9, cat. #:130-099-292</td><td align="left" valign="bottom"> FACS (2 µl per test)</td></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">MIGEC</td><td align="left" valign="bottom"><xref ref-type="bibr" rid="bib41">Shugay et al., 2014</xref></td><td align="left" valign="bottom">v1.2.7</td><td align="left" valign="bottom"/></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">MiXCR</td><td align="left" valign="bottom"><xref ref-type="bibr" rid="bib3">Bolotin et al., 2015</xref></td><td align="left" valign="bottom">v3.0.10</td><td align="left" valign="bottom"/></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">TIgGER</td><td align="left" valign="bottom"><xref ref-type="bibr" rid="bib11">Gadala-Maria et al., 2015</xref></td><td align="left" valign="bottom">v3.0.10</td><td align="left" valign="bottom"/></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">Change-O</td><td 
align="left" valign="bottom"><xref ref-type="bibr" rid="bib19">Gupta et al., 2015</xref></td><td align="left" valign="bottom">v0.4.4</td><td align="left" valign="bottom"/></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">edgeR</td><td align="left" valign="bottom"><xref ref-type="bibr" rid="bib36">Robinson et al., 2010</xref></td><td align="left" valign="bottom">v0.4.4</td><td align="left" valign="bottom"/></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">MUSCLE</td><td align="left" valign="bottom"><xref ref-type="bibr" rid="bib10">Edgar, 2004</xref></td><td align="left" valign="bottom">v3.8.31</td><td align="left" valign="bottom"/></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">RAxML</td><td align="left" valign="bottom"><xref ref-type="bibr" rid="bib43">Stamatakis, 2014</xref></td><td align="left" valign="bottom">v8.2.11</td><td align="left" valign="bottom"/></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">R language</td><td align="left" valign="bottom"><xref ref-type="bibr" rid="bib35">R Development Core Team, 2018</xref></td><td align="left" valign="bottom">v4.0.0</td><td align="left" valign="bottom"/></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">ggplot2</td><td align="left" valign="bottom"><xref ref-type="bibr" rid="bib17">Ginestet, 2011</xref></td><td align="left" valign="bottom">v3.3.2</td><td align="left" valign="bottom"/></tr><tr><td align="left" valign="bottom">Software, algorithm</td><td align="left" valign="bottom">ggtree</td><td align="left" valign="bottom"><xref ref-type="bibr" rid="bib49">Yu et al., 2017</xref></td><td align="left" valign="bottom">v2.2.4</td><td align="left" valign="bottom"/></tr></tbody></table></table-wrap><sec id="s4-1"><title>Donors, cells, and time points</title><p>Blood samples from 
six (four males and two females) young and middle-aged donors (23, 27, 27, 33, 33, and 39 years of age) without severe inflammatory diseases, chronic or recent acute infectious diseases, or vaccinations were collected at three time points (T1 – 0, T2 – 1 month, T3 – 12 months); donor details and the number of cells collected for each time point and cell subset are provided in <xref ref-type="table" rid="table1">Table 1</xref>. Four donors suffered from allergic rhinitis to pollen, and two also suffered from food allergy. Informed consent was obtained from each donor. The study was approved by the Local Ethical Committee of Pirogov Russian National Research Medical University, Moscow, Russia (abstract #190, November 18, 2019). At each time point, 18–22 ml of peripheral blood was collected in BD Vacuette tubes with EDTA. PBMCs were isolated using Ficoll gradient density centrifugation. To isolate subpopulations of interest, cells were stained with anti-CD19-APC, anti-CD20-VioBlue, anti-CD27-VioBright FITC, and anti-CD138-PE-Vio770 (all Miltenyi Biotec) in the presence of FcR Blocking Reagent (Miltenyi Biotec) according to the manufacturer’s protocol, and then sorted using FACS (BD FacsAria III, BD Biosciences) into the following populations: Bmem cells (CD19<sup>+</sup> CD20<sup>+</sup> CD27<sup>+</sup> CD138<sup>–</sup>), PBL (CD20<sup>–</sup> CD19<sup>Low/+</sup> CD27<sup>++</sup> CD138<sup>–</sup>), and PL cells (CD20<sup>–</sup> CD19<sup>Low/+</sup> CD27<sup>++</sup> CD138<sup>+</sup>). For each donor at T1, one replicate sample of each cell subpopulation was collected. 
At T2 and T3, two replicate samples were collected (<inline-formula><mml:math id="inf3"><mml:mn>50</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mrow><mml:mn>10</mml:mn></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> to <inline-formula><mml:math id="inf4"><mml:mn>100</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mrow><mml:mn>10</mml:mn></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> Bmem, <inline-formula><mml:math id="inf5"><mml:mn>1</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mrow><mml:mn>10</mml:mn></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> to <inline-formula><mml:math id="inf6"><mml:mn>2</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mrow><mml:mn>10</mml:mn></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> PBL, <inline-formula><mml:math id="inf7"><mml:mn>0.5</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mrow><mml:mn>10</mml:mn></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> to <inline-formula><mml:math id="inf8"><mml:mn>1</mml:mn><mml:mo>×</mml:mo><mml:msup><mml:mrow><mml:mn>10</mml:mn></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula> PL per sample).</p></sec><sec id="s4-2"><title>IGH cDNA libraries and sequencing</title><p>IGH cDNA libraries were prepared as described previously (<xref ref-type="bibr" rid="bib45">Turchaninova et al., 2016</xref>) with several modifications. Briefly, we used a rapid amplification of cDNA ends (RACE) approach with a template-switch effect to introduce 5’ adaptors during cDNA synthesis. These adaptors contained both UMIs, allowing error correction, and sample barcodes (described in <xref ref-type="bibr" rid="bib50">Zvyagin et al., 2017</xref>), allowing us to rule out potential cross-sample contaminations. 
Introducing the 5’ adaptor during the reverse transcription (RT) reaction also provided a universal sequence for annealing the forward PCR primer, which allowed us to avoid using multiplexed forward primers specific for V segments, thereby reducing PCR amplification biases. Multiplexed C-segment-specific primers were used for RT and PCR, allowing us to preserve isotype information. Prepared libraries were then sequenced with an Illumina HiSeq 2000/2500 (paired-end, 2×310 bp).</p></sec><sec id="s4-3"><title>Sequencing data pre-processing and repertoire reconstruction</title><p>Sample demultiplexing by sample barcodes introduced in the 5’ adaptor and UMI-based error correction were performed using MIGEC v1.2.7 software (<xref ref-type="bibr" rid="bib41">Shugay et al., 2014</xref>). For further analysis, we used sequences covered by at least two sequencing reads. Alignment of sequences, V-, D-, J-, and C-segment annotation, and reconstruction of clonal repertoires were accomplished using MiXCR v3.0.10 (<xref ref-type="bibr" rid="bib3">Bolotin et al., 2015</xref>) with prior removal of the primer-originated component of the C-segment. We defined a clonotype as a unique IGH nucleotide sequence spanning from the framework 1 region of the V segment to the end of the J segment, also taking isotype into account. Using TIgGER (<xref ref-type="bibr" rid="bib11">Gadala-Maria et al., 2015</xref>) software, we derived an individual database of V gene alleles for each donor and realigned all sequences for precise detection of hypermutations. 
For analysis of general repertoire characteristics (isotype frequencies, SHM levels, CDR3 length, IGHV gene usage, and repertoire similarity metrics), we used samples covered by at least 0.1 cDNA molecules per cell for Bmem, and at least five cDNA molecules per cell for PBL and PL.</p></sec><sec id="s4-4"><title>Repertoire characteristics analysis</title><p>Isotype frequencies, rate of SHM, and CDR3 lengths were determined using MiXCR v3.0.10 (<xref ref-type="bibr" rid="bib3">Bolotin et al., 2015</xref>). For calculation of background IGHV gene segment usage and number of shared clonotypes, we utilized data derived from <xref ref-type="bibr" rid="bib16">Gidoni et al., 2019</xref> (European Nucleotide Archive accession number ERP108501) representing naive B cell IGH repertoires, where the IGH cDNA libraries were prepared using a 5’-RACE-based protocol similar to that used in the current study.</p><p>We used repertoires containing more than 5000 clonotypes and processed them in the same way as our data. IGHV gene frequencies were calculated as the number of unique clonotypes to which a particular IGHV gene was annotated by MiXCR divided by the total number of clonotypes identified in the sample. To identify IGHV gene segments over- and under-represented in the studied subsets, we utilized the edgeR package v0.4.4 (<xref ref-type="bibr" rid="bib36">Robinson et al., 2010</xref>) with the ‘trended’ dispersion model, using the trimmed mean of M values method for normalization (<xref ref-type="bibr" rid="bib37">Robinson and Oshlack, 2010</xref>). 
To evaluate pairwise similarity between repertoires based on IGHV gene segment frequency distributions, we utilized Jensen-Shannon divergence, calculated using the following formula:<disp-formula id="equ1"><mml:math id="m1"><mml:mrow><mml:mi>J</mml:mi><mml:mi>S</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:mi>P</mml:mi><mml:mo>,</mml:mo><mml:mi>Q</mml:mi><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:munder><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>−</mml:mo><mml:munder><mml:mo 
movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:msub><mml:mi>g</mml:mi><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>p</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>q</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula></p><p>where <italic>P</italic> and <italic>Q</italic> represent the IGHV gene segment frequency distributions of the two repertoires, and <italic>p</italic><sub><italic>i</italic></sub> and <italic>q</italic><sub><italic>i</italic></sub> represent the frequencies of an individual IGHV gene segment <italic>i</italic>. In silico repertoires used for the calculation of background clonal overlap (each repertoire contained 5000 clonotypes) were generated with OLGA software v1.0.2 (<xref ref-type="bibr" rid="bib39">Sethna et al., 2019</xref>) under standard settings utilizing the built-in model. For clonal overlap calculation, we downsized repertoires to a fixed number of clonotypes. For <xref ref-type="fig" rid="fig1">Figure 1B</xref>, the 14,000 most abundant clonotypes were considered in Bmem, 600 in PBL, and 300 in PL. For <xref ref-type="fig" rid="fig1">Figure 1C</xref>, we considered 5000 clonotypes for all cell subsets. Clonotypes with identical CDR3 amino acid sequence and the same IGHV gene segment detected in both analyzed samples were considered shared. 
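The two similarity measures described in this section can be illustrated with a minimal Python sketch, which is not taken from the analysis code of this study. It assumes the standard base-2 Jensen-Shannon divergence, in which the logarithm in the last term is taken of the mixture frequency (p_i + q_i)/2, and it assumes that each clonotype is stored as a (CDR3 amino acid sequence, IGHV gene, count) tuple; all function names and the data layout are hypothetical.

<code language="python" xml:space="preserve">
```python
import math


def js_divergence(p, q):
    """Base-2 Jensen-Shannon divergence between two IGHV usage profiles.

    p and q map IGHV gene names to frequencies that each sum to 1
    (assumed layout; genes missing from a profile count as frequency 0).
    """
    def plogp(x):
        # Convention: 0 * log2(0) is taken as 0.
        return x * math.log2(x) if x > 0 else 0.0

    total = 0.0
    for gene in set(p) | set(q):
        pi = p.get(gene, 0.0)
        qi = q.get(gene, 0.0)
        mixture = (pi + qi) / 2
        total += 0.5 * plogp(pi) + 0.5 * plogp(qi) - plogp(mixture)
    return total


def top_clonotypes(clonotypes, n):
    """Keep the n most abundant clonotypes (downsizing to a fixed number)."""
    return sorted(clonotypes, key=lambda c: c[2], reverse=True)[:n]


def shared_clonotypes(sample_a, sample_b, n):
    """Clonotypes shared by two downsized samples.

    Two clonotypes match when they have an identical CDR3 amino acid
    sequence and the same IGHV gene segment.
    """
    keys_a = {(cdr3, v) for cdr3, v, _ in top_clonotypes(sample_a, n)}
    keys_b = {(cdr3, v) for cdr3, v, _ in top_clonotypes(sample_b, n)}
    return keys_a.intersection(keys_b)
```
</code>

With this definition the divergence is 0 for identical usage profiles and 1 for profiles that have no IGHV gene in common.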
Clonotypes shared between repertoires of at least two individuals were termed public.</p></sec><sec id="s4-5"><title>Assignment of clonal lineages</title><p>Change-O v0.4.4 (<xref ref-type="bibr" rid="bib19">Gupta et al., 2015</xref>) was utilized to assign clonal groups, defined as groups of clonotypes with the same V segment, CDR3 length, and at least 85% similarity in CDR3 nucleotide sequence. Before clonal group assignment, we excluded all clonotypes with counts equal to 1. Clonal groups represent observed subsets of clonal lineages originating from a single BCR ancestor, so for simplicity, we use the term ‘clonal lineages’. To study evolutionary dynamics of clonal lineages, we joined all replicates, three time points (T1, T2, and T3), and cell subsets for each patient into a single dataset and excluded clonotypes that were represented by a single UMI. Phylogenetic analysis was performed on four patients for whom we had samples at all time points, and on clonal lineages containing at least 20 unique clonotypes as in <xref ref-type="bibr" rid="bib31">Nourmohammad et al., 2019</xref>.</p></sec><sec id="s4-6"><title>Clustering of clonal lineages into HBmem and LBmem clusters</title><p>We performed principal component analysis on six scaled variables of clonal lineage composition: fractions of Bmem, PBL, and PL, and fractions of IgM, IgG, and IgA. The IgE isotype was not detected in clonal lineages involved in phylogenetic analysis, so we did not include it as a variable. HBmem and LBmem clusters were defined using the <italic>k</italic>-means clustering algorithm.</p></sec><sec id="s4-7"><title>Metric of persistence of clonal lineages</title><p>We estimated the frequency of a clonal lineage in the repertoire at a given time point as the ratio of the number of unique clonotypes in the clonal lineage detected at this time point to the overall number of unique clonotypes detected at this time point. 
If a clonal lineage was not detected at a given time point, we assigned it a pseudocount frequency, as if a single clonotype of the lineage had been detected at that time point. To estimate the persistence of clonal lineage frequency in the repertoire over time, we defined the persistence metric:<disp-formula id="equ2"><mml:math id="m2"><mml:mrow><mml:mi>P</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>+</mml:mo><mml:mfrac><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>f</mml:mi><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula></p><p>where <inline-formula><mml:math id="inf9"><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi><mml:mi>a</mml:mi><mml:mi>x</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> is the maximum frequency of the clonal lineage across the three time points and <inline-formula><mml:math id="inf10"><mml:msub><mml:mrow><mml:mi>f</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are its frequencies at the other two (<xref ref-type="fig" rid="fig3">Figure 3D</xref>). Persistence is equal to 1 if the frequency remains constant at all three time points. 
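As an illustration only, the persistence metric can be sketched in a few lines of Python. The function names are hypothetical, and the pseudocount is implemented under the assumption that an undetected lineage is treated as one clonotype out of all unique clonotypes detected at that time point.

<code language="python" xml:space="preserve">
```python
def lineage_frequencies(lineage_counts, total_counts):
    """Frequency of one clonal lineage at each of the three time points.

    lineage_counts: unique clonotypes of the lineage per time point.
    total_counts: all unique clonotypes detected per time point.
    An undetected lineage (count 0) gets the pseudocount of one clonotype.
    """
    return [max(n, 1) / total for n, total in zip(lineage_counts, total_counts)]


def persistence(freqs):
    """Persistence P = 1 / (0.5 * (f_max / f_i + f_max / f_j))."""
    if len(freqs) != 3:
        raise ValueError("expected frequencies for exactly three time points")
    f_max = max(freqs)
    others = list(freqs)
    others.remove(f_max)  # drops a single occurrence of the maximum
    f_i, f_j = others
    return 1.0 / (0.5 * (f_max / f_i + f_max / f_j))
```
</code>

For example, a lineage with 50 clonotypes at one time point and none at the other two, in repertoires of 1000 clonotypes each, has frequencies 0.05, 0.001, and 0.001 and a persistence of about 0.02.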
If a clonal lineage was detected just once in the experiment and its frequencies at the other two time points were assigned pseudocounts, the persistence approaches zero.</p></sec><sec id="s4-8"><title>Reconstruction of clonal lineage germline sequence</title><p>We used MiXCR-derived reference V, D, and J segment sequences to reconstruct IGH germline sequences for each clonal lineage, concatenating only those sequence fragments which were present at CDR3 junctions of original MiXCR-defined clonotypes. Thus, random nucleotide insertions were disregarded, making them appear as gaps in the alignment of lineage clonotypes with the germline sequence. We excluded them from all parts of the phylogenetic analysis where germline sequence was required.</p></sec><sec id="s4-9"><title>Reconstruction of clonal lineage phylogeny and MRCA</title><p>For phylogenetic analysis of clonal lineages, we aligned clonotypes with reconstructed germline sequences using MUSCLE v3.8.31 with a gap open penalty of 400 (<xref ref-type="bibr" rid="bib10">Edgar, 2004</xref>). Next, we reconstructed the clonal lineage’s phylogeny with RAxML v8.2.11, using the GTRGAMMA evolutionary model and germline sequence as an outgroup, and computed marginal ancestral states (<xref ref-type="bibr" rid="bib43">Stamatakis, 2014</xref>). The ancestral sequence of the node closest to the root of the tree, represented by the germline sequence, is the MRCA of the sampled clonotypes. It can match the germline sequence or differ from it due to SHM, reflecting the starting point of subsequent evolution of observed clonotypes. This allowed us to distinguish between SHMs fixed in the clonal lineage on the way from the germline sequence to the MRCA (G-MRCA SHMs) vs. polymorphisms within the observed part of the lineage. 
The G-MRCA p-distance in <xref ref-type="fig" rid="fig4">Figure 4B</xref> was measured as a fraction of diverged positions between germline and MRCA sequences.</p></sec><sec id="s4-10"><title>MK test</title><p>The MK test is designed to detect the effects of positive or negative selection on population divergence from another species or its ancestral state (<xref ref-type="bibr" rid="bib24">McDonald and Kreitman, 1991</xref>). It is based on the comparison of ratios of nonsynonymous to synonymous substitutions observed in diverged and polymorphic sites, and estimates the fraction of diverged amino acid substitutions fixed by positive selection:<disp-formula id="equ3"><mml:math id="m3"><mml:mrow><mml:mi>α</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mfrac><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>⋅</mml:mo><mml:mfrac><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula></p><p>where <inline-formula><mml:math id="inf11"><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="inf12"><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> respectively represent nonsynonymous and synonymous polymorphisms, and <inline-formula><mml:math id="inf13"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="inf14"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> respectively represent nonsynonymous and 
synonymous divergences fixed in the population. Under neutral evolution, nonsynonymous and synonymous changes are equally likely to be fixed or appear in the population as polymorphisms, so <inline-formula><mml:math id="inf15"><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:math></inline-formula> and <italic>α</italic>=0. Positive selection favors the fixation of adaptive nonsynonymous changes and increases <inline-formula><mml:math id="inf16"><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:math></inline-formula> relative to <inline-formula><mml:math id="inf17"><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:math></inline-formula>, resulting in <italic>α</italic>&gt;0. Negative selection has the opposite effect and produces <italic>α</italic>&lt;0.</p><p>To detect selection in the origin of clonal lineages, we considered G-MRCA SHMs as divergent changes, and the remaining SHMs in a clonal lineage after the MRCA as polymorphic ones (<xref ref-type="fig" rid="fig5">Figure 5A</xref>). 
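The estimate of α from the four counts can be illustrated with a short Python sketch. The function name is hypothetical and this is not the code used in the study; it simply evaluates the formula above and refuses to return a value when a denominator is zero.

<code language="python" xml:space="preserve">
```python
def mk_alpha(dn, ds, pn, ps):
    """McDonald-Kreitman alpha = 1 - (Pn / Ps) * (Ds / Dn).

    dn, ds: nonsynonymous and synonymous divergent changes
            (here, SHMs on the germline-to-MRCA branch).
    pn, ps: nonsynonymous and synonymous polymorphisms
            (here, SHMs arising after the MRCA).
    """
    if dn == 0 or ps == 0:
        # alpha is undefined without a nonsynonymous divergent change
        # or without a synonymous polymorphism
        raise ValueError("alpha is undefined when Dn or Ps is zero")
    return 1.0 - (pn / ps) * (ds / dn)
```
</code>

For instance, counts of Dn = 20, Ds = 10, Pn = 30, and Ps = 30 give α = 0.5, whereas equal ratios such as Dn = 10, Ds = 5, Pn = 20, and Ps = 10 give the neutral expectation α = 0.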
If we observed different nucleotides in the germline sequence and MRCA at a site that was also polymorphic, we considered it as divergent only if the germline variant was not among the polymorphisms (<xref ref-type="supplementary-material" rid="fig5sdata1">Figure 5—source data 1</xref>, examples of codons <italic>q</italic> and <italic>r</italic>). Codons with unknown germline state were excluded from the MK test (<xref ref-type="supplementary-material" rid="fig5sdata1">Figure 5—source data 1</xref>, example of codon <italic>j</italic>). To perform the MK test on joined HBmem or LBmem cluster variation, we summed variation of all clonal lineages of the same cluster in each category (<inline-formula><mml:math id="inf18"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> <italic>,</italic> <inline-formula><mml:math id="inf19"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> <italic>,</italic> <inline-formula><mml:math id="inf20"><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>). Calculations of <italic>α</italic> of distinct clonal lineages for comparison of its distributions between two clusters were complicated by zero G-MRCA distance in some clonal lineages, mostly belonging to the HBmem cluster. We dealt with this using three approaches, presented in <xref ref-type="supplementary-material" rid="fig5sdata2">Figure 5—source data 2</xref>. 
In the first, we added pseudocounts to <inline-formula><mml:math id="inf21"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="inf22"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> in each clonal lineage, so that for clonal lineages with zero G-MRCA distance, <inline-formula><mml:math id="inf23"><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula>. In the second, we excluded clonal lineages with zero G-MRCA distance from the analysis, still adding pseudocounts to <inline-formula><mml:math id="inf24"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="inf25"><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> in each clonal lineage in cases where the G-MRCA distance consists of just one nonsynonymous or synonymous substitution. In the third, we compared only those clonal lineages that had at least one nonsynonymous and at least one synonymous substitution on the G-MRCA branch. We also calculated the MK test on joined variation for all types of exclusion criteria to check its robustness; however, there is no need to exclude clonal lineages in the case of the joined test (<xref ref-type="supplementary-material" rid="fig5sdata2">Figure 5—source data 2</xref>). 
In the first approach, clonal lineages with zero G-MRCA distance always produced negative <italic>α</italic> and biased median <italic>α</italic> to negative values as well. Medians of <italic>α</italic> in the second and third approaches were more consistent with results of the test on joined variation. However, in the third approach, the filter excluded most of the HBmem cluster, and so in the main text we presented results of the second approach (<xref ref-type="fig" rid="fig5">Figure 5B</xref>). To check the significance of deviation of <italic>α</italic> from neutral expectations, we used Fisher’s exact test as in the original MK pipeline (<xref ref-type="bibr" rid="bib24">McDonald and Kreitman, 1991</xref>).</p></sec><sec id="s4-11"><title>πNπS</title><p>To calculate <italic>πNπS</italic>, we identified SHMs in each clonal lineage relative to the reconstructed MRCA sequence. In multiallelic sites (with multiple SHMs observed, see codon <italic>i</italic> in <xref ref-type="supplementary-material" rid="fig5sdata1">Figure 5—source data 1</xref> as an example), we considered each variant as an independent SHM event. <italic>πN</italic> and <italic>πS</italic> were calculated as the number of nonsynonymous and synonymous SHMs in a clonal lineage, normalized to the number of nonsynonymous and synonymous sites in the MRCA sequence, respectively. 
The resulting <italic>πNπS</italic> value is the ratio between <italic>πN</italic> and <italic>πS</italic>:<disp-formula id="equ4"><mml:math id="m4"><mml:mrow><mml:mi>π</mml:mi><mml:mi>N</mml:mi><mml:mi>π</mml:mi><mml:mi>S</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mi>N</mml:mi><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>:</mml:mo><mml:mfrac><mml:mi>S</mml:mi><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:mfrac></mml:mrow></mml:math></disp-formula></p><p>where <inline-formula><mml:math id="inf26"><mml:mi>N</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="inf27"><mml:mi>S</mml:mi></mml:math></inline-formula> are the numbers of nonsynonymous and synonymous SHMs, respectively, observed in the clonal lineage and <inline-formula><mml:math id="inf28"><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="inf29"><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are the numbers of nonsynonymous and synonymous sites, respectively, in the MRCA sequence of the clonal lineage, calculated as in <xref ref-type="bibr" rid="bib27">Nei and Gojobori, 1986</xref>.</p></sec><sec id="s4-12"><title>Site frequency spectrum</title><p>The site frequency spectrum (SFS) reflects the distribution of SHM frequencies in the clonal lineage. We calculated the frequency of each SHM as the number of unique clonotypes carrying the SHM relative to the overall number of unique clonotypes in the lineage. To visualize SFS, we binned SHM frequencies into 20 equal intervals from 0 to 1 with a step size of 0.05, and counted SHM density in each bin as the number of SHMs in a given frequency bin normalized to the overall number of SHMs detected in the lineage. 
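The binning step can be illustrated with a minimal Python sketch; the function name and input layout are hypothetical. It assumes left-closed bins, with a frequency of exactly 1 placed in the last bin, since the bin edge convention is not specified in the text.

<code language="python" xml:space="preserve">
```python
def site_frequency_spectrum(shm_carrier_counts, n_clonotypes, n_bins=20):
    """SHM density per frequency bin for one clonal lineage.

    shm_carrier_counts: for each SHM, the number of unique clonotypes
        in the lineage that carry it.
    n_clonotypes: number of unique clonotypes in the lineage.
    Returns n_bins densities that sum to 1 (all zeros if no SHM is given).
    """
    density = [0.0] * n_bins
    for carriers in shm_carrier_counts:
        # Integer arithmetic avoids floating-point errors at bin edges:
        # frequency = carriers / n_clonotypes, bin = floor(frequency * n_bins).
        index = min(carriers * n_bins // n_clonotypes, n_bins - 1)
        density[index] += 1.0
    n_shm = len(shm_carrier_counts)
    if n_shm == 0:
        return density
    return [count / n_shm for count in density]
```
</code>

In a lineage of 20 clonotypes with four SHMs carried by 1, 1, 10, and 20 clonotypes, half of the density falls in the 0.05 to 0.10 bin and a quarter each in the 0.50 to 0.55 bin and the last bin.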
To obtain the cluster average SFS, we took the mean of clonal lineages of the same cluster in each frequency bin.</p></sec><sec id="s4-13"><title>Normalized <italic>πNπS</italic> in bins of SHM frequencies</title><p>To compare ratios of nonsynonymous and synonymous SHMs of different frequencies between two clusters, we calculated normalized <italic>πNπS</italic> in bins of SHM frequency. For this purpose, we used a smaller number of frequency bins (five bins with boundaries at 0, 0.2, 0.4, 0.6, 0.8, and 1) to reduce the probability of bins without observed SHMs. To deal with the remaining empty bins, we added pseudocounts to nonsynonymous and synonymous SHMs in each frequency bin. Thus, normalized <italic>πNπS</italic> in the <italic>i</italic>th SHM frequency bin was calculated as follows:<disp-formula id="equ5"><mml:math id="m5"><mml:mrow><mml:mrow><mml:mi mathvariant="normal">n</mml:mi><mml:mi mathvariant="normal">o</mml:mi><mml:mi mathvariant="normal">r</mml:mi><mml:mi mathvariant="normal">m</mml:mi><mml:mi mathvariant="normal">a</mml:mi><mml:mi mathvariant="normal">l</mml:mi><mml:mi mathvariant="normal">i</mml:mi><mml:mi mathvariant="normal">z</mml:mi><mml:mi mathvariant="normal">e</mml:mi><mml:mi mathvariant="normal">d</mml:mi></mml:mrow><mml:mspace width="thinmathspace"/><mml:mspace width="thinmathspace"/><mml:mi>π</mml:mi><mml:mi>N</mml:mi><mml:mi>π</mml:mi><mml:mi>S</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>5</mml:mn></mml:mrow></mml:munderover><mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mn>5</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>N</mml:mi><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:munderover><mml:mo movablelimits="false">∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>5</mml:mn></mml:mrow></mml:munderover><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mn>5</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula></p><p>where <inline-formula><mml:math id="inf30"><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="inf31"><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are the number of nonsynonymous and synonymous SHMs, respectively, in the <italic>i</italic>th frequency bin, <inline-formula><mml:math id="inf32"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>5</mml:mn></mml:mrow></mml:msubsup><mml:mrow/></mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="inf33"><mml:mrow><mml:msubsup><mml:mo>∑</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>5</mml:mn></mml:mrow></mml:msubsup><mml:mrow/></mml:mrow><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are respectively the overall number of nonsynonymous and synonymous SHMs observed in the clonal lineage (the sum of SHMs in all frequency bins), and <inline-formula><mml:math id="inf34"><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> and <inline-formula><mml:math id="inf35"><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> are the number of nonsynonymous and synonymous sites 
respectively in the MRCA sequence of the clonal lineage, calculated as in <xref ref-type="bibr" rid="bib27">Nei and Gojobori, 1986</xref>. To compare distributions of normalized <italic>πNπS</italic> between two clusters of clonal lineages in the five frequency bins, we used the Mann-Whitney test with Bonferroni-Holm multiple testing correction.</p></sec></sec></body><back><sec sec-type="additional-information" id="s5"><title>Additional information</title><fn-group content-type="competing-interest"><title>Competing interests</title><fn fn-type="COI-statement" id="conf1"><p>No competing interests declared</p></fn><fn fn-type="COI-statement" id="conf2"><p>No competing interests declared</p></fn></fn-group><fn-group content-type="author-contribution"><title>Author contributions</title><fn fn-type="con" id="con1"><p>Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing, Participated in study design, Collected cell samples and prepared cDNA libraries, Performed repertoire data processing and analysis</p></fn><fn fn-type="con" id="con2"><p>Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing – original draft, Writing – review and editing, Designed and performed the evolutionary analysis of clonal lineages</p></fn><fn fn-type="con" id="con3"><p>Writing – review and editing, Contributed to FACS cell sorting experimental design, Assisted with FACS cell sorting experiments</p></fn><fn fn-type="con" id="con4"><p>Contributed to FACS cell sorting experimental design, Assisted with FACS cell sorting experiments</p></fn><fn fn-type="con" id="con5"><p>Writing – review and editing, Provided advisory support and contributed to the optimisation of IGH library preparation</p></fn><fn fn-type="con" id="con6"><p>Writing – review and editing, Provided advisory support in bioinformatic analysis and 
result interpretation</p></fn><fn fn-type="con" id="con7"><p>Resources, Funding acquisition, Methodology, Writing – review and editing, Provided advisory support in experimental design and results interpretation</p></fn><fn fn-type="con" id="con8"><p>Conceptualization, Methodology, Writing – original draft, Writing – review and editing, Designed evolutionary analysis of clonal lineages, Provided advisory support in results interpretation</p></fn><fn fn-type="con" id="con9"><p>Conceptualization, Resources, Data curation, Formal analysis, Supervision, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing – original draft, Project administration, Writing – review and editing, Designed the study, Provided advisory support and participated in cell sample collection</p></fn></fn-group><fn-group content-type="ethics-information"><title>Ethics</title><fn fn-type="other"><p>Human subjects: Informed consent was obtained from each donor. The study was approved by the Local Ethical Committee of Pirogov Russian National Research Medical University, Moscow, Russia (abstract #190 18 Nov 2019).</p></fn></fn-group></sec><sec sec-type="supplementary-material" id="s6"><title>Additional files</title><supplementary-material id="mdar"><label>MDAR checklist</label><media xlink:href="elife-79254-mdarchecklist1-v3.pdf" mimetype="application" mime-subtype="pdf"/></supplementary-material></sec><sec sec-type="data-availability" id="s7"><title>Data availability</title><p>Sequencing data have been deposited in the ArrayExpress database (www.ebi.ac.uk/arrayexpress, acc. num. E-MTAB-11193). 
The code for repertoire analysis is available at <ext-link ext-link-type="uri" xlink:href="https://github.com/amikelov/igh_subsets">https://github.com/amikelov/igh_subsets</ext-link>, (copy archived at <ext-link ext-link-type="uri" xlink:href="https://archive.softwareheritage.org/swh:1:dir:9f7235f5bf4b37e567f4ee846c678293408d8318;origin=https://github.com/amikelov/igh_subsets;visit=swh:1:snp:e77774b473d7898bd10c31886c8538560fe27d10;anchor=swh:1:rev:a5cd9753070e319c329ceb4aec8172020ea69138">swh:1:rev:a5cd9753070e319c329ceb4aec8172020ea69138</ext-link>); the code for clonal lineage analysis is available at <ext-link ext-link-type="uri" xlink:href="https://github.com/EvgeniiaAlekseeva/Clonal_group_analysis">https://github.com/EvgeniiaAlekseeva/Clonal_group_analysis</ext-link>, (copy archived at <ext-link ext-link-type="uri" xlink:href="https://archive.softwareheritage.org/swh:1:dir:936f7bfd9a4846fa6b9da064678c42d1c9e08f56;origin=https://github.com/EvgeniiaAlekseeva/Clonal_group_analysis;visit=swh:1:snp:23acf073891062f197bcd8db0d3cfbdc65f8535c;anchor=swh:1:rev:e14ff814643201ea8278fb51b0f118c869e2dfb9">swh:1:rev:e14ff814643201ea8278fb51b0f118c869e2dfb9</ext-link>).</p><p>The following dataset was generated:</p><p><element-citation publication-type="data" specific-use="isSupplementedBy" id="dataset1"><person-group person-group-type="author"><name><surname>Mikelov</surname><given-names>AI</given-names></name><name><surname>Alekseeva</surname><given-names>EI</given-names></name><name><surname>Komech</surname><given-names>EA</given-names></name><name><surname>Staroverov</surname><given-names>DB</given-names></name><name><surname>Turchaninova</surname><given-names>MA</given-names></name><name><surname>Shugay</surname><given-names>M</given-names></name><name><surname>Chudakov</surname><given-names>DM</given-names></name><name><surname>Bazykin</surname><given-names>GA</given-names></name><name><surname>Zvyagin</surname><given-names>IV</given-names></name></person-group><year 
iso-8601-date="2022">2022</year><data-title>Longitudinal full-length IGH repertoire profiling and clonal lineage dynamics in memory B cells, plasmablasts and plasma cells of human peripheral blood</data-title><source>ArrayExpress</source><pub-id pub-id-type="accession" xlink:href="https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-11193">E-MTAB-11193</pub-id></element-citation></p><p>The following previously published dataset was used:</p><p><element-citation publication-type="data" specific-use="references" id="dataset2"><person-group person-group-type="author"><name><surname>Gidoni</surname><given-names>M</given-names></name><name><surname>Snir</surname><given-names>O</given-names></name><name><surname>Peres</surname><given-names>A</given-names></name><name><surname>Polak</surname><given-names>P</given-names></name><name><surname>Lindeman</surname><given-names>I</given-names></name><name><surname>Mikocziova</surname><given-names>I</given-names></name><name><surname>Sarna</surname><given-names>VK</given-names></name><name><surname>Lundin</surname><given-names>KEA</given-names></name><name><surname>Clouser</surname><given-names>C</given-names></name><name><surname>Vigneault</surname><given-names>F</given-names></name><name><surname>Collins</surname><given-names>AM</given-names></name><name><surname>Sollid</surname><given-names>LM</given-names></name><name><surname>Yaari</surname><given-names>G</given-names></name></person-group><year iso-8601-date="2019">2019</year><data-title>Naive B-cell receptor heavy chain repertoire of celiac patients and healthy controls</data-title><source>European Nucleotide Archive</source><pub-id pub-id-type="accession" xlink:href="https://www.ebi.ac.uk/ena/browser/view/PRJEB26509">PRJEB26509</pub-id></element-citation></p></sec><ack id="ack"><title>Acknowledgements</title><p>We are grateful to our donors. 
We are grateful to Alexey Neverov for helpful discussion of inference of selection.</p></ack><ref-list><title>References</title><ref id="bib1"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Akkaya</surname><given-names>M</given-names></name><name><surname>Kwak</surname><given-names>K</given-names></name><name><surname>Pierce</surname><given-names>SK</given-names></name></person-group><year iso-8601-date="2020">2020</year><article-title>B cell memory: building two walls of protection against pathogens</article-title><source>Nature Reviews. Immunology</source><volume>20</volume><fpage>229</fpage><lpage>238</lpage><pub-id pub-id-type="doi">10.1038/s41577-019-0244-2</pub-id><pub-id pub-id-type="pmid">31836872</pub-id></element-citation></ref><ref id="bib2"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bashford-Rogers</surname><given-names>RJM</given-names></name><name><surname>Bergamaschi</surname><given-names>L</given-names></name><name><surname>McKinney</surname><given-names>EF</given-names></name><name><surname>Pombal</surname><given-names>DC</given-names></name><name><surname>Mescia</surname><given-names>F</given-names></name><name><surname>Lee</surname><given-names>JC</given-names></name><name><surname>Thomas</surname><given-names>DC</given-names></name><name><surname>Flint</surname><given-names>SM</given-names></name><name><surname>Kellam</surname><given-names>P</given-names></name><name><surname>Jayne</surname><given-names>DRW</given-names></name><name><surname>Lyons</surname><given-names>PA</given-names></name><name><surname>Smith</surname><given-names>KGC</given-names></name></person-group><year iso-8601-date="2019">2019</year><article-title>Analysis of the B cell receptor repertoire in six immune-mediated diseases</article-title><source>Nature</source><volume>574</volume><fpage>122</fpage><lpage>126</lpage><pub-id 
pub-id-type="doi">10.1038/s41586-019-1595-3</pub-id><pub-id pub-id-type="pmid">31554970</pub-id></element-citation></ref><ref id="bib3"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bolotin</surname><given-names>DA</given-names></name><name><surname>Poslavsky</surname><given-names>S</given-names></name><name><surname>Mitrophanov</surname><given-names>I</given-names></name><name><surname>Shugay</surname><given-names>M</given-names></name><name><surname>Mamedov</surname><given-names>IZ</given-names></name><name><surname>Putintseva</surname><given-names>EV</given-names></name><name><surname>Chudakov</surname><given-names>DM</given-names></name></person-group><year iso-8601-date="2015">2015</year><article-title>MiXCR: software for comprehensive adaptive immunity profiling</article-title><source>Nature Methods</source><volume>12</volume><fpage>380</fpage><lpage>381</lpage><pub-id pub-id-type="doi">10.1038/nmeth.3364</pub-id><pub-id pub-id-type="pmid">25924071</pub-id></element-citation></ref><ref id="bib4"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Bonsignori</surname><given-names>M</given-names></name><name><surname>Liao</surname><given-names>HX</given-names></name><name><surname>Gao</surname><given-names>F</given-names></name><name><surname>Williams</surname><given-names>WB</given-names></name><name><surname>Alam</surname><given-names>SM</given-names></name><name><surname>Montefiori</surname><given-names>DC</given-names></name><name><surname>Haynes</surname><given-names>BF</given-names></name></person-group><year iso-8601-date="2017">2017</year><article-title>Antibody-virus co-evolution in HIV infection: paths for HIV vaccine development</article-title><source>Immunological Reviews</source><volume>275</volume><fpage>145</fpage><lpage>160</lpage><pub-id pub-id-type="doi">10.1111/imr.12509</pub-id><pub-id 
pub-id-type="pmid">28133802</pub-id></element-citation></ref><ref id="bib5"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Briney</surname><given-names>B</given-names></name><name><surname>Inderbitzin</surname><given-names>A</given-names></name><name><surname>Joyce</surname><given-names>C</given-names></name><name><surname>Burton</surname><given-names>DR</given-names></name></person-group><year iso-8601-date="2019">2019</year><article-title>Commonality despite exceptional diversity in the baseline human antibody repertoire</article-title><source>Nature</source><volume>566</volume><fpage>393</fpage><lpage>397</lpage><pub-id pub-id-type="doi">10.1038/s41586-019-0879-y</pub-id><pub-id pub-id-type="pmid">30664748</pub-id></element-citation></ref><ref id="bib6"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Chaudhary</surname><given-names>N</given-names></name><name><surname>Wesemann</surname><given-names>DR</given-names></name></person-group><year iso-8601-date="2018">2018</year><article-title>Analyzing immunoglobulin repertoires</article-title><source>Frontiers in Immunology</source><volume>9</volume><elocation-id>462</elocation-id><pub-id pub-id-type="doi">10.3389/fimmu.2018.00462</pub-id><pub-id pub-id-type="pmid">29593723</pub-id></element-citation></ref><ref id="bib7"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Davydov</surname><given-names>AN</given-names></name><name><surname>Obraztsova</surname><given-names>AS</given-names></name><name><surname>Lebedin</surname><given-names>MY</given-names></name><name><surname>Turchaninova</surname><given-names>MA</given-names></name><name><surname>Staroverov</surname><given-names>DB</given-names></name><name><surname>Merzlyak</surname><given-names>EM</given-names></name><name><surname>Sharonov</surname><given-names>GV</given-names></name><name><surname>Kladova</surname><given-names>O</given-names></name><name><surname>Shugay</surname><given-names>M</given-names></name><name><surname>Britanova</surname><given-names>OV</given-names></name><name><surname>Chudakov</surname><given-names>DM</given-names></name></person-group><year iso-8601-date="2018">2018</year><article-title>Comparative analysis of B-cell receptor repertoires induced by live yellow fever vaccine in young and middle-age donors</article-title><source>Frontiers in Immunology</source><volume>9</volume><elocation-id>2309</elocation-id><pub-id pub-id-type="doi">10.3389/fimmu.2018.02309</pub-id><pub-id pub-id-type="pmid">30356675</pub-id></element-citation></ref><ref id="bib8"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>de Bourcy</surname><given-names>CFA</given-names></name><name><surname>Angel</surname><given-names>CJL</given-names></name><name><surname>Vollmers</surname><given-names>C</given-names></name><name><surname>Dekker</surname><given-names>CL</given-names></name><name><surname>Davis</surname><given-names>MM</given-names></name><name><surname>Quake</surname><given-names>SR</given-names></name></person-group><year iso-8601-date="2017">2017</year><article-title>Phylogenetic analysis of the human antibody repertoire reveals quantitative signatures of immune senescence and 
aging</article-title><source>PNAS</source><volume>114</volume><fpage>1105</fpage><lpage>1110</lpage><pub-id pub-id-type="doi">10.1073/pnas.1617959114</pub-id><pub-id pub-id-type="pmid">28096374</pub-id></element-citation></ref><ref id="bib9"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>De Silva</surname><given-names>NS</given-names></name><name><surname>Klein</surname><given-names>U</given-names></name></person-group><year iso-8601-date="2015">2015</year><article-title>Dynamics of B cells in germinal centres</article-title><source>Nature Reviews. Immunology</source><volume>15</volume><fpage>137</fpage><lpage>148</lpage><pub-id pub-id-type="doi">10.1038/nri3804</pub-id><pub-id pub-id-type="pmid">25656706</pub-id></element-citation></ref><ref id="bib10"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Edgar</surname><given-names>RC</given-names></name></person-group><year iso-8601-date="2004">2004</year><article-title>MUSCLE: multiple sequence alignment with high accuracy and high throughput</article-title><source>Nucleic Acids Research</source><volume>32</volume><fpage>1792</fpage><lpage>1797</lpage><pub-id pub-id-type="doi">10.1093/nar/gkh340</pub-id><pub-id pub-id-type="pmid">15034147</pub-id></element-citation></ref><ref id="bib11"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gadala-Maria</surname><given-names>D</given-names></name><name><surname>Yaari</surname><given-names>G</given-names></name><name><surname>Uduman</surname><given-names>M</given-names></name><name><surname>Kleinstein</surname><given-names>SH</given-names></name></person-group><year iso-8601-date="2015">2015</year><article-title>Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment 
alleles</article-title><source>PNAS</source><volume>112</volume><fpage>E862</fpage><lpage>E870</lpage><pub-id pub-id-type="doi">10.1073/pnas.1417683112</pub-id><pub-id pub-id-type="pmid">25675496</pub-id></element-citation></ref><ref id="bib12"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gaebler</surname><given-names>C</given-names></name><name><surname>Wang</surname><given-names>Z</given-names></name><name><surname>Lorenzi</surname><given-names>JCC</given-names></name><name><surname>Muecksch</surname><given-names>F</given-names></name><name><surname>Finkin</surname><given-names>S</given-names></name><name><surname>Tokuyama</surname><given-names>M</given-names></name><name><surname>Cho</surname><given-names>A</given-names></name><name><surname>Jankovic</surname><given-names>M</given-names></name><name><surname>Schaefer-Babajew</surname><given-names>D</given-names></name><name><surname>Oliveira</surname><given-names>TY</given-names></name><name><surname>Cipolla</surname><given-names>M</given-names></name><name><surname>Viant</surname><given-names>C</given-names></name><name><surname>Barnes</surname><given-names>CO</given-names></name><name><surname>Bram</surname><given-names>Y</given-names></name><name><surname>Breton</surname><given-names>G</given-names></name><name><surname>Hägglöf</surname><given-names>T</given-names></name><name><surname>Mendoza</surname><given-names>P</given-names></name><name><surname>Hurley</surname><given-names>A</given-names></name><name><surname>Turroja</surname><given-names>M</given-names></name><name><surname>Gordon</surname><given-names>K</given-names></name><name><surname>Millard</surname><given-names>KG</given-names></name><name><surname>Ramos</surname><given-names>V</given-names></name><name><surname>Schmidt</surname><given-names>F</given-names></name><name><surname>Weisblum</surname><given-names>Y</given-names></name><name><surname>Jha</surname><given-names>D</given-names></nam
e><name><surname>Tankelevich</surname><given-names>M</given-names></name><name><surname>Martinez-Delgado</surname><given-names>G</given-names></name><name><surname>Yee</surname><given-names>J</given-names></name><name><surname>Patel</surname><given-names>R</given-names></name><name><surname>Dizon</surname><given-names>J</given-names></name><name><surname>Unson-O’Brien</surname><given-names>C</given-names></name><name><surname>Shimeliovich</surname><given-names>I</given-names></name><name><surname>Robbiani</surname><given-names>DF</given-names></name><name><surname>Zhao</surname><given-names>Z</given-names></name><name><surname>Gazumyan</surname><given-names>A</given-names></name><name><surname>Schwartz</surname><given-names>RE</given-names></name><name><surname>Hatziioannou</surname><given-names>T</given-names></name><name><surname>Bjorkman</surname><given-names>PJ</given-names></name><name><surname>Mehandru</surname><given-names>S</given-names></name><name><surname>Bieniasz</surname><given-names>PD</given-names></name><name><surname>Caskey</surname><given-names>M</given-names></name><name><surname>Nussenzweig</surname><given-names>MC</given-names></name></person-group><year iso-8601-date="2021">2021</year><article-title>Evolution of antibody immunity to SARS-CoV-2</article-title><source>Nature</source><volume>591</volume><fpage>639</fpage><lpage>644</lpage><pub-id pub-id-type="doi">10.1038/s41586-021-03207-w</pub-id><pub-id pub-id-type="pmid">33461210</pub-id></element-citation></ref><ref id="bib13"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Galson</surname><given-names>JD</given-names></name><name><surname>Clutterbuck</surname><given-names>EA</given-names></name><name><surname>Trück</surname><given-names>J</given-names></name><name><surname>Ramasamy</surname><given-names>MN</given-names></name><name><surname>Münz</surname><given-names>M</given-names></name><name><surname>Fowler</surname><given-names>A</given-names></name><name><surname>Cerundolo</surname><given-names>V</given-names></name><name><surname>Pollard</surname><given-names>AJ</given-names></name><name><surname>Lunter</surname><given-names>G</given-names></name><name><surname>Kelly</surname><given-names>DF</given-names></name></person-group><year iso-8601-date="2015">2015</year><article-title>BCR repertoire sequencing: different patterns of B-cell activation after two meningococcal vaccines</article-title><source>Immunology and Cell Biology</source><volume>93</volume><fpage>885</fpage><lpage>895</lpage><pub-id pub-id-type="doi">10.1038/icb.2015.57</pub-id><pub-id pub-id-type="pmid">25976772</pub-id></element-citation></ref><ref id="bib14"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Garimalla</surname><given-names>S</given-names></name><name><surname>Nguyen</surname><given-names>DC</given-names></name><name><surname>Halliley</surname><given-names>JL</given-names></name><name><surname>Tipton</surname><given-names>C</given-names></name><name><surname>Rosenberg</surname><given-names>AF</given-names></name><name><surname>Fucile</surname><given-names>CF</given-names></name><name><surname>Saney</surname><given-names>CL</given-names></name><name><surname>Kyu</surname><given-names>S</given-names></name><name><surname>Kaminski</surname><given-names>D</given-names></name><name><surname>Qian</surname><given-names>Y</given-names></name><name><surname>Scheuermann</surname><given-names>RH</given-names></name><name><surname>Gibson</surname><given-names>G</given-names></name><name><surname>Sanz</surname><given-names>I</given-names></name><name><surname>Lee</surname><given-names>FEH</given-names></name></person-group><year iso-8601-date="2019">2019</year><article-title>Differential transcriptome and development of human peripheral plasma cell subsets</article-title><source>JCI Insight</source><volume>4</volume><elocation-id>e126732</elocation-id><pub-id pub-id-type="doi">10.1172/jci.insight.126732</pub-id><pub-id pub-id-type="pmid">31045577</pub-id></element-citation></ref><ref id="bib15"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ghraichy</surname><given-names>M</given-names></name><name><surname>von Niederhäusern</surname><given-names>V</given-names></name><name><surname>Kovaltsuk</surname><given-names>A</given-names></name><name><surname>Galson</surname><given-names>JD</given-names></name><name><surname>Deane</surname><given-names>CM</given-names></name><name><surname>Trück</surname><given-names>J</given-names></name></person-group><year iso-8601-date="2021">2021</year><article-title>Different B cell subpopulations show distinct patterns in their IgH 
repertoire metrics</article-title><source>eLife</source><volume>10</volume><elocation-id>e73111</elocation-id><pub-id pub-id-type="doi">10.7554/eLife.73111</pub-id><pub-id pub-id-type="pmid">34661527</pub-id></element-citation></ref><ref id="bib16"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gidoni</surname><given-names>M</given-names></name><name><surname>Snir</surname><given-names>O</given-names></name><name><surname>Peres</surname><given-names>A</given-names></name><name><surname>Polak</surname><given-names>P</given-names></name><name><surname>Lindeman</surname><given-names>I</given-names></name><name><surname>Mikocziova</surname><given-names>I</given-names></name><name><surname>Sarna</surname><given-names>VK</given-names></name><name><surname>Lundin</surname><given-names>KEA</given-names></name><name><surname>Clouser</surname><given-names>C</given-names></name><name><surname>Vigneault</surname><given-names>F</given-names></name><name><surname>Collins</surname><given-names>AM</given-names></name><name><surname>Sollid</surname><given-names>LM</given-names></name><name><surname>Yaari</surname><given-names>G</given-names></name></person-group><year iso-8601-date="2019">2019</year><article-title>Mosaic deletion patterns of the human antibody heavy chain gene locus shown by Bayesian haplotyping</article-title><source>Nature Communications</source><volume>10</volume><elocation-id>628</elocation-id><pub-id pub-id-type="doi">10.1038/s41467-019-08489-3</pub-id><pub-id pub-id-type="pmid">30733445</pub-id></element-citation></ref><ref id="bib17"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Ginestet</surname><given-names>C</given-names></name></person-group><year iso-8601-date="2011">2011</year><article-title>ggplot2: elegant graphics for data analysis</article-title><source>Journal of the Royal Statistical 
Society</source><volume>174</volume><fpage>245</fpage><lpage>246</lpage><pub-id pub-id-type="doi">10.1111/j.1467-985X.2010.00676_9.x</pub-id></element-citation></ref><ref id="bib18"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Grimsholm</surname><given-names>O</given-names></name><name><surname>Piano Mortari</surname><given-names>E</given-names></name><name><surname>Davydov</surname><given-names>AN</given-names></name><name><surname>Shugay</surname><given-names>M</given-names></name><name><surname>Obraztsova</surname><given-names>AS</given-names></name><name><surname>Bocci</surname><given-names>C</given-names></name><name><surname>Marasco</surname><given-names>E</given-names></name><name><surname>Marcellini</surname><given-names>V</given-names></name><name><surname>Aranburu</surname><given-names>A</given-names></name><name><surname>Farroni</surname><given-names>C</given-names></name><name><surname>Silvestris</surname><given-names>DA</given-names></name><name><surname>Cristofoletti</surname><given-names>C</given-names></name><name><surname>Giorda</surname><given-names>E</given-names></name><name><surname>Scarsella</surname><given-names>M</given-names></name><name><surname>Cascioli</surname><given-names>S</given-names></name><name><surname>Barresi</surname><given-names>S</given-names></name><name><surname>Lougaris</surname><given-names>V</given-names></name><name><surname>Plebani</surname><given-names>A</given-names></name><name><surname>Cancrini</surname><given-names>C</given-names></name><name><surname>Finocchi</surname><given-names>A</given-names></name><name><surname>Moschese</surname><given-names>V</given-names></name><name><surname>Valentini</surname><given-names>D</given-names></name><name><surname>Vallone</surname><given-names>C</given-names></name><name><surname>Signore</surname><given-names>F</given-names></name><name><surname>de 
Vincentiis</surname><given-names>G</given-names></name><name><surname>Zaffina</surname><given-names>S</given-names></name><name><surname>Russo</surname><given-names>G</given-names></name><name><surname>Gallo</surname><given-names>A</given-names></name><name><surname>Locatelli</surname><given-names>F</given-names></name><name><surname>Tozzi</surname><given-names>AE</given-names></name><name><surname>Tartaglia</surname><given-names>M</given-names></name><name><surname>Chudakov</surname><given-names>DM</given-names></name><name><surname>Carsetti</surname><given-names>R</given-names></name></person-group><year iso-8601-date="2020">2020</year><article-title>The interplay between CD27dull and CD27bright B cells ensures the flexibility, stability, and resilience of human B cell memory</article-title><source>Cell Reports</source><volume>30</volume><fpage>2963</fpage><lpage>2977</lpage><pub-id pub-id-type="doi">10.1016/j.celrep.2020.02.022</pub-id><pub-id pub-id-type="pmid">32130900</pub-id></element-citation></ref><ref id="bib19"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Gupta</surname><given-names>NT</given-names></name><name><surname>Vander Heiden</surname><given-names>JA</given-names></name><name><surname>Uduman</surname><given-names>M</given-names></name><name><surname>Gadala-Maria</surname><given-names>D</given-names></name><name><surname>Yaari</surname><given-names>G</given-names></name><name><surname>Kleinstein</surname><given-names>SH</given-names></name></person-group><year iso-8601-date="2015">2015</year><article-title>Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data</article-title><source>Bioinformatics</source><volume>31</volume><fpage>3356</fpage><lpage>3358</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btv359</pub-id><pub-id pub-id-type="pmid">26069265</pub-id></element-citation></ref><ref id="bib20"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Hoehn</surname><given-names>KB</given-names></name><name><surname>Turner</surname><given-names>JS</given-names></name><name><surname>Miller</surname><given-names>FI</given-names></name><name><surname>Jiang</surname><given-names>R</given-names></name><name><surname>Pybus</surname><given-names>OG</given-names></name><name><surname>Ellebedy</surname><given-names>AH</given-names></name><name><surname>Kleinstein</surname><given-names>SH</given-names></name></person-group><year iso-8601-date="2021">2021</year><article-title>Human B cell lineages associated with germinal centers following influenza vaccination are measurably evolving</article-title><source>eLife</source><volume>10</volume><elocation-id>e70873</elocation-id><pub-id pub-id-type="doi">10.7554/eLife.70873</pub-id><pub-id pub-id-type="pmid">34787567</pub-id></element-citation></ref><ref id="bib21"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Horns</surname><given-names>F</given-names></name><name><surname>Vollmers</surname><given-names>C</given-names></name><name><surname>Dekker</surname><given-names>CL</given-names></name><name><surname>Quake</surname><given-names>SR</given-names></name></person-group><year iso-8601-date="2019">2019</year><article-title>Signatures of selection in the human antibody repertoire: selective sweeps, competing subclones, and neutral drift</article-title><source>PNAS</source><volume>116</volume><fpage>1261</fpage><lpage>1266</lpage><pub-id pub-id-type="doi">10.1073/pnas.1814213116</pub-id><pub-id pub-id-type="pmid">30622180</pub-id></element-citation></ref><ref id="bib22"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Laserson</surname><given-names>U</given-names></name><name><surname>Vigneault</surname><given-names>F</given-names></name><name><surname>Gadala-Maria</surname><given-names>D</given-names></name><name><surname>Yaari</surname><given-names>G</given-names></name><name><surname>Uduman</surname><given-names>M</given-names></name><name><surname>Vander Heiden</surname><given-names>JA</given-names></name><name><surname>Kelton</surname><given-names>W</given-names></name><name><surname>Taek Jung</surname><given-names>S</given-names></name><name><surname>Liu</surname><given-names>Y</given-names></name><name><surname>Laserson</surname><given-names>J</given-names></name><name><surname>Chari</surname><given-names>R</given-names></name><name><surname>Lee</surname><given-names>JH</given-names></name><name><surname>Bachelet</surname><given-names>I</given-names></name><name><surname>Hickey</surname><given-names>B</given-names></name><name><surname>Lieberman-Aiden</surname><given-names>E</given-names></name><name><surname>Hanczaruk</surname><given-names>B</given-names></name><name><surname>Simen</surname><given-names>BB</given-names></name><name><surname>Egholm</surname><given-names>M</given-names></name><name><surname>Koller</surname><given-names>D</given-names></name><name><surname>Georgiou</surname><given-names>G</given-names></name><name><surname>Kleinstein</surname><given-names>SH</given-names></name><name><surname>Church</surname><given-names>GM</given-names></name></person-group><year iso-8601-date="2014">2014</year><article-title>High-resolution antibody dynamics of vaccine-induced immune responses</article-title><source>PNAS</source><volume>111</volume><fpage>4928</fpage><lpage>4933</lpage><pub-id pub-id-type="doi">10.1073/pnas.1323862111</pub-id><pub-id pub-id-type="pmid">24639495</pub-id></element-citation></ref><ref id="bib23"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Mandric</surname><given-names>I</given-names></name><name><surname>Rotman</surname><given-names>J</given-names></name><name><surname>Yang</surname><given-names>HT</given-names></name><name><surname>Strauli</surname><given-names>N</given-names></name><name><surname>Montoya</surname><given-names>DJ</given-names></name><name><surname>Van Der Wey</surname><given-names>W</given-names></name><name><surname>Ronas</surname><given-names>JR</given-names></name><name><surname>Statz</surname><given-names>B</given-names></name><name><surname>Yao</surname><given-names>D</given-names></name><name><surname>Petrova</surname><given-names>V</given-names></name><name><surname>Zelikovsky</surname><given-names>A</given-names></name><name><surname>Spreafico</surname><given-names>R</given-names></name><name><surname>Shifman</surname><given-names>S</given-names></name><name><surname>Zaitlen</surname><given-names>N</given-names></name><name><surname>Rossetti</surname><given-names>M</given-names></name><name><surname>Ansel</surname><given-names>KM</given-names></name><name><surname>Eskin</surname><given-names>E</given-names></name><name><surname>Mangul</surname><given-names>S</given-names></name></person-group><year iso-8601-date="2020">2020</year><article-title>Profiling immunoglobulin repertoires across multiple human tissues using RNA sequencing</article-title><source>Nature Communications</source><volume>11</volume><elocation-id>3126</elocation-id><pub-id pub-id-type="doi">10.1038/s41467-020-16857-7</pub-id><pub-id pub-id-type="pmid">32561710</pub-id></element-citation></ref><ref id="bib24"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>McDonald</surname><given-names>JH</given-names></name><name><surname>Kreitman</surname><given-names>M</given-names></name></person-group><year iso-8601-date="1991">1991</year><article-title>Adaptive protein evolution at the Adh locus in 
<italic>Drosophila</italic></article-title><source>Nature</source><volume>351</volume><fpage>652</fpage><lpage>654</lpage><pub-id pub-id-type="doi">10.1038/351652a0</pub-id><pub-id pub-id-type="pmid">1904993</pub-id></element-citation></ref><ref id="bib25"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Mitsunaga</surname><given-names>EM</given-names></name><name><surname>Snyder</surname><given-names>MP</given-names></name></person-group><year iso-8601-date="2020">2020</year><article-title>Deep characterization of the human antibody response to natural infection using longitudinal immune repertoire sequencing</article-title><source>Molecular &amp; Cellular Proteomics</source><volume>19</volume><fpage>278</fpage><lpage>293</lpage><pub-id pub-id-type="doi">10.1074/mcp.RA119.001633</pub-id><pub-id pub-id-type="pmid">31767621</pub-id></element-citation></ref><ref id="bib26"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Neher</surname><given-names>RA</given-names></name><name><surname>Hallatschek</surname><given-names>O</given-names></name></person-group><year iso-8601-date="2013">2013</year><article-title>Genealogies of rapidly adapting populations</article-title><source>PNAS</source><volume>110</volume><fpage>437</fpage><lpage>442</lpage><pub-id pub-id-type="doi">10.1073/pnas.1213113110</pub-id><pub-id pub-id-type="pmid">23269838</pub-id></element-citation></ref><ref id="bib27"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nei</surname><given-names>M</given-names></name><name><surname>Gojobori</surname><given-names>T</given-names></name></person-group><year iso-8601-date="1986">1986</year><article-title>Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions</article-title><source>Molecular Biology and 
Evolution</source><volume>3</volume><fpage>418</fpage><lpage>426</lpage><pub-id pub-id-type="doi">10.1093/oxfordjournals.molbev.a040410</pub-id><pub-id pub-id-type="pmid">3444411</pub-id></element-citation></ref><ref id="bib28"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Nei</surname><given-names>M</given-names></name><name><surname>Kumar</surname><given-names>S</given-names></name></person-group><year iso-8601-date="2000">2000</year><source>Molecular Evolution and Phylogenetics</source><publisher-loc>Oxford ; New York</publisher-loc><publisher-name>Oxford University Press</publisher-name></element-citation></ref><ref id="bib29"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nielsen</surname><given-names>R</given-names></name></person-group><year iso-8601-date="2005">2005</year><article-title>Molecular signatures of natural selection</article-title><source>Annual Review of Genetics</source><volume>39</volume><fpage>197</fpage><lpage>218</lpage><pub-id pub-id-type="doi">10.1146/annurev.genet.39.073003.112420</pub-id><pub-id pub-id-type="pmid">16285858</pub-id></element-citation></ref><ref id="bib30"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Nielsen</surname><given-names>SCA</given-names></name><name><surname>Yang</surname><given-names>F</given-names></name><name><surname>Jackson</surname><given-names>KJL</given-names></name><name><surname>Hoh</surname><given-names>RA</given-names></name><name><surname>Röltgen</surname><given-names>K</given-names></name><name><surname>Jean</surname><given-names>GH</given-names></name><name><surname>Stevens</surname><given-names>BA</given-names></name><name><surname>Lee</surname><given-names>JY</given-names></name><name><surname>Rustagi</surname><given-names>A</given-names></name><name><surname>Rogers</surname><given-names>AJ</given-names></name><name><surname>Powell</surname><given-names>AE</given-names></name><name><surname>Hunter</surname><given-names>M</given-names></name><name><surname>Najeeb</surname><given-names>J</given-names></name><name><surname>Otrelo-Cardoso</surname><given-names>AR</given-names></name><name><surname>Yost</surname><given-names>KE</given-names></name><name><surname>Daniel</surname><given-names>B</given-names></name><name><surname>Nadeau</surname><given-names>KC</given-names></name><name><surname>Chang</surname><given-names>HY</given-names></name><name><surname>Satpathy</surname><given-names>AT</given-names></name><name><surname>Jardetzky</surname><given-names>TS</given-names></name><name><surname>Kim</surname><given-names>PS</given-names></name><name><surname>Wang</surname><given-names>TT</given-names></name><name><surname>Pinsky</surname><given-names>BA</given-names></name><name><surname>Blish</surname><given-names>CA</given-names></name><name><surname>Boyd</surname><given-names>SD</given-names></name></person-group><year iso-8601-date="2020">2020</year><article-title>Human B cell clonal expansion and convergent antibody responses to SARS-CoV-2</article-title><source>Cell Host &amp; Microbe</source><volume>28</volume><fpage>516</fpage><lpage>525</lpage><pub-id 
pub-id-type="doi">10.1016/j.chom.2020.09.002</pub-id><pub-id pub-id-type="pmid">32941787</pub-id></element-citation></ref><ref id="bib31"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Nourmohammad</surname><given-names>A</given-names></name><name><surname>Otwinowski</surname><given-names>J</given-names></name><name><surname>Łuksza</surname><given-names>M</given-names></name><name><surname>Mora</surname><given-names>T</given-names></name><name><surname>Walczak</surname><given-names>AM</given-names></name></person-group><year iso-8601-date="2019">2019</year><article-title>Fierce selection and interference in B-cell repertoire response to chronic HIV-1</article-title><source>Molecular Biology and Evolution</source><volume>36</volume><fpage>2184</fpage><lpage>2194</lpage><pub-id pub-id-type="doi">10.1093/molbev/msz143</pub-id><pub-id pub-id-type="pmid">31209469</pub-id></element-citation></ref><ref id="bib32"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Peng</surname><given-names>W</given-names></name><name><surname>Liu</surname><given-names>S</given-names></name><name><surname>Meng</surname><given-names>J</given-names></name><name><surname>Huang</surname><given-names>J</given-names></name><name><surname>Huang</surname><given-names>J</given-names></name><name><surname>Tang</surname><given-names>D</given-names></name><name><surname>Dai</surname><given-names>Y</given-names></name></person-group><year iso-8601-date="2019">2019</year><article-title>Profiling the TRB and IgH repertoire of patients with H5N6 avian influenza virus infection by high-throughput sequencing</article-title><source>Scientific Reports</source><volume>9</volume><elocation-id>7429</elocation-id><pub-id pub-id-type="doi">10.1038/s41598-019-43648-y</pub-id><pub-id pub-id-type="pmid">31092835</pub-id></element-citation></ref><ref id="bib33"><element-citation 
publication-type="journal"><person-group person-group-type="author"><name><surname>Perez-Andres</surname><given-names>M</given-names></name><name><surname>Paiva</surname><given-names>B</given-names></name><name><surname>Nieto</surname><given-names>WG</given-names></name><name><surname>Caraux</surname><given-names>A</given-names></name><name><surname>Schmitz</surname><given-names>A</given-names></name><name><surname>Almeida</surname><given-names>J</given-names></name><name><surname>Vogt</surname><given-names>RF</given-names></name><name><surname>Marti</surname><given-names>GE</given-names></name><name><surname>Rawstron</surname><given-names>AC</given-names></name><name><surname>Van Zelm</surname><given-names>MC</given-names></name><name><surname>Van Dongen</surname><given-names>JJM</given-names></name><name><surname>Johnsen</surname><given-names>HE</given-names></name><name><surname>Klein</surname><given-names>B</given-names></name><name><surname>Orfao</surname><given-names>A</given-names></name><collab>Primary Health Care Group of Salamanca for the Study of MBL</collab></person-group><year iso-8601-date="2010">2010</year><article-title>Human peripheral blood B-cell compartments: a crossroad in B-cell traffic</article-title><source>Cytometry. 
Part B, Clinical Cytometry</source><volume>78 Suppl 1</volume><fpage>S47</fpage><lpage>S60</lpage><pub-id pub-id-type="doi">10.1002/cyto.b.20547</pub-id><pub-id pub-id-type="pmid">20839338</pub-id></element-citation></ref><ref id="bib34"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Phad</surname><given-names>GE</given-names></name><name><surname>Pinto</surname><given-names>D</given-names></name><name><surname>Foglierini</surname><given-names>M</given-names></name><name><surname>Akhmedov</surname><given-names>M</given-names></name><name><surname>Rossi</surname><given-names>RL</given-names></name><name><surname>Malvicini</surname><given-names>E</given-names></name><name><surname>Cassotta</surname><given-names>A</given-names></name><name><surname>Fregni</surname><given-names>CS</given-names></name><name><surname>Bruno</surname><given-names>L</given-names></name><name><surname>Sallusto</surname><given-names>F</given-names></name><name><surname>Lanzavecchia</surname><given-names>A</given-names></name></person-group><year iso-8601-date="2022">2022</year><article-title>Clonal structure, stability and dynamics of human memory B cells and circulating plasmablasts</article-title><source>Nature Immunology</source><volume>23</volume><fpage>1</fpage><lpage>10</lpage><pub-id pub-id-type="doi">10.1038/s41590-022-01230-1</pub-id><pub-id pub-id-type="pmid">35761085</pub-id></element-citation></ref><ref id="bib35"><element-citation publication-type="software"><person-group person-group-type="author"><collab>R Development Core Team</collab></person-group><year iso-8601-date="2018">2018</year><data-title>R: a language and environment for statistical computing. 
R foundation for statistical computing</data-title><publisher-loc>Vienna, Austria</publisher-loc><publisher-name>R Foundation for Statistical Computing</publisher-name><ext-link ext-link-type="uri" xlink:href="https://www.R-project.org/">https://www.R-project.org/</ext-link></element-citation></ref><ref id="bib36"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Robinson</surname><given-names>MD</given-names></name><name><surname>McCarthy</surname><given-names>DJ</given-names></name><name><surname>Smyth</surname><given-names>GK</given-names></name></person-group><year iso-8601-date="2010">2010</year><article-title>EdgeR: a Bioconductor package for differential expression analysis of digital gene expression data</article-title><source>Bioinformatics</source><volume>26</volume><fpage>139</fpage><lpage>140</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btp616</pub-id><pub-id pub-id-type="pmid">19910308</pub-id></element-citation></ref><ref id="bib37"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Robinson</surname><given-names>MD</given-names></name><name><surname>Oshlack</surname><given-names>A</given-names></name></person-group><year iso-8601-date="2010">2010</year><article-title>A scaling normalization method for differential expression analysis of RNA-Seq data</article-title><source>Genome Biology</source><volume>11</volume><elocation-id>R25</elocation-id><pub-id pub-id-type="doi">10.1186/gb-2010-11-3-r25</pub-id><pub-id pub-id-type="pmid">20196867</pub-id></element-citation></ref><ref id="bib38"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Sakharkar</surname><given-names>M</given-names></name><name><surname>Rappazzo</surname><given-names>CG</given-names></name><name><surname>Wieland-Alter</surname><given-names>WF</given-names></name><name><surname>Hsieh</surname><given-names>CL</given-names></name><name><surname>Wrapp</surname><given-names>D</given-names></name><name><surname>Esterman</surname><given-names>ES</given-names></name><name><surname>Kaku</surname><given-names>CI</given-names></name><name><surname>Wec</surname><given-names>AZ</given-names></name><name><surname>Geoghegan</surname><given-names>JC</given-names></name><name><surname>McLellan</surname><given-names>JS</given-names></name><name><surname>Connor</surname><given-names>RI</given-names></name><name><surname>Wright</surname><given-names>PF</given-names></name><name><surname>Walker</surname><given-names>LM</given-names></name></person-group><year iso-8601-date="2021">2021</year><article-title>Prolonged evolution of the human B cell response to SARS-CoV-2 infection</article-title><source>Science Immunology</source><volume>6</volume><elocation-id>eabg6916</elocation-id><pub-id pub-id-type="doi">10.1126/sciimmunol.abg6916</pub-id><pub-id pub-id-type="pmid">33622975</pub-id></element-citation></ref><ref id="bib39"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Sethna</surname><given-names>Z</given-names></name><name><surname>Elhanati</surname><given-names>Y</given-names></name><name><surname>Callan</surname><given-names>CG</given-names></name><name><surname>Walczak</surname><given-names>AM</given-names></name><name><surname>Mora</surname><given-names>T</given-names></name></person-group><year iso-8601-date="2019">2019</year><article-title>OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and 
motifs</article-title><source>Bioinformatics</source><volume>35</volume><fpage>2974</fpage><lpage>2981</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btz035</pub-id><pub-id pub-id-type="pmid">30657870</pub-id></element-citation></ref><ref id="bib40"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Shah</surname><given-names>HB</given-names></name><name><surname>Smith</surname><given-names>K</given-names></name><name><surname>Wren</surname><given-names>JD</given-names></name><name><surname>Webb</surname><given-names>CF</given-names></name><name><surname>Ballard</surname><given-names>JD</given-names></name><name><surname>Bourn</surname><given-names>RL</given-names></name><name><surname>James</surname><given-names>JA</given-names></name><name><surname>Lang</surname><given-names>ML</given-names></name></person-group><year iso-8601-date="2018">2018</year><article-title>Insights from analysis of human antigen-specific memory B cell repertoires</article-title><source>Frontiers in Immunology</source><volume>9</volume><elocation-id>3064</elocation-id><pub-id pub-id-type="doi">10.3389/fimmu.2018.03064</pub-id><pub-id pub-id-type="pmid">30697210</pub-id></element-citation></ref><ref id="bib41"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Shugay</surname><given-names>M</given-names></name><name><surname>Britanova</surname><given-names>OV</given-names></name><name><surname>Merzlyak</surname><given-names>EM</given-names></name><name><surname>Turchaninova</surname><given-names>MA</given-names></name><name><surname>Mamedov</surname><given-names>IZ</given-names></name><name><surname>Tuganbaev</surname><given-names>TR</given-names></name><name><surname>Bolotin</surname><given-names>DA</given-names></name><name><surname>Staroverov</surname><given-names>DB</given-names></name><name><surname>Putintseva</surname><given-names>EV</given-names></name><name><surname>Plevova</surname><given-names>K</given-names></name><name><surname>Linnemann</surname><given-names>C</given-names></name><name><surname>Shagin</surname><given-names>D</given-names></name><name><surname>Pospisilova</surname><given-names>S</given-names></name><name><surname>Lukyanov</surname><given-names>S</given-names></name><name><surname>Schumacher</surname><given-names>TN</given-names></name><name><surname>Chudakov</surname><given-names>DM</given-names></name></person-group><year iso-8601-date="2014">2014</year><article-title>Towards error-free profiling of immune repertoires</article-title><source>Nature Methods</source><volume>11</volume><fpage>653</fpage><lpage>655</lpage><pub-id pub-id-type="doi">10.1038/nmeth.2960</pub-id><pub-id pub-id-type="pmid">24793455</pub-id></element-citation></ref><ref id="bib42"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Soto</surname><given-names>C</given-names></name><name><surname>Bombardi</surname><given-names>RG</given-names></name><name><surname>Branchizio</surname><given-names>A</given-names></name><name><surname>Kose</surname><given-names>N</given-names></name><name><surname>Matta</surname><given-names>P</given-names></name><name><surname>Sevy</surname><given-names>AM</given-names></name><name><surname>Sinkovits</surname><given-names>RS</given-names></name><name><surname>Gilchuk</surname><given-names>P</given-names></name><name><surname>Finn</surname><given-names>JA</given-names></name><name><surname>Crowe</surname><given-names>JE</given-names></name></person-group><year iso-8601-date="2019">2019</year><article-title>High frequency of shared clonotypes in human B cell receptor repertoires</article-title><source>Nature</source><volume>566</volume><fpage>398</fpage><lpage>402</lpage><pub-id pub-id-type="doi">10.1038/s41586-019-0934-8</pub-id><pub-id pub-id-type="pmid">30760926</pub-id></element-citation></ref><ref id="bib43"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Stamatakis</surname><given-names>A</given-names></name></person-group><year iso-8601-date="2014">2014</year><article-title>RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies</article-title><source>Bioinformatics</source><volume>30</volume><fpage>1312</fpage><lpage>1313</lpage><pub-id pub-id-type="doi">10.1093/bioinformatics/btu033</pub-id><pub-id pub-id-type="pmid">24451623</pub-id></element-citation></ref><ref id="bib44"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Stavnezer</surname><given-names>J</given-names></name><name><surname>Guikema</surname><given-names>JEJ</given-names></name><name><surname>Schrader</surname><given-names>CE</given-names></name></person-group><year 
iso-8601-date="2008">2008</year><article-title>Mechanism and regulation of class switch recombination</article-title><source>Annual Review of Immunology</source><volume>26</volume><fpage>261</fpage><lpage>292</lpage><pub-id pub-id-type="doi">10.1146/annurev.immunol.26.021607.090248</pub-id><pub-id pub-id-type="pmid">18370922</pub-id></element-citation></ref><ref id="bib45"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Turchaninova</surname><given-names>MA</given-names></name><name><surname>Davydov</surname><given-names>A</given-names></name><name><surname>Britanova</surname><given-names>OV</given-names></name><name><surname>Shugay</surname><given-names>M</given-names></name><name><surname>Bikos</surname><given-names>V</given-names></name><name><surname>Egorov</surname><given-names>ES</given-names></name><name><surname>Kirgizova</surname><given-names>VI</given-names></name><name><surname>Merzlyak</surname><given-names>EM</given-names></name><name><surname>Staroverov</surname><given-names>DB</given-names></name><name><surname>Bolotin</surname><given-names>DA</given-names></name><name><surname>Mamedov</surname><given-names>IZ</given-names></name><name><surname>Izraelson</surname><given-names>M</given-names></name><name><surname>Logacheva</surname><given-names>MD</given-names></name><name><surname>Kladova</surname><given-names>O</given-names></name><name><surname>Plevova</surname><given-names>K</given-names></name><name><surname>Pospisilova</surname><given-names>S</given-names></name><name><surname>Chudakov</surname><given-names>DM</given-names></name></person-group><year iso-8601-date="2016">2016</year><article-title>High-Quality full-length immunoglobulin profiling with unique molecular barcoding</article-title><source>Nature Protocols</source><volume>11</volume><fpage>1599</fpage><lpage>1616</lpage><pub-id pub-id-type="doi">10.1038/nprot.2016.093</pub-id><pub-id 
pub-id-type="pmid">27490633</pub-id></element-citation></ref><ref id="bib46"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Vidarsson</surname><given-names>G</given-names></name><name><surname>Dekkers</surname><given-names>G</given-names></name><name><surname>Rispens</surname><given-names>T</given-names></name></person-group><year iso-8601-date="2014">2014</year><article-title>IgG subclasses and allotypes: from structure to effector functions</article-title><source>Frontiers in Immunology</source><volume>5</volume><elocation-id>520</elocation-id><pub-id pub-id-type="doi">10.3389/fimmu.2014.00520</pub-id><pub-id pub-id-type="pmid">25368619</pub-id></element-citation></ref><ref id="bib47"><element-citation publication-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname><given-names>YC</given-names></name><name><surname>Kipling</surname><given-names>D</given-names></name><name><surname>Leong</surname><given-names>HS</given-names></name><name><surname>Martin</surname><given-names>V</given-names></name><name><surname>Ademokun</surname><given-names>AA</given-names></name><name><surname>Dunn-Walters</surname><given-names>DK</given-names></name></person-group><year iso-8601-date="2010">2010</year><article-title>High-Throughput immunoglobulin repertoire analysis distinguishes between human IgM memory and switched memory B-cell populations</article-title><source>Blood</source><volume>116</volume><fpage>1070</fpage><lpage>1078</lpage><pub-id pub-id-type="doi">10.1182/blood-2010-03-275859</pub-id><pub-id pub-id-type="pmid">20457872</pub-id></element-citation></ref><ref id="bib48"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Yang</surname><given-names>X</given-names></name><name><surname>Wang</surname><given-names>M</given-names></name><name><surname>Wu</surname><given-names>J</given-names></name><name><surname>Shi</surname><given-names>D</given-names></name><name><surname>Zhang</surname><given-names>Y</given-names></name><name><surname>Zeng</surname><given-names>H</given-names></name><name><surname>Zhu</surname><given-names>Y</given-names></name><name><surname>Lan</surname><given-names>C</given-names></name><name><surname>Deng</surname><given-names>Y</given-names></name><name><surname>Guo</surname><given-names>S</given-names></name><name><surname>Xu</surname><given-names>L</given-names></name><name><surname>Ma</surname><given-names>C</given-names></name><name><surname>Zhang</surname><given-names>Y</given-names></name><name><surname>Ou</surname><given-names>J</given-names></name><name><surname>Liu</surname><given-names>CJ</given-names></name><name><surname>Chen</surname><given-names>Y</given-names></name><name><surname>Wang</surname><given-names>Q</given-names></name><name><surname>Xie</surname><given-names>W</given-names></name><name><surname>Guan</surname><given-names>J</given-names></name><name><surname>Ding</surname><given-names>J</given-names></name><name><surname>Wang</surname><given-names>Z</given-names></name><name><surname>Chang</surname><given-names>C</given-names></name><name><surname>Yang</surname><given-names>W</given-names></name><name><surname>Zhang</surname><given-names>H</given-names></name><name><surname>Chen</surname><given-names>J</given-names></name><name><surname>Qin</surname><given-names>L</given-names></name><name><surname>Zhou</surname><given-names>H</given-names></name><name><surname>Bei</surname><given-names>JX</given-names></name><name><surname>Wei</surname><given-names>L</given-names></name><name><surname>Cao</surname><given-names>G</given-names></name><name><surname>Yu</surname><given-names>X</given-names></name>
<name><surname>Zhang</surname><given-names>Z</given-names></name></person-group><year iso-8601-date="2021">2021</year><article-title>Large-Scale analysis of 2,152 Ig-seq datasets reveals key features of B cell biology and the antibody repertoire</article-title><source>Cell Reports</source><volume>35</volume><elocation-id>109110</elocation-id><pub-id pub-id-type="doi">10.1016/j.celrep.2021.109110</pub-id><pub-id pub-id-type="pmid">33979623</pub-id></element-citation></ref><ref id="bib49"><element-citation publication-type="book"><person-group person-group-type="author"><name><surname>Yu</surname><given-names>G</given-names></name><name><surname>Smith</surname><given-names>DK</given-names></name><name><surname>Zhu</surname><given-names>H</given-names></name><name><surname>Guan</surname><given-names>Y</given-names></name><name><surname>Tommy</surname><given-names>TYL</given-names></name></person-group><year iso-8601-date="2017">2017</year><chapter-title>Ggtree: package for visualization and annotation of phylogenetic trees with their covariates and other associated data</chapter-title><person-group person-group-type="editor"><name><surname>McInerny</surname><given-names>Greg</given-names></name></person-group><source>Methods in Ecology and Evolution</source><publisher-name>British Ecological Society</publisher-name><fpage>28</fpage><lpage>36</lpage><pub-id pub-id-type="doi">10.1111/2041-210X.12628</pub-id></element-citation></ref><ref id="bib50"><element-citation publication-type="journal"><person-group 
person-group-type="author"><name><surname>Zvyagin</surname><given-names>IV</given-names></name><name><surname>Mamedov</surname><given-names>IZ</given-names></name><name><surname>Tatarinova</surname><given-names>OV</given-names></name><name><surname>Komech</surname><given-names>EA</given-names></name><name><surname>Kurnikova</surname><given-names>EE</given-names></name><name><surname>Boyakova</surname><given-names>EV</given-names></name><name><surname>Brilliantova</surname><given-names>V</given-names></name><name><surname>Shelikhova</surname><given-names>LN</given-names></name><name><surname>Balashov</surname><given-names>DN</given-names></name><name><surname>Shugay</surname><given-names>M</given-names></name><name><surname>Sycheva</surname><given-names>AL</given-names></name><name><surname>Kasatskaya</surname><given-names>SA</given-names></name><name><surname>Lebedev</surname><given-names>YB</given-names></name><name><surname>Maschan</surname><given-names>AA</given-names></name><name><surname>Maschan</surname><given-names>MA</given-names></name><name><surname>Chudakov</surname><given-names>DM</given-names></name></person-group><year iso-8601-date="2017">2017</year><article-title>Tracking T-cell immune reconstitution after TCRαβ/CD19-depleted hematopoietic cells transplantation in children</article-title><source>Leukemia</source><volume>31</volume><fpage>1145</fpage><lpage>1153</lpage><pub-id pub-id-type="doi">10.1038/leu.2016.321</pub-id><pub-id pub-id-type="pmid">27811849</pub-id></element-citation></ref></ref-list><app-group><app id="appendix-1"><title>Appendix 1</title><p>Four donors (D01, IM, AT, MRK) in our cohort had allergic rhinitis (AR) to pollen. In this note we investigated the influence of donor allergy status on IGH repertoire characteristics described in our study. 
Peripheral blood samples were collected at three time points: two before the pollination season (T1 – March 2017, T3 – March 2018), and one at the peak of the birch pollination season (T2 – May 2017). Both total and specific serum IgE levels were elevated in most AR donors, whereas IgE levels in healthy donors were below clinical thresholds (measured with IMMULITE 2000, Siemens). The number of shared clonotypes between donors was not affected by allergy status (<xref ref-type="fig" rid="app1fig1">Appendix 1—figure 1</xref>). Focusing on IgE clonotypes, we detected no shared clonotypes or clonal groups in the ASC subsets of AR donors. We also detected no differences in IGHV gene segment usage between donors depending on allergy status (<xref ref-type="fig" rid="app1fig2">Appendix 1—figures 2</xref>–<xref ref-type="fig" rid="app1fig4">4</xref>).</p><fig id="app1fig1" position="float"><label>Appendix 1—figure 1.</label><caption><title>Number of shared clonotypes between repertoires of donors without (<bold>H</bold>) or with allergic rhinitis (AR).</title><p>Each dot represents the number of shared clonotypes between a pair of donors. Groups were compared using the Mann-Whitney U test. ns corresponds to p ≥ 0.05.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/app1-fig1.jpg"/></fig><fig id="app1fig2" position="float"><label>Appendix 1—figure 2.</label><caption><title>Correlation plot of average IGHV gene segment frequencies in the memory B cell subset.</title><p>Bottom left: dot plots of the average IGHV gene segment frequencies for each pair of donors (axes represent average frequencies; each dot represents a particular IGHV segment for the corresponding donors). 
Top right: Pearson correlation of the average IGHV gene segment frequencies in the memory B cell subset between donors. Significance levels are denoted as follows: *=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/app1-fig2.jpg"/></fig><fig id="app1fig3" position="float"><label>Appendix 1—figure 3.</label><caption><title>Correlation plot of average IGHV gene segment frequencies in the plasmablast subset.</title><p>Bottom left: dot plots of the average IGHV gene segment frequencies for each pair of donors (axes represent average frequencies; each dot represents a particular IGHV segment for the corresponding donors). Top right: Pearson correlation of the average IGHV gene segment frequencies in the plasmablast subset between donors. Significance levels are denoted as follows: *=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/app1-fig3.jpg"/></fig><p>We also observed no effect of the pollination season on the dynamics of the most abundant clonal lineages. Clonal lineages of the HBmem cluster had a stable repertoire frequency at all three time points in all donors, suggesting no active involvement in an ongoing immune response. In contrast, LBmem lineages showed features of an active immune response: they were composed mostly of ASC clonotypes and exhibited fluctuations in frequency, reflecting changes in the number of detected clonotypes during the observed time period. However, the observed dynamics of LBmem lineages were not clearly correlated with the pollen season in allergic donors (<xref ref-type="fig" rid="fig3s2">Figure 3—figure supplement 2</xref>).</p><p>Together, these data show that the potential reactivation of allergen-specific B cells, which could be expected during the pollination season in AR donors, cannot be easily detected at the level of bulk repertoire sequencing. 
Thus, it should not affect the repertoire features of the Bmem and ASC subsets or clonal lineage dynamics. Analysis of allergen-responsive clones and clonal groups requires more targeted selection of the sampled B cell subpopulations and sampling time points.</p><fig id="app1fig4" position="float"><label>Appendix 1—figure 4.</label><caption><title>Correlation plot of average IGHV gene segment frequencies in the plasma cell subset.</title><p>Bottom left: dot plots of the average IGHV gene segment frequencies for each pair of donors (axes represent average frequencies; each dot represents a particular IGHV segment for the corresponding donors). Top right: Pearson correlation of the average IGHV gene segment frequencies in the plasma cell subset between donors. Significance levels are denoted as follows: *=p ≤ 0.05, **=p ≤ 0.01, ***=p ≤ 10<sup>–3</sup>.</p></caption><graphic mimetype="image" mime-subtype="jpeg" xlink:href="elife-79254.xml.media/app1-fig4.jpg"/></fig></app></app-group></back><sub-article article-type="editor-report" id="sa0"><front-stub><article-id pub-id-type="doi">10.7554/eLife.79254.sa0</article-id><title-group><article-title>Editor's evaluation</article-title></title-group><contrib-group><contrib contrib-type="author"><name><surname>Kurosaki</surname><given-names>Tomohiro</given-names></name><role specific-use="editor">Reviewing Editor</role><aff><institution-wrap><institution-id institution-id-type="ror">https://ror.org/035t8zc32</institution-id><institution>Osaka University</institution></institution-wrap><country>Japan</country></aff></contrib></contrib-group><related-object id="sa0ro1" object-id-type="id" object-id="10.1101/2021.12.30.474135" link-type="continued-by" xlink:href="https://sciety.org/articles/activity/10.1101/2021.12.30.474135"/></front-stub><body><p>By performing homeostatic longitudinal IgH repertoire analysis of human memory B cells and plasma cells, the authors draw two major unique conclusions: first, a high degree of clonal 
persistence in individual memory B cell subsets with inter-individual convergence in memory B cells and plasma cells; second, reactivation of persisting memory B cells with new rounds of affinity maturation during proliferation and differentiation into plasma cells. These conclusions provide significant insight into how human memory B and plasma cells are generated under homeostatic conditions.</p></body></sub-article><sub-article article-type="decision-letter" id="sa1"><front-stub><article-id pub-id-type="doi">10.7554/eLife.79254.sa1</article-id><title-group><article-title>Decision letter</article-title></title-group><contrib-group content-type="section"><contrib contrib-type="editor"><name><surname>Kurosaki</surname><given-names>Tomohiro</given-names></name><role>Reviewing Editor</role><aff><institution-wrap><institution-id institution-id-type="ror">https://ror.org/035t8zc32</institution-id><institution>Osaka University</institution></institution-wrap><country>Japan</country></aff></contrib></contrib-group><contrib-group><contrib contrib-type="reviewer"><name><surname>Liu</surname><given-names>Wanli</given-names></name><role>Reviewer</role><aff><institution-wrap><institution-id institution-id-type="ror">https://ror.org/03cve4549</institution-id><institution>Tsinghua University</institution></institution-wrap><country>China</country></aff></contrib></contrib-group></front-stub><body><boxed-text id="sa2-box1"><p>Our editorial process produces two outputs: (i) <ext-link ext-link-type="uri" xlink:href="https://sciety.org/articles/activity/10.1101/2021.12.30.474135">public reviews</ext-link> designed to be posted alongside <ext-link ext-link-type="uri" xlink:href="https://www.biorxiv.org/content/10.1101/2021.12.30.474135v2">the preprint</ext-link> for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. 
We also include an acceptance summary that explains what the editors found interesting or important about the work.</p></boxed-text><p><bold>Decision letter after peer review:</bold></p><p>Thank you for submitting your article "Memory persistence and differentiation into antibody-secreting cells accompanied by positive selection in longitudinal BCR repertoires" for consideration by <italic>eLife</italic>. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Betty Diamond as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Wanli Liu (Reviewer #2).</p><p>The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.</p><p>Essential revisions:</p><p>1) Throughout the manuscript, the conclusion of each section will help to catch the findings of this study. Please consider including short summaries at the end of each section.</p><p>(1.1) p3, line 100; p6, line 179; p17, line 459; p19, line 581; Strictly speaking, "Full-length IgH clonal repertoires" is misleading readers. Indeed, 5'-RACE template switch method adopted by the authors enables to sequence full-part of IGHV segment, however, not for full-length IgH which includes IGHC gene. For accurate descriptions, authors should rephrase these sentences.</p><p>(1.2) p4, line 145-147 "The average number of SHMs for IgE clonotypes did not differ significantly between cell subsets but was significantly higher compared to the level of SHM detected for IgM and IgD clonotypes in Bmem."; To compare SHM between IgE and other isotype, it is necessary to compare within the same individual, not between different individuals.</p><p>(1.3) p11 Figure 3A; HBmem and LBmem are separated by PC1 score. What is the biological meaning of this PC1 score? 
That the meaning to divide clonal lineages into two clusters is unclear confuses the interpretation of data to digest.</p><p>(1.4) p13 Figure 4F; From the branching data, I was not sure whether lower memory lineage and upper ASC lineage are same origin because branching point could be one or two mutations and if so, how do authors confirm these two lineages are from same origin?</p><p>(1.5) p14, line 395-; Probability of nonsynonymous or synonymous mutation will be different based on a kind of original amino acid. Did the authors take into consideration this point to evaluate selection pressure?</p><p>2) To calculate background IGHV gene fragments and the number of shared clonotypes, the authors used the data from several database resources. However, the algorithm for normalizing data from all these multiple sources is not elaborated on, and it worried this reviewer that these normalizing criteria shall be uniform for all these databases, otherwise may influence the weight and contribution, and eventually the conclusion.</p><p>3) The authors analyzed the distribution frequency of IGHV gene fragments based on Jensen-Shannon divergence. To further validate the conclusion, the authors are encouraged to check if other the index of distributional similarity, for example, Wasserstein distance, will affect the current results.</p><p>4) The cohort of 6 healthy donors was not large, and importantly it does not seem to this reviewer that the authors provided details of age, sex, chronic disease, etc. All these messages cannot be missed and instead shall discuss how these factors can influence the conclusion of this manuscript.</p><p><italic>Reviewer #1 (Recommendations for the authors):</italic></p><p>Throughout the manuscript, the conclusion of each section will help to catch the findings of this study. 
Please consider including short summaries at the end of each section.</p><p>p3, line 100; p6, line 179; p17, line 459; p19, line 581; Strictly speaking, "Full-length IgH clonal repertoires" is misleading readers. Indeed, 5'-RACE template switch method adopted by the authors enables to sequence full-part of IGHV segment, however, not for full-length IgH which includes IGHC gene. For accurate descriptions, authors should rephrase these sentences.</p><p>p4, line 145-147 "The average number of SHMs for IgE clonotypes did not differ significantly between cell subsets but was significantly higher compared to the level of SHM detected for IgM and IgD clonotypes in Bmem."; To compare SHM between IgE and other isotype, it is necessary to compare within the same individual, not between different individuals.</p><p>p11 Figure 3A; HBmem and LBmem are separated by PC1 score. What is the biological meaning of this PC1 score? That the meaning to divide clonal lineages into two clusters is unclear confuses the interpretation of data to digest.</p><p>p13 Figure 4F; From the branching data, I was not sure whether lower memory lineage and upper ASC lineage are same origin because branching point could be one or two mutations and if so, how do authors confirm these two lineages are from same origin?</p><p>p14, line 395-; Probability of nonsynonymous or synonymous mutation will be different based on a kind of original amino acid. 
Did the authors take into consideration this point to evaluate selection pressure?</p><p>Adding thought of classification of B1-type BCR and B2-type BCR to whole analysis will be interesting and give another layer of understanding of naturally occurring human BCR repertoire.</p><p><italic>Reviewer #2 (Recommendations for the authors):</italic></p><p>There are some points as detailed below that shall be considered before publication.</p><p>1) To calculate background IGHV gene fragments and the number of shared clonotypes, the authors used the data from several database resources. However, the algorithm for normalizing data from all these multiple sources is not elaborated on, and it worried this reviewer that these normalizing criteria shall be uniform for all these databases, otherwise may influence the weight and contribution, and eventually the conclusion.</p><p>2) The authors analyzed the distribution frequency of IGHV gene fragments based on Jensen-Shannon divergence. To further validate the conclusion, the authors are encouraged to check if other the index of distributional similarity, for example, Wasserstein distance, will affect the current results.</p><p>3) The cohort of 6 healthy donors was not large, and importantly it does not seem to this reviewer that the authors provided details of age, sex, chronic disease, etc. All these messages cannot be missed and instead shall discuss how these factors can influence the conclusion of this manuscript.</p></body></sub-article><sub-article article-type="reply" id="sa2"><front-stub><article-id pub-id-type="doi">10.7554/eLife.79254.sa2</article-id><title-group><article-title>Author response</article-title></title-group></front-stub><body><disp-quote content-type="editor-comment"><p>Essential revisions:</p><p>1) Throughout the manuscript, the conclusion of each section will help to catch the findings of this study. 
Please consider including short summaries at the end of each section.</p></disp-quote><p>To address this comment we have added the following conclusions for each section:</p><p>lines 177-179: “These observations highlight the differences in general characteristics of IGH repertoire between the Bmem and ASC subsets, and demonstrate similarity of IGHV gene usage that differs from that in naive B cells”.</p><p>lines 245-247: “Thus the results demonstrate the level of stability of memory B-cell receptor repertoires and extent of clonal sharing in repertoires of unrelated donors, which might be attributed to exposure to common antigens.”</p><p>lines 311-313: “Thus we observed two types of clonal lineages, representing different stages of an immune response: persisting memory with unswitched IgM isotype (HBmem) and responding lineages rapidly increasing in frequency and producing IgG or IgA antibodies (LBmem).”</p><p>lines 370-375: “To summarize, we observed that LBmem lineages had low level of clonotype divergence and large distance of lineage’s ancestor from the germline sequence, assuming their recent origin from a mature clonotype. The temporal dynamics of LBmem, detection of Bmem clonotypes at the time-point prior to the LBmem lineage expansion, and the relationship between HBmem and LBmem on a clonal lineage level suggest that LBmem expansions may result from reactivation of pre-existing memory.”</p><p>lines 431-433: “This excess of advantageous SHMs in ancestors of LBmem lineages together with previous observations that LBmem lineages can originate from reactivated memory suggests that reactivation was coupled with new rounds of affinity maturation.”</p><p>The concluding summary for the last section was already included in the first version (lines 463-469 in the revised version).</p><disp-quote content-type="editor-comment"><p>(1.1) p3, line 100; p6, line 179; p17, line 459; p19, line 581; Strictly speaking, "Full-length IgH clonal repertoires" is misleading readers. 
Indeed, 5'-RACE template switch method adopted by the authors enables to sequence full-part of IGHV segment, however, not for full-length IgH which includes IGHC gene. For accurate descriptions, authors should rephrase these sentences.</p></disp-quote><p>We thank the reviewer for this comment. We have corrected the following phrases (the changes we’ve made are highlighted):</p><list list-type="bullet"><list-item><p>line 86 “…that provides full-length IgH variable region sequences with…”</p></list-item><list-item><p>lines 98-99 “… we obtained IGH clonal repertoires using a 5’-RACE-based protocol”</p></list-item><list-item><p>lines 186-187 “…obtained IGH clonal repertoires by sequencing respective cDNA libraries covering full-length IGH variable domain.”</p></list-item><list-item><p>line 484 “Using advanced library preparation technology, we performed a longitudinal study of BCR repertoires…”</p></list-item><list-item><p>line 630 “IGH cDNA libraries and sequencing”</p></list-item></list><disp-quote content-type="editor-comment"><p>(1.2) p4, line 145-147 "The average number of SHMs for IgE clonotypes did not differ significantly between cell subsets but was significantly higher compared to the level of SHM detected for IgM and IgD clonotypes in Bmem."; To compare SHM between IgE and other isotype, it is necessary to compare within the same individual, not between different individuals.</p></disp-quote><p>The observation regarding IgE clonotypes was the same for all individual repertoires. To clarify this we have added a new supplementary figure (Supplementary Figure S3 - Figure 1—figure supplement 3) illustrating a comparison of the SHM rate in the Bmem subset between isotypes within each individual. IgE clonotypes, when detected, have on average more SHMs than IgM or IgD clonotypes within the same individual repertoire.</p><disp-quote content-type="editor-comment"><p>(1.3) p11 Figure 3A; HBmem and LBmem are separated by PC1 score. 
What is the biological meaning of this PC1 score? That the meaning to divide clonal lineages into two clusters is unclear confuses the interpretation of data to digest.</p></disp-quote><p>To describe the clonal relationships between cell subsets we investigated the cell subset composition of the most abundant clonal lineages. We found that the cell subtypes and isotypes were unequally distributed: some clonal lineages were mostly composed of memory B-cell clonotypes of the non-switched isotype IgM, while the others were largely composed of ASCs and enriched in IgG and IgA clonotypes (Supplementary Figure S6B - Figure 3—figure supplement 1B). To understand the nature of this bimodal distribution and to compare other features of clonal lineages that differ in cellular and isotype composition, we used PCA and k-means clustering to split clonal lineages into two clusters based on these features. As expected, the fractions of the Bmem subset and the IgM isotype were the main contributors to the PC1 score, as shown by the arrows on the PCA plot (Figure 3A), which represent projections of the corresponding variables onto the two-dimensional PCA plane.</p><p>We agree that this logic was not presented clearly enough, so we added a plot with distributions of cell subset and isotype fractions in clonal lineages (Supplementary Figure S6) and modified the narrative in the corresponding paragraph of the text (lines 277-290):</p><p>“First we asked how B cell subsets and isotypes were represented in these most abundant clonal lineages. The clonal lineages were mostly composed of memory B-cell clonotypes of non-switched isotype IgM or were largely composed of ASCs, and enriched in IgG and IgA clonotypes (Supplementary Figure S6B - Figure 3—figure supplement 1B). 
To investigate the nature of such bimodal distribution and perform comparative analysis of these two types of clonal lineages we divided them into two large clusters using k-means clustering algorithm, based on the proportion of represented cell subsets and BCR isotypes (Figure 3A, B, Supplementary Figure S7A - Figure 3—figure supplement 2A). The more abundant HBmem cluster included 138 clonal lineages, and was mostly composed of memory B-cell clonotypes of non-switched isotype IgM. Conversely, the smaller LBmem cluster (52 clonal lineages) was more diverse and largely composed of ASCs, and enriched in IgG and IgA clonotypes. The average size of clonal lineages (<italic>i.e.</italic>, the number of unique clonotypes per lineage) did not differ between the HBmem and LBmem clusters (Supplementary Figure S7B - Figure 3—figure supplement 2B), and both clusters were present in repertoires of all donors (Supplementary Figure S7C - Figure 3—figure supplement 2C).”</p><p>We also now mention the type of clustering in the caption of Figure 3A. The former Supplementary Figure S6 was split in two (now Supplementary Figures S6 and S7 - Figure 3—figure supplement 1 and 2) and now includes panel S6B (Figure 3—figure supplement 1B).</p><disp-quote content-type="editor-comment"><p>(1.4) p13 Figure 4F; From the branching data, I was not sure whether lower memory lineage and upper ASC lineage are same origin because branching point could be one or two mutations and if so, how do authors confirm these two lineages are from same origin?</p></disp-quote><p>To address this question we added Supplementary Figure S8 (Figure 4—figure supplement 1) with alignments of the CDR regions of clonotypes belonging to the clonal lineage from Figure 4F. Clonal lineages were defined as groups of clonotypes with the same V segment, the same CDR3 length, and at least 85% similarity of the CDR3 nucleotide sequence. The clonotypes from the lineage shown in Figure 4F met all of these criteria. 
The phylogeny for each lineage in the manuscript was reconstructed using the nucleotide sequences of clonotypes in the group.</p><p>As can be seen from the CDR3 nucleotide sequence alignment, among the positions that could be attributed to the non-templated region in the D-J junction there is only one position that splits the tree into the two parts belonging to the HBmem and LBmem sublineages. At the same time, the variants in the remaining 12 positions are either distributed between subtrees or conserved. The distribution of nucleotide variants in the other non-templated region (V-D junction) shows the same pattern. At the amino acid level, 5 out of 8 positions of the CDR3 region encoded by non-templated nucleotides are either conserved or have similar physicochemical properties. Most changes that led to the divergence of the ASC sublineage from the rest of the tree occurred in the CDR2 region; however, CDR2s of ASC sublineage clonotypes carry germline amino acid variants in 4 of 7 positions. It can be suggested that 7V and 6Y in CDR2 and CDR3, respectively, were important for differentiation into ASCs.</p><p>The position of the ASC sublineage on the lineage phylogeny also supports the conclusion that the LBmem-like clade has the same origin as the remaining clonotypes. Indeed, the LBmem-like clade arises from the middle of the tree (the 4th node from the root), indicating that it shares many events with the other clades; at the same time, the maximum number of nodes from the root to the most distant leaf is 8. This indicates that the changes distinguishing the clade from the remaining clonotypes accumulated gradually and required several branching events. In contrast, when two subgroups are artificially joined together and the sequences within each subgroup are more related to each other than to the sequences of the other subgroup, one would expect these subgroups to diverge much closer to the root of the phylogeny, with a rapid appearance of changes in clade sequences relative to the root. 
We added these arguments in the main text as well:</p><p>“Position of ASC sublineage on a distant node from the root of the tree indicates gradual accumulation of SHMs, distinguishing the ASC sublineage from the remaining clonotypes. This fact together with the similarity of CDR3 regions of lineage clonotypes (Supplementary Figure S8 - Figure 4—figure supplement 1) give a reason to conclude that the ASC sublineage has the same origin as the remaining part of the tree with features of HBmem cluster.” (lines 365-375).</p><disp-quote content-type="editor-comment"><p>(1.5) p14, line 395-; Probability of nonsynonymous or synonymous mutation will be different based on a kind of original amino acid. Did the authors take into consideration this point to evaluate selection pressure?</p></disp-quote><p>Yes, we have taken this point into account. To calculate the πNπS we normalized the number of nonsynonymous / synonymous SHMs by the number of nonsynonymous / synonymous sites in the original sequence, which reflects the probability that a mutation of a given type occurs (as specified in the Methods section). The number of nonsynonymous / synonymous sites (<inline-formula><mml:math id="sa2m1"><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:math></inline-formula> and <inline-formula><mml:math id="sa2m2"><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:math></inline-formula>) was calculated according to the classical approach of Nei and Gojobori (Nei and Gojobori, 1986). 
In the MK test these normalizing numbers are the same for polymorphisms (<inline-formula><mml:math id="sa2m3"><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:math></inline-formula> and <inline-formula><mml:math id="sa2m4"><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:math></inline-formula>) and divergences (<inline-formula><mml:math id="sa2m5"><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:math></inline-formula> and <inline-formula><mml:math id="sa2m6"><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mstyle></mml:math></inline-formula>), and therefore cancel out:</p><p><inline-formula><mml:math id="sa2m7"><mml:mstyle displaystyle="true" scriptlevel="0"><mml:mrow><mml:mi>α</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn><mml:mtext> 
</mml:mtext><mml:mo>−</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>⋅</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>/</mml:mo></mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mtext> </mml:mtext><mml:mo>=</mml:mo><mml:mtext> </mml:mtext><mml:mn>1</mml:mn><mml:mo>−</mml:mo><mml:mfrac><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub></mml:mfrac><mml:mo>⋅</mml:mo><mml:mfrac><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mi>D</mml:mi><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mfrac></mml:mrow></mml:mstyle></mml:math></inline-formula>.</p><disp-quote content-type="editor-comment"><p>2) To calculate background IGHV gene fragments and the number of shared clonotypes, the authors used the data from several database resources. 
However, the algorithm for normalizing data from all these multiple sources is not elaborated on, and it worried this reviewer that these normalizing criteria shall be uniform for all these databases, otherwise may influence the weight and contribution, and eventually the conclusion.</p></disp-quote><p>For the IGHV gene fragment usage comparison we used the trimmed mean of M-values (TMM) normalization method described in Robinson et al. 2010 (DOI: 10.1186/gb-2010-11-3-r25). TMM normalization primarily accounts for library size variation between samples, while taking into account that a few extremely differentially expressed genes can bias commonly used normalization procedures (e.g. using the total number of reads as the normalization factor). To describe the normalization method used, we have included the passages “… data normalization was performed using trimmed mean of M values method (Robinson, 2010a)” and “…using trimmed mean of M values method for normalization (Robinson et al., 2010a).” at lines 196-197 and line 668, respectively. In addition, we highlighted in the Methods section (lines 659-661) that the data from Gidoni et al. that we used were obtained using a similar cDNA library preparation technique (to clarify this we have added the passage <italic>“</italic> where the IGH cDNA libraries were prepared using 5’RACE-based protocol similar to the protocol used in the current study<italic>”</italic>), so we do not expect substantial technical biases in V usage distributions.</p><p>Regarding the numbers of shared clonotypes, for each comparison we used an equal number of clonotypes (5000): the 5000 most abundant clonotypes from Bmem or naive repertoires, 5000 clonotypes randomly selected from Bmem, or 5000 clonotypes generated <italic>in silico</italic> using OLGA software. 
We have modified the sentence at lines 253-258 to describe this more clearly: “The average number of shared clonotypes between repertoires from pairs of unrelated donors for the most abundant Bmem clonotypes, randomly-selected Bmem clonotypes, most abundant clonotypes from naive repertoires of unrelated donors (from Gidoni et al. 2019), or from synthetic repertoires generated with OLGA software; each repertoire in comparison was represented by a fixed number of clonotypes (5000), either most abundant, randomly selected or generated where indicated.”</p><disp-quote content-type="editor-comment"><p>3) The authors analyzed the distribution frequency of IGHV gene fragments based on Jensen-Shannon divergence. To further validate the conclusion, the authors are encouraged to check if other the index of distributional similarity, for example, Wasserstein distance, will affect the current results.</p></disp-quote><p>To address this point we have tested different metrics of distributional similarity (Jensen-Shannon divergence, 2-Wasserstein distance, correlation of IGHV gene frequencies) for the analysis of IGHV gene usage similarity, and obtained the same results. Below we provide distributions of the 2-Wasserstein distance calculated in the same way as for the Jensen-Shannon divergence (Figure 2A).</p><p>Distance between repertoires obtained at different time-points from the same or different donors as calculated by 2-Wasserstein distance for IGHV gene frequency distribution. N indicates the number of pairs of repertoires in the group. Comparisons in all panels were performed with two-sided Mann-Whitney U test. * = p ≤ 0.05, ** = p ≤ 0.01, *** = p ≤ 10<sup>-3</sup>, **** = p ≤ 10<sup>-4</sup>.</p><disp-quote content-type="editor-comment"><p>4) The cohort of 6 healthy donors was not large, and importantly it does not seem to this reviewer that the authors provided details of age, sex, chronic disease, etc. 
All these messages cannot be missed and instead shall discuss how these factors can influence the conclusion of this manuscript.</p></disp-quote><p>We agree with the reviewer on the importance of all the details describing the donors in our study. Those details, including the health status of our donors, were provided in Supplementary Table S1. To better describe our cohort we moved the table from the supplementary materials to the main text (now named Table 1) and also made several additions to the Methods section and main text (lines 612-613, 615-616; 94, 119-122).</p><p>Four of the six donors had food allergy and/or allergic rhinitis, which is the only known chronic condition in this cohort. To better understand whether this condition may affect our observations, we performed an additional comparative analysis of repertoire structure in donors with and without allergic status. The results of the analysis are combined and provided in the Supplementary Note. In addition, we added a panel to Supplementary Figure S7 (see Supplementary Figure S7D - Figure 3—figure supplement 2D), showing that the expansion of LBmem clonal lineages is not synchronized in donors with allergic conditions and that its dynamics do not correspond to the pollen season. The following phrase has been added at lines 301-302: “The time point of LBmem frequency burst varied between donors (Supplementary Figure S7D - Figure 3—figure supplement 2D).” We also summarized the results of the additional analysis in the Discussion section and discussed the limitations of the cohort size (lines 582-600).</p><disp-quote content-type="editor-comment"><p>Reviewer #1 (Recommendations for the authors):</p><p>Throughout the manuscript, the conclusion of each section will help to catch the findings of this study. 
Please consider including short summaries at the end of each section.</p></disp-quote><p>To address this comment we have added short summaries to each section (see the Essential Revisions section).</p><disp-quote content-type="editor-comment"><p>p3, line 100; p6, line 179; p17, line 459; p19, line 581; Strictly speaking, "Full-length IgH clonal repertoires" is misleading readers. Indeed, 5'-RACE template switch method adopted by the authors enables to sequence full-part of IGHV segment, however, not for full-length IgH which includes IGHC gene. For accurate descriptions, authors should rephrase these sentences.</p></disp-quote><p>The comment is addressed in the Essential Revisions section.</p><disp-quote content-type="editor-comment"><p>p4, line 145-147 "The average number of SHMs for IgE clonotypes did not differ significantly between cell subsets but was significantly higher compared to the level of SHM detected for IgM and IgD clonotypes in Bmem."; To compare SHM between IgE and other isotype, it is necessary to compare within the same individual, not between different individuals.</p></disp-quote><p>The comment is addressed in the Essential Revisions section.</p><disp-quote content-type="editor-comment"><p>p11 Figure 3A; HBmem and LBmem are separated by PC1 score. What is the biological meaning of this PC1 score? 
That the meaning to divide clonal lineages into two clusters is unclear confuses the interpretation of data to digest.</p></disp-quote><p>The comment is addressed in the Essential Revisions section.</p><disp-quote content-type="editor-comment"><p>p13 Figure 4F; From the branching data, I was not sure whether lower memory lineage and upper ASC lineage are same origin because branching point could be one or two mutations and if so, how do authors confirm these two lineages are from same origin?</p></disp-quote><p>The comment is addressed in the Essential Revisions section.</p><disp-quote content-type="editor-comment"><p>p14, line 395-; Probability of nonsynonymous or synonymous mutation will be different based on a kind of original amino acid. Did the authors take into consideration this point to evaluate selection pressure?</p></disp-quote><p>The comment is addressed in the Essential Revisions section.</p><disp-quote content-type="editor-comment"><p>Adding thought of classification of B1-type BCR and B2-type BCR to whole analysis will be interesting and give another layer of understanding of naturally occurring human BCR repertoire.</p></disp-quote><p>We thank the reviewer for the idea. It would be of great interest to add an analysis of B1-/B2-type BCRs to the picture. However, the current ambiguity in B1/B2 characteristics does not allow a clonotype to be assigned to the B1 or B2 subset on the basis of its BCR sequence alone. We believe that such an attempt should be accompanied by sorting of the particular populations, as done by Rodriguez-Zhurbenko et al. 
(DOI: 10.3389/fimmu.2019.00483), where authors sequenced repertoires of sorted human B1-cells to investigate age-associated alterations.</p><disp-quote content-type="editor-comment"><p>Reviewer #2 (Recommendations for the authors):</p><p>There are some points as detailed below that shall be considered before publication.</p><p>1) To calculate background IGHV gene fragments and the number of shared clonotypes, the authors used the data from several database resources. However, the algorithm for normalizing data from all these multiple sources is not elaborated on, and it worried this reviewer that these normalizing criteria shall be uniform for all these databases, otherwise may influence the weight and contribution, and eventually the conclusion.</p></disp-quote><p>The comment is addressed in the Essential Revisions section.</p><disp-quote content-type="editor-comment"><p>2) The authors analyzed the distribution frequency of IGHV gene fragments based on Jensen-Shannon divergence. To further validate the conclusion, the authors are encouraged to check if other the index of distributional similarity, for example, Wasserstein distance, will affect the current results.</p></disp-quote><p>The comment is addressed in the Essential Revisions section.</p><disp-quote content-type="editor-comment"><p>3) The cohort of 6 healthy donors was not large, and importantly it does not seem to this reviewer that the authors provided details of age, sex, chronic disease, etc. All these messages cannot be missed and instead shall discuss how these factors can influence the conclusion of this manuscript.</p></disp-quote><p>The comment is addressed in the Essential Revisions section.</p></body></sub-article></article>