---
authors:
  - givenNames:
      - Emil
      - Bargmann
    familyNames:
      - Madsen
    type: Person
    affiliations:
      - name: Aarhus
        address:
          addressCountry: Denmark
          type: PostalAddress
        type: Organization
  - givenNames:
      - Mathias
      - Wullum
    familyNames:
      - Nielsen
    type: Person
    affiliations:
      - name: Copenhagen
        address:
          addressCountry: Denmark
          type: PostalAddress
        type: Organization
  - givenNames:
      - Josefine
    familyNames:
      - Bjørnholm
    type: Person
    affiliations:
      - name: Aarhus
        address:
          addressCountry: Denmark
          type: PostalAddress
        type: Organization
  - givenNames:
      - Reshma
    familyNames:
      - Jagsi
    type: Person
    affiliations:
      - name: Ann Arbor
        address:
          addressCountry: United States
          type: PostalAddress
        type: Organization
  - givenNames:
      - Jens
      - Peter
    familyNames:
      - Andersen
    type: Person
    emails:
      - jpa@ps.au.dk
    affiliations:
      - name: Aarhus
        address:
          addressCountry: Denmark
          type: PostalAddress
        type: Organization
editors:
  - givenNames:
      - Peter
    familyNames:
      - Rodgers
    type: Person
    affiliations:
      - name: eLife
        address:
          addressCountry: United Kingdom
          type: PostalAddress
        type: Organization
datePublished:
  value: '2022-03-16'
  type: Date
dateReceived:
  value: '2021-12-21'
  type: Date
dateAccepted:
  value: '2022-03-15'
  type: Date
title: >-
  Author-level data confirm the widening gender gap in publishing rates during
  COVID-19
description: >-
  Publications are essential for a successful academic career, and there is
  evidence that the COVID-19 pandemic has amplified existing gender disparities
  in the publishing process. We used longitudinal publication data on 431,207
  authors in four disciplines - basic medicine, biology, chemistry and clinical
  medicine - to quantify the differential impact of COVID-19 on the annual
  publishing rates of men and women. In a difference-in-differences analysis, we
  estimated that the average gender difference in publication productivity
  increased from –0.26 in 2019 to –0.35 in 2020; this corresponds to the output
  of women being 17% lower than the output of men in 2019, and 24% lower in
  2020. An age-group comparison showed a widening gender gap for both
  early-career and mid-career scientists. The increasing gender gap was most
  pronounced among highly productive authors and in biology and clinical
  medicine. Our study demonstrates the importance of reinforcing institutional
  commitments to diversity through policies that support the inclusion and
  retention of women in research.
isPartOf:
  volumeNumber: '11'
  isPartOf:
    title: eLife
    issns:
      - 2050-084X
    identifiers:
      - name: nlm-ta
        propertyID: https://registry.identifiers.org/registry/nlm-ta
        value: elife
        type: PropertyValue
      - name: publisher-id
        propertyID: https://registry.identifiers.org/registry/publisher-id
        value: eLife
        type: PropertyValue
    publisher:
      name: eLife Sciences Publications, Ltd
      type: Organization
    type: Periodical
  type: PublicationVolume
licenses:
  - url: http://creativecommons.org/licenses/by/4.0/
    content:
      - content:
          - 'This article is distributed under the terms of the '
          - content:
              - Creative Commons Attribution License
            target: http://creativecommons.org/licenses/by/4.0/
            type: Link
          - >-
            , which permits unrestricted use and redistribution provided that
            the original author and source are credited.
        type: Paragraph
    type: CreativeWork
keywords:
  - meta-research
  - scientific productivity
  - publishing
  - gender bias
  - COVID-19
  - academia
identifiers:
  - name: publisher-id
    propertyID: https://registry.identifiers.org/registry/publisher-id
    value: '76559'
    type: PropertyValue
  - name: doi
    propertyID: https://registry.identifiers.org/registry/doi
    value: 10.7554/eLife.76559
    type: PropertyValue
  - name: elocation-id
    propertyID: https://registry.identifiers.org/registry/elocation-id
    value: e76559
    type: PropertyValue
fundedBy:
  - identifiers:
      - value: DFF-0133-00165B
        type: PropertyValue
    funders:
      - name: Samfund og Erhverv, Det Frie Forskningsråd
        type: Organization
    type: MonetaryGrant
  - identifiers:
      - value: AUFF-F-2018-7-5
        type: PropertyValue
    funders:
      - name: Aarhus Universitets Forskningsfond
        type: Organization
    type: MonetaryGrant
  - identifiers:
      - value: 9130-00029B
        type: PropertyValue
    funders:
      - name: Independent Research Fund Denmark
        type: Organization
    type: MonetaryGrant
about:
  - name: Computational and Systems Biology
    type: DefinedTerm
  - name: Medicine
    type: DefinedTerm
genre:
  - Feature Article
bibliography: elife-76559.references.bib
---

# Introduction

Gender disparities in academic publishing have widened during the COVID-19 pandemic. The proportion of preprints and manuscript submissions with women as authors has decreased ([@bib9]; [@bib26]; [@bib34]; [@bib44]; [@bib49]), as has the proportion of preprints and published articles with women as either the first author or the senior author ([@bib2]; [@bib25]; [@bib30]; [@bib35]; [@bib42]). Gender gaps in self-reported research activities have also increased ([@bib2]; [@bib25]; [@bib30]; [@bib35]; [@bib42]). However, the longitudinal effects of the pandemic on differences in annual publication outputs remain uncertain. In this study, we used individual-level panel data on the publication activities of 431,207 authors globally to quantify the differential impact of COVID-19 on the publishing rates of women and men.

Research on gender and publication productivity suggests that women (on average) publish fewer articles than men ([@bib31]), although the magnitude of this difference varies by career stage, discipline and country, and has diminished over time ([@bib24]; [@bib43]; [@bib52]). The gender imbalance in publishing rates should be understood in the context of broader disparities in the science system. Structural variables such as employment rank, access to resources, university prestige, appointment type, teaching loads ([@bib15]; [@bib46]) and available time for research ([@bib19]; [@bib29]) all partially explain the observed gender imbalances in publication productivity ([@bib1]; [@bib4]; [@bib51]). In addition, research finds that women scientists (compared to men) tend to span more topics in their research activities, face stricter editorial standards in peer reviewing ([@bib22]), and take on greater shares of parenthood responsibilities ([@bib10]), which also likely perpetuate publishing disparities.

Recent research has identified two primary mechanisms through which the pandemic may have amplified existing disparities in publishing ([@bib27]). First, evidence from national and international surveys indicates that women scientists have taken the lion’s share of the extra childcare and domestic responsibilities imposed by lockdowns of schools and daycares ([@bib11]; [@bib45]; [@bib53]). According to surveys of self-reported research activities, women scientists – especially those with young dependents – have seen notable productivity decreases in the wake of the pandemic ([@bib11]; [@bib36]; [@bib45]). Second, transitions to online teaching during university lockdowns required extra hours of planning and preparation and may have affected women scientists more than men due to observed disparities in average teaching loads ([@bib3]; [@bib15]; [@bib27]; [@bib46]). Survey-based evidence from the United States also indicates that the extra time spent on teaching partially accounts for observed decreases in scientists’ self-reported publication rates ([@bib3]). In clinical medicine, service demands related to care for COVID-19 patients and transitions to virtual care delivery for many others may also have disproportionately affected women, who are more likely to be represented on clinician-educator rather than traditional tenure tracks at medical schools ([@bib32]).

This study is, to our knowledge, the first to quantify the differential impact of COVID-19 on the annual publishing rates of women and men. We used a linked dataset of 431,207 authors and 2,113,108 publications and a difference-in-differences specification to estimate how the gender difference in average publishing rates changed from 2019 to 2020.

We rely on author-disambiguated publication data from Clarivate’s Web of Science, restricting our focus to scientists with >2 publications within basic medicine, biology, chemistry and clinical medicine. We chose these fields because they are well represented in Web of Science (more than 90% of references are included in Web of Science), their primary knowledge production mode is journal publication (unlike, for example, computer science, many fields of engineering, and the humanities), research is comparatively collaborative (although some areas of clinical research have somewhat more authors), and publishing is relatively fast (compared to, for example, the social sciences). Basic medicine, biology and clinical medicine also have some of the highest shares of women scientists in the natural sciences.

We report annual, per-author publishing rates based on both full and fractional counting. Full counting gives the raw sum of all papers published by a scientist in a given year. Fractional counting gives the sum, over all papers a scientist published in a given year, of the reciprocal of each paper's number of authors.
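The two counting schemes can be illustrated with a minimal sketch in base R. The data frame `papers`, its column names and all values below are hypothetical and are not taken from the study data; the sketch only shows how full and fractional counts would be aggregated per author and year under the definitions above.

```r
# Hypothetical paper-level records: one row per (author, paper).
# n_authors is the total number of authors on that paper.
papers <- data.frame(
  author_id = c("a1", "a1", "a1", "a2", "a2"),
  pub_year  = c(2019, 2019, 2020, 2019, 2020),
  n_authors = c(2, 4, 5, 1, 10)
)

# Full counting: every paper counts as one publication.
papers$full <- 1

# Fractional counting: every paper counts as 1 / (number of authors).
papers$frac <- 1 / papers$n_authors

# Sum both measures per author and publication year.
counts <- aggregate(cbind(full, frac) ~ author_id + pub_year,
                    data = papers, FUN = sum)
print(counts)
```

In this toy example, author `a1` has two papers in 2019 with two and four authors, giving a full count of 2 and a fractional count of 1/2 + 1/4 = 0.75.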

# Results

The following results use a main sample consisting of two scientist cohorts, one with first publication year in 2009 or 2010 ("mid-career", n = 137,767) and one with first publication year in 2016 or 2017 ("early-career", n = 293,440). Unless mentioned otherwise, the combined cohort (n = 431,207) is used. A third, counterfactual cohort (n = 276,793) is used to contrast the early-career sample, as a means of estimating the expected attrition in the early-career stage, when a proportion of scientists leave academia. Where an analysis refers to a "treatment", indicated in figures by a dotted line between 2019 and 2020, this denotes the changes in working environments in 2020 due to the COVID-19 pandemic.

## Descriptive results

Our analysis suggests that gender disparities in annual publication outputs have widened during COVID-19. A descriptive comparison of changes in publishing rates in 2020 compared to 2019 ([Figure 1](#fig1)) indicates a 15% decrease in women’s average full- and fractional-count publication output and a 6%–7% decrease in men’s average full- and fractional-count publication output.

chunk: Figure 1.
:::
### Average publication output by gender and year.

Differences are in percentages of average publication rates in 2019. Results are presented for full and fractionalized publication counts. Using full counts of publications, men experience a smaller productivity decrease in 2020 compared to 2019 (6.3%) than women (14.9%). For fractional counts (each paper counts as one divided by its number of authors), the corresponding decreases are 7.1% for men and 14.7% for women. Average publication counts are presented with 99% confidence bounds.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20
# Setup
## Libraries
library(ggplot2)
library(dplyr)
library(broom)
library(stringr)
library(tidyr)
library(scales)
library(patchwork)
library(fixest)
library(jtools)
library(margins)
library(sandwich)
library(rio)

## Settings for plotting
orange <- "#FF851B" # Hex codes for colours
maroon <- "#85144b"
olive <- "#3D9970"
navy <- "#001f3f"

## Plot themes
font_fam <- "Helvetica" # Set preferred font

ggplot2::update_geom_defaults(geom = "text", list(family = font_fam))
ggplot2::update_geom_defaults(geom = "label", list(family = font_fam))

theme_ebm <- function() {
  ggthemes::theme_base(base_size = 12, base_family = font_fam) %+replace% ggplot2::theme(
    plot.background = ggplot2::element_blank(),
    axis.ticks = ggplot2::element_line(lineend = "square"), 
    axis.ticks.length = ggplot2::unit(0.35, "lines"), 
    axis.text = ggplot2::element_text(size = 12)
  )
}

theme_ebm_grid <- function() {
  ggthemes::theme_base(base_size = 12, base_family = font_fam) %+replace% ggplot2::theme(
    plot.background = ggplot2::element_blank(), 
    axis.ticks = ggplot2::element_line(lineend = "square"), 
    axis.ticks.length = ggplot2::unit(0.5, "lines"), 
    axis.text = ggplot2::element_text(size = 12),
    panel.grid.major = element_line(colour = "grey85", linetype = "dotted")
  )
}

theme_ebm_bar <- function() {
  ggthemes::theme_base(base_size = 12, base_family = font_fam) %+replace% ggplot2::theme(
    plot.background = ggplot2::element_blank(),
    panel.background = ggplot2::element_blank(),
    panel.border = ggplot2::element_blank(),
    axis.ticks = ggplot2::element_line(lineend = "square"), 
    axis.ticks.length = ggplot2::unit(0.5, "lines"), 
    axis.text = ggplot2::element_text(size = 12),
    axis.line.x = ggplot2::element_line(),
    panel.grid.major.y = element_line(colour = "grey85", linetype = "dotted")
  )
}

# Figure 1

load("data/fig1_data.Rdata")

# Change labels

fig1_data$gender_category[fig1_data$gender_category == "Female"] <- "Women"
fig1_data$gender_category[fig1_data$gender_category == "Male"] <- "Men"

# Setup labels for percentage change
labels <- fig1_data %>% 
  filter(pub_year == 2020)

growth <- tibble(gender_category = c("Women", "Men"),
                 pub_year = c("2020", "2020"), 
                 mean_growth_full = as.numeric(c((fig1_data[2, 3] - fig1_data[1, 3])/fig1_data[1, 3], (fig1_data[4, 3] - fig1_data[3, 3])/fig1_data[3, 3])),
                 total_growth_full = as.numeric(c((fig1_data[2, 5] - fig1_data[1, 5])/fig1_data[1, 5], (fig1_data[4, 5] - fig1_data[3, 5])/fig1_data[3, 5])),
                 mean_growth_frac = as.numeric(c((fig1_data[2, 4] - fig1_data[1, 4])/fig1_data[1, 4], (fig1_data[4, 4] - fig1_data[3, 4])/fig1_data[3, 4])),
                 total_growth_frac = as.numeric(c((fig1_data[2, 6] - fig1_data[1, 6])/fig1_data[1, 6], (fig1_data[4, 6] - fig1_data[3, 6])/fig1_data[3, 6]))
)

labels <- labels %>% 
  left_join(growth, by = c("gender_category", "pub_year"))

# Full publication counts

t_value <- 2.576 # 99 % (for 99.9 % insert 3.291)

bar_full <- ggplot(fig1_data, aes(x = gender_category, y = mean_pubs_full)) + 
  geom_col(aes(fill = pub_year, color = pub_year), position = position_dodge(0.75), width = 0.5) + 
  geom_linerange(aes(ymin = mean_pubs_full - (t_value * se_pubs_full), ymax = mean_pubs_full + (t_value * se_pubs_full)), position = position_dodge2(0.75)) +
  geom_hline(yintercept = 0, size = 1) +
  geom_text(data = labels, aes(x = gender_category, y = mean_pubs_full, label = paste0(round(mean_growth_full, 3)*100, "%")), nudge_x = 0.2, nudge_y = 0.1) +
  scale_y_continuous(limits = c(0, 2), breaks = seq(0, 1.5, 0.5)) +
  scale_color_manual(values = c("black", "grey30"), guide = "legend") +
  scale_fill_manual(values = c("white", "grey30"), guide = "legend") +
  labs(x = "", y = "Avg. number of publications", title = "Full counts") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.ticks.y = element_blank(),
        legend.background = element_rect(fill = "transparent"),
        legend.position = c(0.2, 0.9),
        legend.direction = "horizontal",
        legend.title = element_blank(),
        plot.title = element_text(hjust = 0.5))

# Fractionalized publication counts

bar_frac <- ggplot(fig1_data, aes(x = gender_category, y = mean_pubs_frac)) + 
  geom_col(aes(fill = pub_year, color = pub_year), position = position_dodge(0.75), width = 0.5) + 
  geom_linerange(aes(ymin = mean_pubs_frac - (t_value * se_pubs_frac), ymax = mean_pubs_frac + (t_value * se_pubs_frac)), position = position_dodge2(0.75)) +
  geom_hline(yintercept = 0, size = 1) +
  geom_text(data = labels, aes(x = gender_category, y = mean_pubs_frac, label = paste0(round(mean_growth_frac, 3)*100, "%")), nudge_x = 0.2, nudge_y = 0.015) +
  scale_color_manual(values = c("black", "grey30")) +
  scale_fill_manual(values = c("white", "grey30")) +
  labs(x = "", y = "", title = "Fractionalized counts") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.ticks.y = element_blank(),
        legend.position = "none",
        legend.direction = "horizontal",
        legend.title = element_blank(),
        plot.title = element_text(hjust = 0.5))

bar_full | bar_frac + plot_layout(guides = "collect")
```
:::
{#fig1}

## Difference-in-differences estimates

[Figure 2](#fig2) displays the dynamic effects of the COVID-19 pandemic and summarizes the main result of the difference-in-differences estimation. As shown in panel A, the gender difference in annual publishing rates remained relatively stable between 2017 and 2019 (implying parallel trends prior to COVID-19), while increasing in 2020. From 2019 to 2020, the average-marginal gender difference increased from –0.260 (corresponding to a 17% lower output for women than for men) to –0.354 (corresponding to a 24% lower output for women than for men) in full-count output. [Figure 2—figure supplement 1](#fig2s1) presents results from a complementary analysis with fractional-count publication output as outcome and shows a change in the average-marginal gender difference from –0.048 (corresponding to a 22% lower output for women than for men) to –0.059 (corresponding to a 27% lower output for women than for men).
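The logic of a two-period difference-in-differences comparison can be sketched on simulated data. Everything below is an illustrative assumption: the panel is simulated, the variable names (`woman`, `post`, `n_pubs`) are hypothetical, and the plain `lm()` call omits the author-clustered standard errors and additional years used in the actual models (see Materials and Methods). The sketch only shows that the interaction coefficient equals the change in the gender gap, and how a counterfactual value for women can be built from men's 2019–2020 change.

```r
set.seed(1)

# Simulated author-year panel (hypothetical data, not the study's).
n <- 2000
panel <- expand.grid(author = seq_len(n), year = c(2019, 2020))
woman_by_author <- rbinom(n, 1, 0.4)
panel$woman <- woman_by_author[panel$author]
panel$post  <- as.integer(panel$year == 2020)

# Assumed data-generating process: a baseline gap that widens in 2020.
mu <- 1.5 - 0.26 * panel$woman - 0.10 * panel$post -
  0.09 * panel$woman * panel$post
panel$n_pubs <- rpois(nrow(panel), lambda = mu)

# OLS with a gender-by-period interaction; the interaction term
# is the difference-in-differences estimate.
fit <- lm(n_pubs ~ woman * post, data = panel)
did_ols <- unname(coef(fit)["woman:post"])

# The same quantity computed from the four group-by-period means.
m <- tapply(panel$n_pubs, list(panel$woman, panel$post), mean)
did_means <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])

# Counterfactual for women in 2020: women's 2019 mean plus
# men's 2019-2020 change. The gap to the observed mean is the effect.
counterfactual_women_2020 <- m["1", "0"] + (m["0", "1"] - m["0", "0"])
att_women <- m["1", "1"] - counterfactual_women_2020

all.equal(did_ols, did_means)
```

Because the model is saturated in the two binary variables, the regression coefficient and the difference of group means coincide up to floating-point error.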

chunk: Figure 2.
:::
### Dynamic effects of the COVID-19 pandemic on women’s and men’s publication productivity.

Panel A shows the estimated average gender difference in publication rates by year. Each point shows the difference between men and women per year, with 99.9% confidence bounds shown as a gray area around the line. From 2019 to 2020, the average-marginal gender difference increased from –0.260 (17% lower output for women) to –0.354 (24% lower output for women). Panel B shows the predicted publishing rates for men and women authors, with solid lines showing the trend per gender, and the dashed, orange line showing the counterfactual trend for women if they had followed the same 2019–2020 trajectory as men (i.e. the trend for men is projected onto the 2019 estimate for women). The difference between the dashed line and the solid line for women in Panel B gives the average treatment effect for women. Point estimates are reported with 99.9% confidence bounds, with robust standard errors clustered at the individual-author level. For information on how average marginal and predicted values are calculated, please refer to Materials and Methods: Difference-in-Differences model.

OLS linear regression with full count as dependent variable. OLS linear regression with fractional count as dependent variable. Poisson regression with full count as dependent variable. Negative binomial regression with full count as dependent variable.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20
load("data/ame_results.Rdata")
load("data/pred_results.Rdata")

# Average marginal effect on full publication counts

ame_linear_full <- ggplot(data = ols_ame_full, aes(x = as.numeric(time_factor), y = AME, group = 1)) +
  geom_vline(xintercept = 4.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "grey50", alpha = 1/4) +
  geom_line(size = 1.5)+
  geom_point(shape = 21, fill = "white", size = 2.5, stroke = 2) +
  #geom_text(aes(x = 3.75 , y = 0.05, label = "Lock-down")) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(-0.6, 0.1), breaks = c(-0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0, 0.1)) +
  labs(x = "", y = "Avg. differences", title = "A. Differences in publication count") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        plot.title = element_text(hjust = 0))

# Average predicted full publication counts

labels <- tibble(group = c("Men", "Women"), # Labels for figure
                 x = c(-3.5, -3.5),
                 y = c(1.25, 0.65))

counterfactual_trend <- tibble(pub_year = c(-1, 0), # Effect if women researchers had same drop as men
                               trend = as.numeric(c(ols_pred_full[9, 1], ols_pred_full[9, 1] + (ols_pred_full[5, 1] - ols_pred_full[4, 1])))
)

predicted_linear_full <- ols_pred_full %>% 
  mutate(gender_text = case_when(gender_num == 0 ~ "Men",
                                 gender_num == 1 ~ "Women")) %>% 
  ggplot(aes(x = as.numeric(time_factor), y = n_pubs)) +
  geom_vline(xintercept = -0.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = ymin, ymax = ymax, fill = gender_text), alpha = 1/2) +
  geom_line(aes(colour = gender_text), size = 1.5) +
  geom_line(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, linetype = "dashed", size = 1) +
  geom_point(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, size = 2.5) +
  geom_point(aes(colour = gender_text, fill = gender_text), shape = 21, fill = "white", size = 2.5, stroke = 2) +
  geom_text(data = labels, aes(x = x, y = y, label = group, colour = group), size = 5) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(0, 2)) +
  scale_fill_manual(values = c(orange, maroon)) +
  scale_color_manual(values = c(orange, maroon)) +
  labs(x = "", y = "Avg. predicted\npublication counts", title = "B. Predicted publication counts") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        legend.position = "none")

# Combined figure

ame_linear_full + predicted_linear_full
```
:::
{#fig2}

chunk: Figure 2—figure supplement 1.
:::
### Corresponding analysis with fractional counts.

Dynamic effects of the COVID-19 pandemic on women’s and men’s fractional-count output. Panel A shows the estimated average gender difference in fractional-count publication rates by year. Panel B shows the predicted fractional-count publishing rates for men and women authors. The dashed, colored line represents the counterfactual trend for women if they had followed the same 2019–2020 trajectory as men. The difference between the dashed line and the solid line for women in Panel B gives the average treatment effect for women. Point estimates are reported with 99.9% confidence bounds, with robust standard errors clustered at the individual-author level.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20

# Average marginal effect on fractional publication counts

ame_linear_frac <- ggplot(data = ols_ame_frac, aes(x = as.numeric(time_factor), y = AME, group = 1)) +
  geom_vline(xintercept = 4.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "grey50", alpha = 1/4) +
  geom_line(size = 1.5)+
  geom_point(shape = 21, fill = "white", size = 2.5, stroke = 2) +
  #geom_text(aes(x = 3.75 , y = 0.05, label = "Lock-down")) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(-0.1, 0.05), breaks = c(-0.1, -0.05, 0, 0.05)) +
  labs(x = "", y = "Avg. differences", title = "A. Differences in publication count") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        plot.title = element_text(hjust = 0))

# Average predicted fractional publication counts

labels <- tibble(group = c("Men", "Women"),
                 x = c(-3.5, -3.5),
                 y = c(.25, 0.1))

counterfactual_trend <- tibble(pub_year = c(-1, 0), # effect if female researchers had same drop as male
                               trend = as.numeric(c(ols_pred_frac[9, 1], ols_pred_frac[9, 1] + (ols_pred_frac[5, 1] - ols_pred_frac[4, 1])))
)

predicted_linear_frac <- ols_pred_frac %>% 
  mutate(gender_text = case_when(gender_num == 0 ~ "Men",
                                 gender_num == 1 ~ "Women")) %>% 
  ggplot(aes(x = as.numeric(time_factor), y = n_pubs_frac)) +
  geom_vline(xintercept = -0.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = ymin, ymax = ymax, fill = gender_text), alpha = 1/2) +
  geom_line(aes(colour = gender_text), size = 1.5) +
  geom_line(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, linetype = "dashed", size = 1) +
  geom_point(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, size = 2.5) +
  geom_point(aes(colour = gender_text, fill = gender_text), shape = 21, fill = "white", size = 2.5, stroke = 2) +
  geom_text(data = labels, aes(x = x, y = y, label = group, colour = group), size = 5) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(0, .3)) +
  scale_fill_manual(values = c(orange, maroon)) +
  scale_color_manual(values = c(orange, maroon)) +
  labs(x = "", y = "Avg. predicted\npublication counts", title = "B. Predicted publication counts") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        legend.position = "none")

# Combined figure
ame_linear_frac + predicted_linear_frac
```
:::
{#fig2s1}

To verify that the change in the gender productivity gap was in fact due to COVID-19 and did not represent a more generic dip in women’s productivity (compared to men’s) during the fifth year of their publication career, we ran a counterfactual analysis for a sample of researchers who published their first paper in 2011. For this sample, we observed a small but consistent annual increase in the marginal gender difference across years (from 2011 to 2015). In this case, the gender difference in productivity increased by 1/20 of a full publication (full count: –0.05, 99% CI: –0.0665; –0.0337) between year four (2014) and year five (2015), amounting to 53% of the treatment effect observed in [Figure 2](#fig2).
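The 53% figure follows from the estimates reported above, as this short arithmetic sketch in R shows. The variable names are ours and purely illustrative; the numbers are the full-count estimates quoted in the text.

```r
# Full-count estimates of the average-marginal gender difference (main sample).
gap_2019 <- -0.260
gap_2020 <- -0.354

# Change in the gender difference between year four and year five
# for the comparison cohort that started publishing in 2011.
comparison_change <- -0.05

# Change in the gender gap from 2019 to 2020 in the main sample.
treatment_effect <- gap_2020 - gap_2019

# Share of that change that also appears in the comparison cohort.
comparison_share <- comparison_change / treatment_effect

round(treatment_effect, 3)  # -0.094
round(comparison_share, 2)  # 0.53
```

Read this way, roughly half of the 2019–2020 widening is of a size that was also observed at the same career stage before the pandemic, and the remainder is specific to 2020.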

## Career-stage differences

Research suggests that the working conditions of early-career women scientists have been especially affected by the pandemic ([@bib2]; [@bib28]). We examined this question by conducting sub-group analyses by career age. As shown in [Figure 3](#fig3), the widening gender gap was salient for early-career scientists with four years of publication experience as well as for mid-career scientists with ten years of publication experience. From 2019 to 2020, the average marginal publication disadvantage for early-career women increased from –0.133 (corresponding to an 11% lower output for women than for men) to –0.20 (corresponding to an 18% lower output for women than for men) in full-count output. In comparison, the average marginal publication disadvantage for mid-career women changed from –0.452 (corresponding to a 21% lower output for women than for men) to –0.592 (corresponding to a 27% lower output for women than for men). This is a relative increase in the gender gap of 61% for early-career scientists and 29% for mid-career scientists. We obtained comparable results in an age-differentiated analysis with fractional-count publications as outcome ([Figure 3—figure supplement 1](#fig3s1)).

chunk: Figure 3.
:::
### Dynamic effects of the COVID-19 pandemic on the average gender gap in annual publishing rates, by career age.

Panels A and B show the estimated average gender difference in full-count publication rates by year for early-career and mid-career researchers. Panels C and D show men’s and women’s predicted full-count publication rates per year by author status (early-career vs. mid-career researcher). Point estimates are reported with 99.9% confidence bounds and robust standard errors clustered at the individual-author level. For information on how average marginal and predicted values are calculated, please refer to Materials and Methods: Difference-in-Differences model.

OLS linear regression of the early-career sample, with full count as dependent variable. OLS linear regression of the mid-career sample, with full count as dependent variable. OLS linear regression of the early-career sample, with fractional count as dependent variable. OLS linear regression of the mid-career sample, with fractional count as dependent variable.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20
load("data/ame_results.Rdata")
load("data/pred_results.Rdata")

# Average marginal effect on full publication counts by sample

ols_pred_full_ab <- rbind(ols_pred_full_a, ols_pred_full_b)
ols_ame_full_ab <- rbind(ols_ame_full_a, ols_ame_full_b)
ols_ame_full_ab$type[ols_ame_full_ab$type == "Senior researcher"] <- "B. Mid-career researcher" # Redefine labels
ols_ame_full_ab$type[ols_ame_full_ab$type == "Early career researcher"] <- "A. Early career researcher"
ols_pred_full_ab$type[ols_pred_full_ab$type == "Senior researcher"] <- "D. Mid-career researcher"
ols_pred_full_ab$type[ols_pred_full_ab$type == "Early career researcher"] <- "C. Early career researcher"

ols_ame_full_samples <- ggplot(data = ols_ame_full_ab, aes(x = as.numeric(time_factor), y = AME, group = 1)) +
  geom_vline(xintercept = 4.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "grey50", alpha = 1/4) +
  geom_line(size = 1.5)+
  geom_point(shape = 21, fill = "white", size = 2.5, stroke = 2) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(-0.65, 0.1), breaks = c(-0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0, 0.1)) +
  labs(x = "", y = "Avg. differences", title = "Differences in publication count") +
  facet_wrap(~ type) +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        panel.spacing = unit(2, "lines"),
        plot.title = element_text(hjust = 0.5)
  )


# Average predicted full publication counts by sample

labels <- tibble(group = rep(c("Men", "Women"), 2), # Labels for plot
                 type = c("C. Early career researcher", "C. Early career researcher", "D. Mid-career researcher", "D. Mid-career researcher"),
                 x = c(-3.5, -2.5, NA, NA),
                 y = c(1.15, 0.45, NA, NA))

counterfactual_trend <- tibble(type = c("C. Early career researcher", "C. Early career researcher", "D. Mid-career researcher", "D. Mid-career researcher"),
                               pub_year = c(-1, 0, -1, 0),
                               trend = as.numeric(c(ols_pred_full_ab[19, 1], ols_pred_full_ab[19, 1] + (ols_pred_full_ab[15, 1] - ols_pred_full_ab[14, 1]), # Early career counterfactual
                                                    ols_pred_full_ab[9, 1], ols_pred_full_ab[9, 1] + (ols_pred_full_ab[5, 1] - ols_pred_full_ab[4, 1]))) # Mid-career counterfactual
)

ols_pred_full_samples <- ols_pred_full_ab %>% 
  mutate(gender_text = case_when(gender_num == 0 ~ "Men",
                                 gender_num == 1 ~ "Women")) %>% 
  ggplot(aes(x = as.numeric(time_factor), y = n_pubs)) +
  geom_vline(xintercept = -0.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = ymin, ymax = ymax, fill = gender_text), alpha = 1/4) +
  geom_line(aes(colour = gender_text), size = 1.5) +
  geom_line(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, linetype = "dashed", size = 1) +
  geom_point(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, size = 2.5) +
  geom_point(aes(colour = gender_text, fill = gender_text), shape = 21, fill = "white", size = 2.5, stroke = 2) +
  geom_text(data = labels, aes(x = x, y = y, label = group, colour = group), size = 5) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(0, 2.5)) +
  scale_fill_manual(values = c(orange, maroon)) +
  scale_color_manual(values = c(orange, maroon)) +
  labs(x = "", y = "Avg. predicted\npublication counts", title = "Predicted publication counts") +
  facet_wrap(~ type) +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        panel.spacing = unit(2, "lines"),
        legend.position = "none",
        plot.title = element_text(hjust = 0.5))

# Combined figure

ols_ame_full_samples / ols_pred_full_samples
```
:::
{#fig3}

chunk: Figure 3—figure supplement 1.
:::
### Corresponding analysis with fractional counts.

The upper panels show the estimated average gender difference in fractional-count publication rates by year for early-career and mid-career researchers. The lower panels show men’s and women’s predicted fractional-count publication rates per year by author status (early-career vs. mid-career researcher). Point estimates are reported with 99.9% confidence bounds, with robust standard errors clustered at the individual-author level.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20
# Further data setup

ols_pred_frac_ab <- rbind(ols_pred_frac_a, ols_pred_frac_b) # Combine results on A and B sample
ols_ame_frac_ab <- rbind(ols_ame_frac_a, ols_ame_frac_b)
ols_ame_frac_ab$type[ols_ame_frac_ab$type == "Senior researcher"] <- "B. Mid-career researcher" # Fix sample labels
ols_ame_frac_ab$type[ols_ame_frac_ab$type == "Early career researcher"] <- "A. Early career researcher"
ols_pred_frac_ab$type[ols_pred_frac_ab$type == "Senior researcher"] <- "D. Mid-career researcher"
ols_pred_frac_ab$type[ols_pred_frac_ab$type == "Early career researcher"] <- "C. Early career researcher"

# Average marginal effect on fractional publication counts by sample

ols_ame_frac_samples <- ggplot(data = ols_ame_frac_ab, aes(x = as.numeric(time_factor), y = AME, group = 1)) +
  geom_vline(xintercept = 4.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "grey50", alpha = 1/4) +
  geom_line(size = 1.5)+
  geom_point(shape = 21, fill = "white", size = 2.5, stroke = 2) +
  #geom_text(aes(x = 3.75 , y = 0.05, label = "Lock-down")) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(-0.15, 0.05), breaks = c(-0.15, -0.1, -0.05, 0, 0.05)) +
  labs(x = "", y = "Avg. differences", title = "Differences in publication count") +
  facet_wrap(~ type) +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        panel.spacing = unit(2, "lines"),
        plot.title = element_text(hjust = 0.5))

# Average predicted fractional publication counts by sample

labels <- tibble(group = rep(c("Men", "Women"), 2),
                 type = c("C. Early career researcher", "C. Early career researcher", "D. Mid-career researcher", "D. Mid-career researcher"),
                 x = c(-3.5, -3.5, NA, NA),
                 y = c(.2, 0.07, NA, NA))

counterfactual_trend <- tibble(type = c("C. Early career researcher", "C. Early career researcher", "D. Mid-career researcher", "D. Mid-career researcher"),
                               pub_year = c(-1, 0, -1, 0),
                               trend = as.numeric(c(ols_pred_frac_ab[19, 1], ols_pred_frac_ab[19, 1] + 
                                                      (ols_pred_frac_ab[15, 1] - ols_pred_frac_ab[14, 1]), # Early career counterfactual
                                                    ols_pred_frac_ab[9, 1], ols_pred_frac_ab[9, 1] + 
                                                      (ols_pred_frac_ab[5, 1] - ols_pred_frac_ab[4, 1]))) # Mid-career counterfactual
)

ols_pred_frac_samples <- ols_pred_frac_ab %>% 
  mutate(gender_text = case_when(gender_num == 0 ~ "Men",
                                 gender_num == 1 ~ "Women")) %>% 
  ggplot(aes(x = as.numeric(time_factor), y = n_pubs_frac)) +
  geom_vline(xintercept = -0.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = ymin, ymax = ymax, fill = gender_text), alpha = 1/4) +
  geom_line(aes(colour = gender_text), size = 1.5) +
  geom_line(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, linetype = "dashed", size = 1) +
  geom_point(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, size = 2.5) +
  geom_point(aes(colour = gender_text, fill = gender_text), shape = 21, fill = "white", size = 2.5, stroke = 2) +
  #geom_text(aes(x = -0.1 , y = 2, label = "Lock-down")) +
  geom_text(data = labels, aes(x = x, y = y, label = group, colour = group), size = 5) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(0, .4)) +
  scale_fill_manual(values = c(orange, maroon)) +
  scale_color_manual(values = c(orange, maroon)) +
  labs(x = "", y = "Avg. predicted\npublication counts", title = "Predicted publication counts") +
  facet_wrap(~ type) +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        panel.spacing = unit(2, "lines"),
        legend.position = "none",
        plot.title = element_text(hjust = 0.5))

# Combined figure

ols_ame_frac_samples / ols_pred_frac_samples
```
:::
{#fig3s1}

chunk: Figure 3—figure supplement 2.
:::
### Corresponding analysis with counterfactual sample.

Gender differences in publication productivity for a counterfactual sample. Black lines show average marginal effects of female author gender each year, and colored lines show predicted publication counts (both full and fractionalized) for male and female authors. Predicted differences and counts are based on the difference-in-differences estimates from [Figure 3—source data 1](#fig3sdata1) and [Figure 3—source data 2](#fig3sdata2) in the supplementary material. 99.9% confidence intervals based on clustered standard errors are shown.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20

# Average marginal effect on full publication counts

ame_linear_counter_full <- ggplot(data = ols_ame_counter, aes(x = as.numeric(time_factor), y = AME, group = 1)) +
  geom_vline(xintercept = 4.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "grey50", alpha = 1/4) +
  geom_line(size = 1.5)+
  geom_point(shape = 21, fill = "white", size = 2.5, stroke = 2) +
  #geom_text(aes(x = 3.75 , y = 0.05, label = "Lock-down")) +
  scale_x_continuous(label = c("2011", "2012", "2013", "2014", "2015")) +
  scale_y_continuous(limits = c(-0.6, 0.1), breaks = c(-0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0, 0.1)) +
  labs(x = "", y = "Avg. differences", title = "A. Differences in publication count") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        plot.title = element_text(hjust = 0))

# Average marginal effect on fractional publication counts
ame_linear_frac_counter <- ggplot(data = ols_ame_frac_counter, aes(x = as.numeric(time_factor), y = AME, group = 1)) +
  geom_vline(xintercept = 4.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = lower, ymax = upper), fill = "grey50", alpha = 1/4) +
  geom_line(size = 1.5)+
  geom_point(shape = 21, fill = "white", size = 2.5, stroke = 2) +
  #geom_text(aes(x = 3.75 , y = 0.05, label = "Lock-down")) +
  scale_x_continuous(label = c("2011", "2012", "2013", "2014", "2015")) +
  scale_y_continuous(limits = c(-0.1, 0.05), breaks = c(-0.1, -0.05, 0, 0.05)) +
  labs(x = "", y = "Avg. differences", title = "B. Fractionalized counts") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        plot.title = element_text(hjust = 0))


# Average predicted publication counts for full counts

labels <- tibble(group = c("Men", "Women"),
                 x = c(-8.5, -8.5),
                 y = c(1.1, 0.4))

counterfactual_trend <- tibble(pub_year = c(-6, -5), # effect if female researchers had same drop as male
                               trend = as.numeric(c(ols_pred_counter[9, 1], ols_pred_counter[9, 1] + 
                                                      (ols_pred_counter[5, 1] - ols_pred_counter[4, 1])))
)

predicted_linear_counter <- ols_pred_counter %>% 
  mutate(gender_text = case_when(gender_num == 0 ~ "Men",
                                 gender_num == 1 ~ "Women")) %>% 
  ggplot(aes(x = as.numeric(time_factor), y = n_pubs)) +
  geom_vline(xintercept = -5.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = ymin, ymax = ymax, fill = gender_text), alpha = 1/2) +
  geom_line(aes(colour = gender_text), size = 1.5) +
  geom_line(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, linetype = "dashed", size = 1) +
  geom_point(data = counterfactual_trend, aes(x = pub_year, y = trend), color = orange, size = 2.5) +
  geom_point(aes(colour = gender_text, fill = gender_text), shape = 21, fill = "white", size = 2.5, stroke = 2) +
  #geom_text(aes(x = -0.1 , y = 2, label = "Lock-down")) +
  geom_text(data = labels, aes(x = x, y = y, label = group, colour = group), size = 5) +
  scale_x_continuous(label = c("2011", "2012", "2013", "2014", "2015")) +
  scale_y_continuous(limits = c(0, 2)) +
  scale_fill_manual(values = c(orange, maroon)) +
  scale_color_manual(values = c(orange, maroon)) +
  labs(x = "", y = "Avg. predicted\npublication counts") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        legend.position = "none")

# Average predicted publication counts for fractional counts

labels <- tibble(group = c("Men", "Women"),
                 x = c(-8.5, -8.5),
                 y = c(.25, 0.08))

counterfactual_trend_2 <- tibble(pub_year = c(-6, -5), # effect if female researchers had same drop as male
                                 trend = as.numeric(c(ols_pred_frac_counter[9, 1], ols_pred_frac_counter[9, 1] + 
                                                        (ols_pred_frac_counter[5, 1] - ols_pred_frac_counter[4, 1])))
)

predicted_linear_frac_counter <- ols_pred_frac_counter %>% 
  mutate(gender_text = case_when(gender_num == 0 ~ "Men",
                                 gender_num == 1 ~ "Women")) %>% 
  ggplot(aes(x = as.numeric(time_factor), y = n_pubs_frac)) +
  geom_vline(xintercept = -5.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = ymin, ymax = ymax, fill = gender_text), alpha = 1/4) +
  geom_line(aes(colour = gender_text), size = 1.5) +
  geom_line(data = counterfactual_trend_2, aes(x = pub_year, y = trend), color = orange, linetype = "dashed", size = 1) +
  geom_point(data = counterfactual_trend_2, aes(x = pub_year, y = trend), color = orange, size = 2.5) +
  geom_point(aes(colour = gender_text, fill = gender_text), shape = 21, fill = "white", size = 2.5, stroke = 2) +
  geom_text(data = labels, aes(x = x, y = y, label = group, colour = group), size = 5) +
  scale_x_continuous(label = c("2011", "2012", "2013", "2014", "2015")) +
  scale_y_continuous(limits = c(0, .3)) +
  scale_fill_manual(values = c(orange, maroon)) +
  scale_color_manual(values = c(orange, maroon)) +
  labs(x = "", y = "Avg. predicted\npublication counts") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        legend.position = "none")

# Combined figure

(ame_linear_counter_full + ame_linear_frac_counter) / (predicted_linear_counter + predicted_linear_frac_counter)
```
:::
{#fig3s2}

## Productivity-dependent differences

As indicated in [Figure 4](#fig4) panel A, the effect of the pandemic on women’s and men’s publishing rates also varied considerably across different strata of the publication-productivity distribution. Indeed, a considerable share of the average marginal gender difference appeared to be attributable to differences among the top 10% most prolific men and women authors. In contrast, changes in the average gender gap were marginal for authors below the 80th percentile of the publication distribution. This can clearly be seen in panel B, where the trends for men per quantile in 2019–2020 (solid, black dots) are projected onto the same trends for women (hollow dots). While the differences in trends below the 80th percentile are not visible in the figure, and the absolute differences are very small, the relative differences are noticeable. At the highest decile, the average difference increases from –1.35 (corresponding to 23% lower output for women) to –1.74 (31% lower output for women) from 2019 to 2020, which is a relative change of 22.3%. Correspondingly, the relative change is 25.8% in the 81st to 90th percentile and 25.9% in the 51st to 80th percentile.

chunk: Figure 4.
:::
### Stratified effects of the COVID-19 pandemic on the average gender gap in annual publishing rates.

Panel **A** shows the estimated average gender difference in publication rates by year. Panel **B** shows the predicted publishing rates for men and women authors. In each panel, scientists are divided into strata according to their total number of publications in the period 2016–2020. The difference between the thinner, dashed line with the black circle in 2020 and the thicker, dashed line with hollow circles in panel B specifies the average treatment effect for women. Point estimates are reported with 99.9% confidence bounds and robust standard errors clustered at the individual-author level. For information on how average marginal and predicted values are calculated, please refer to Materials and methods: Difference-in-differences model.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20

# Average marginal effects on full publication counts by quantile ----
  
qlabs <- c("1-50%", "51-80%", "81-90%", "91-100%")

ames_full <- ggplot(data = ames, aes(x = as.numeric(time_factor), y = AME, group = as.factor(q))) +
  geom_vline(xintercept = 4.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = as.factor(q)), alpha = 1/4) +
  geom_line(size = 1.5, aes(color = as.factor(q)))+
  geom_point(aes(color = as.factor(q)), shape = 21, fill = "white", size = 2.5, stroke = 2) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(-2.1, 0.1), breaks = seq(-2.1, .1, by = .2)) +
  labs(x = "", y = "Avg. differences", title = "A. Differences in publication count") +
  scale_color_manual("Quantile", values = c(orange, maroon, olive, navy), labels = qlabs) +
  scale_fill_manual("Quantile", values = c(orange, maroon, olive, navy), labels = qlabs) +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        plot.title = element_text(hjust = 0),
        legend.position = "bottom")

# Average predicted full publication counts by quantile ----
  
preds <- preds %>% mutate(gender_text = case_when(gender_num == 0 ~ "Men",
                                                    gender_num == 1 ~ "Women"))

qs <- unique(preds$q)
cf_trends <- tibble(pub_year = rep(c(-1, 0), length(qs)), q = rep(qs, each = 2), trend = 0, gender_text = "Women")

# Counterfactual for women in each quantile: women's 2019 level plus men's 2019-2020 change
for (i in seq_along(qs)) {
  qi <- qs[i]
  x <- preds %>% filter(q == qi & gender_num == 0)
  y <- preds %>% filter(q == qi & gender_num == 1)
  cf_trends$trend[cf_trends$pub_year == -1 & cf_trends$q == qi] <- y$n_pubs[y$time_factor == -1]
  cf_trends$trend[cf_trends$pub_year == 0 & cf_trends$q == qi] <- y$n_pubs[y$time_factor == -1] - (x$n_pubs[x$time_factor == -1] - x$n_pubs[x$time_factor == 0])
}

preds_full <- preds %>% 
  ggplot(aes(x = as.numeric(time_factor), y = n_pubs, group = interaction(gender_text, q))) +
  geom_vline(xintercept = -0.5, linetype = "dotted", size = 1) +
  geom_hline(yintercept = 0, size = 1) +
  geom_ribbon(aes(ymin = ymin, ymax = ymax, fill = as.factor(q)), alpha = 1/4) +
  geom_line(aes(colour = as.factor(q), lty = gender_text), size = 1.5) + 
  geom_line(data = cf_trends, aes(x = pub_year, y = trend, color = as.factor(q)), linetype = "dashed", size = 1, alpha = .5) +
  geom_point(data = cf_trends, aes(x = pub_year, y = trend, color = as.factor(q)), size = 2.5, stroke = 2) +
  geom_point(aes(colour = as.factor(q)), shape = 21, fill = "white", size = 2.5, stroke = 2) +
  scale_x_continuous(label = c("2016", "2017", "2018", "2019", "2020")) +
  scale_y_continuous(limits = c(0, 6)) +
  scale_fill_manual("Quantile", values = c(orange, maroon, olive, navy), guide = "none") +
  scale_color_manual("Quantile", values = c(orange, maroon, olive, navy), guide = "none") +
  scale_linetype_discrete("Gender") +
  labs(x = "", y = "Avg. predicted\npublication counts", title = "B. Predicted publication counts") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        legend.position = "bottom")

# Combined figure

ames_full + preds_full
```
:::
{#fig4}
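
The counterfactual points in panel B can be read as a simple difference-in-differences calculation: women's predicted 2019 value plus the 2019–2020 change predicted for men. The sketch below is illustrative only. The helper `counterfactual_2020()` and the toy numbers are hypothetical and not taken from the paper, and the sketch assumes a data frame with numeric `gender_num`, `time_factor` and `n_pubs` columns, similar to `preds` restricted to one quantile.

```r
# Hypothetical helper: women's counterfactual 2020 value if they had
# followed the same 2019-2020 change as men (time_factor -1 = 2019, 0 = 2020).
counterfactual_2020 <- function(preds_q) {
  men   <- preds_q[preds_q$gender_num == 0, ]
  women <- preds_q[preds_q$gender_num == 1, ]
  men_change <- men$n_pubs[men$time_factor == 0] - men$n_pubs[men$time_factor == -1]
  women$n_pubs[women$time_factor == -1] + men_change
}

# Toy data (made-up values for illustration, not estimates from the paper)
toy <- data.frame(
  gender_num  = c(0, 0, 1, 1),
  time_factor = c(-1, 0, -1, 0),
  n_pubs      = c(2.0, 1.9, 1.5, 1.3)
)

cf <- counterfactual_2020(toy)  # 1.5 + (1.9 - 2.0), i.e. about 1.4

# Gap between women's predicted 2020 value and the counterfactual
att_women <- toy$n_pubs[toy$gender_num == 1 & toy$time_factor == 0] - cf  # about -0.1
```

Selecting rows by `gender_num` and `time_factor`, as the loop in the chunk above does, avoids relying on hard-coded row positions in the prediction table.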

## Country-level differences

The estimated change in the magnitude of the gender gap also varied across countries ([Figure 5](#fig5)), with the smallest changes observed in Denmark, Australia, Pakistan and Belgium, and the largest increases found in Russia, Italy, Austria and Iran. The horizontal bar diagram to the right in [Figure 5](#fig5) shows that by far the largest group of scientists is from the USA. This means that the average treatment effect on the treated ($ATT$) also gravitates towards the effect observed for the US population. Surprisingly, the estimated effects at the country level were only weakly and inconsistently correlated with the severity of COVID-19 restrictions ([Figure 5—figure supplement 1](#fig5s1) and [Figure 5—figure supplement 2](#fig5s2)).

chunk: Figure 5.
:::
### Gender differences in full publication productivity by country, 2019 vs 2020.

The hollow circles show the gender differences per country in full publication counts in 2020 relative to 2019, with error bars showing the 99% confidence intervals based on robust clustered standard errors. Countries are ranked by the estimated gender difference. The horizontal histogram shows the distribution of authors from each country; by far the largest group is from the USA. We only list the first 30 countries by number of authors, comprising 90% of authors in our sample. The orange and green lines and bands show the overall treatment effect on the sample and the counterfactual sample (ATT: average treatment effect on the treated).

OLS linear regression of counterfactual sample, with full count as dependent variable. OLS linear regression of counterfactual sample, with fractional count as dependent variable. Coefficients and standard errors relative to 2019 for the 30 countries with most authors in the dataset.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20

load("data/fig5_data.Rdata")

# Reference values from overall analysis
full_effect <- -0.0933
full_se <- 0.0055

full_counterfactual <- -0.0501
full_counterfactual_se <- 0.0064

t_value <- 2.576 # 99 % (for 99.9 % insert 3.291)

# Coefficients plot

ols_coef_full_country <- ols_model_full_country %>% 
  dplyr::filter(year == 2020) %>% 
  mutate(country = str_to_title(country)) %>% 
  ggplot(aes(y = reorder(country, coef), x = coef))+
  geom_vline(xintercept = full_effect, color = orange, size = 1) +
  geom_vline(xintercept = full_counterfactual, color = olive, size = 1) +
  geom_rect(aes(
    xmin = full_effect-(full_se*t_value), xmax = full_effect+(full_se*t_value), ymin = -Inf, ymax = Inf), 
    fill = orange, alpha = 0.0125)+
  geom_rect(aes(
    xmin = full_counterfactual-(full_counterfactual_se*t_value), xmax = full_counterfactual+(full_counterfactual_se*t_value), ymin = -Inf, ymax = Inf), 
    fill = olive, alpha = 0.0125)+
  geom_vline(xintercept = 0, size = 1, linetype = "dotted") +
  geom_linerange(aes(xmin = ll, xmax = ul), size = 1)+
  geom_point(shape = 21, fill = "white", size = 2, stroke = 1.5) +
  annotate(geom = "segment", x = full_effect-(full_se*t_value), xend = -0.275, y = 29.5, yend = 29.5, color = orange,
           arrow = arrow(length = unit(2, "mm"), type = "closed")) +
  annotate(geom = "segment", x = full_counterfactual+(full_counterfactual_se*t_value), xend = 0.1, y = 3, yend = 3, color = olive,
           arrow = arrow(length = unit(2, "mm"), type = "closed")) +
  geom_label(aes(x = -0.275, y = 29.5, label = "Overall ATT"), color = orange) +
  geom_label(aes(x = 0.115, y = 3, label = "Counterfactual\nATT"), color = olive) +
  scale_x_continuous(limits = c(-0.4, 0.2)) +
  labs(y = "", x = expression("Gender" %*% "2020")) +
  theme_ebm_grid() +
  theme(axis.line.x = element_blank(),
        legend.direction = "horizontal",
        legend.position = c(0.8, 0.9),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        plot.margin = margin(r = 0, unit = "cm"))

# Histogram

n_country_hist <- ols_model_full_country %>% 
  filter(year == 2020) %>% 
  mutate(country = str_to_title(country)) %>% 
  ggplot(aes(y = reorder(country, coef), x = n/1000)) +
  geom_col() +
  scale_x_continuous(limits = c(0, 110), breaks = c(0, 50, 110), labels = c("0", "50k", "110k"), expand = c(0,0)) +
  labs(x = "Authors", y = "") +
  theme_ebm_bar()+
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.major.x = element_line(colour = "grey85", linetype = "dotted"),
        plot.title = element_text(hjust = 0.5),
        plot.margin = margin(0.05, 1, 0.05, 0, unit = "cm"))

# Combined figure

ols_coef_full_country + n_country_hist + plot_layout(widths = c(3,1))
```
:::
{#fig5}
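
The hard-coded `t_value` in the chunk above is the two-sided critical value of the standard normal distribution. As a minimal sketch, it could be derived from the confidence level with base R's `qnorm()`. The helper name `crit_value()` is an assumption made for illustration, and `full_effect` and `full_se` simply repeat the reference values used above.

```r
# Two-sided standard-normal critical value for a given confidence level
# (crit_value is a hypothetical helper, not part of the original analysis)
crit_value <- function(level) qnorm(1 - (1 - level) / 2)

round(crit_value(0.99), 3)   # 2.576
round(crit_value(0.999), 3)  # 3.291

# Reference values from the overall analysis, as in the chunk above
full_effect <- -0.0933
full_se <- 0.0055

# 99% band drawn around the overall ATT line
band_99 <- full_effect + c(-1, 1) * crit_value(0.99) * full_se
round(band_99, 3)            # -0.107 -0.079
```

Deriving the value this way keeps the confidence level explicit, so switching between the 99% and 99.9% intervals only requires changing one argument.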

chunk: Figure 5—figure supplement 1.
:::
### Lockdown severity, summed indicators.

Gender differences in full-count publication productivity, 2019 vs 2020, across four different lockdown severity indicators. All indicators are summed values as stipulated in [Equation 3](#equ3). 99% confidence intervals based on clustered standard errors are shown.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20
load("data/figS4_data.Rdata")

# Full counts

ols_full_country_lockdown_sum <- ols_model_full_country %>% 
  filter(year == 2020) %>% 
  dplyr::select(-ends_with("_grp")) %>% 
  pivot_longer(c1_sum:c6_count_3, names_to = "lockdown_indicators", values_to = "severity") %>%
  separate(lockdown_indicators, c("indicator", "type")) %>% 
  mutate(indicator = case_when(indicator == "c1" ~ "School lockdowns",
                               indicator == "c2" ~ "Workplace lockdowns",
                               indicator == "c6" ~ "Stay at home requirements",
                               indicator == "stringency" ~ "Stringency index"),
         type = str_to_title(type)) %>% 
  filter(type == "Sum") %>% 
  ggplot(aes(x = severity, y = coef)) +
  geom_hline(yintercept = 0, size = 1, linetype = "dotted") +
  geom_linerange(aes(ymin = ll, ymax = ul), size = 1) +
  geom_point(shape = 21, fill = "white", size = 2, stroke = 1.5) +
  labs(x = "", y = "", title = "Full counts") +
  facet_wrap(~ indicator, scales = "free_x", ncol = 1) +
  theme_ebm_grid() +
  theme(panel.grid.major.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        strip.text = element_text(face = "bold"))

ols_frac_country_lockdown_sum <- ols_model_frac_country %>% 
  filter(year == 2020) %>% 
  dplyr::select(-ends_with("_grp")) %>% 
  pivot_longer(c1_sum:c6_count_3, names_to = "lockdown_indicators", values_to = "severity") %>%
  separate(lockdown_indicators, c("indicator", "type")) %>% 
  mutate(indicator = case_when(indicator == "c1" ~ "School lockdowns",
                               indicator == "c2" ~ "Workplace lockdowns",
                               indicator == "c6" ~ "Stay at home requirements",
                               indicator == "stringency" ~ "Stringency index"),
         type = str_to_title(type)) %>% 
  filter(type == "Sum") %>% 
  ggplot(aes(x = severity, y = coef)) +
  geom_hline(yintercept = 0, size = 1, linetype = "dotted") +
  geom_linerange(aes(ymin = ll, ymax = ul), size = 1) +
  geom_point(shape = 21, fill = "white", size = 2, stroke = 1.5) +
  scale_y_continuous(position = "right") +
  labs(x = "Severity of lockdown indicator", y = expression("Gender" %*% "2020"), title = "Fractionalized counts") +
  facet_wrap(~ indicator, scales = "free_x", ncol = 1) +
  theme_ebm_grid() +
  theme(panel.grid.major.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        strip.text = element_text(face = "bold"))

(ols_full_country_lockdown_sum + labs(x = "Severity of lockdown indicator", y = expression("Gender" %*% "2020")) | ols_frac_country_lockdown_sum)
```
:::
{#fig5s1}

chunk: Figure 5—figure supplement 2.
:::
### Lockdown severity, maximum indicators.

Gender differences in full-count publication productivity, 2019 vs 2020, across three different lockdown severity indicators. All indicators are counts of the maximum values as stipulated in [Equation 4](#equ4). 99% confidence intervals based on clustered standard errors are shown.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20

ols_full_country_lockdown_count <- ols_model_full_country %>% 
  filter(year == 2020) %>% 
  dplyr::select(-ends_with("_grp")) %>% 
  pivot_longer(c1_sum:c6_count_3, names_to = "lockdown_indicators", values_to = "severity") %>%
  separate(lockdown_indicators, c("indicator", "type")) %>% 
  mutate(indicator = case_when(indicator == "c1" ~ "School lockdowns",
                               indicator == "c2" ~ "Workplace lockdowns",
                               indicator == "c6" ~ "Stay at home requirements",
                               indicator == "stringency" ~ "Stringency index"),
         type = str_to_title(type)) %>% 
  filter(type == "Count") %>% 
  ggplot(aes(x = severity, y = coef)) +
  geom_hline(yintercept = 0, size = 1, linetype = "dotted") +
  geom_linerange(aes(ymin = ll, ymax = ul), size = 1) +
  geom_point(shape = 21, fill = "white", size = 2, stroke = 1.5) +
  labs(x = "", y = "", title = "Full counts") +
  facet_wrap(~ indicator, scales = "free_x", ncol = 1) +
  theme_ebm_grid() +
  theme(panel.grid.major.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        strip.text = element_text(face = "bold"))

ols_frac_country_lockdown_count <- ols_model_frac_country %>% 
  filter(year == 2020) %>% 
  dplyr::select(-ends_with("_grp")) %>% 
  pivot_longer(c1_sum:c6_count_3, names_to = "lockdown_indicators", values_to = "severity") %>%
  separate(lockdown_indicators, c("indicator", "type")) %>% 
  mutate(indicator = case_when(indicator == "c1" ~ "School lockdowns",
                               indicator == "c2" ~ "Workplace lockdowns",
                               indicator == "c6" ~ "Stay at home requirements",
                               indicator == "stringency" ~ "Stringency index"),
         type = str_to_title(type)) %>% 
  filter(type == "Count") %>% 
  ggplot(aes(x = severity, y = coef)) +
  geom_hline(yintercept = 0, size = 1, linetype = "dotted") +
  geom_linerange(aes(ymin = ll, ymax = ul), size = 1) +
  geom_point(shape = 21, fill = "white", size = 2, stroke = 1.5) +
  scale_y_continuous(position = "right") +
  labs(x = "", y = "", title = "Fractionalized counts") +
  facet_wrap(~ indicator, scales = "free_x", ncol = 1) +
  theme_ebm_grid() +
  theme(panel.grid.major.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        strip.text = element_text(face = "bold"))

(ols_full_country_lockdown_count + labs(x = "Severity of lockdown indicator", y = expression("Gender" %*% "2020")) | ols_frac_country_lockdown_count)
```
:::
{#fig5s2}

## Discipline-level differences

As a final step in the analysis, we disaggregated results by discipline. As shown in [Figure 6](#fig6) panel A, the widening gender gap persisted across all four disciplines, but with markedly larger effects observed for clinical medicine (average marginal gender difference = −0.117, CI: −0.138 to −0.095) and biology (average marginal gender difference = −0.089, CI: −0.117 to −0.063) than for basic medicine (average marginal gender difference = −0.058, CI: −0.093 to −0.022) and chemistry (average marginal gender difference = −0.062, CI: −0.100 to −0.023). [Figure 6](#fig6) panel B shows the representation of authors according to their position in the publication-productivity distribution across the four disciplines. As shown in the figure, highly productive authors are over-represented in clinical medicine, implying that the large average marginal gender difference observed for this discipline may partially be driven by a higher proportion of prolific scientists.

chunk: Figure 6.
:::
### 2020 gender differences in full publication counts relative to 2019, across the four disciplines in our sample.

Difference-in-differences estimate from [Figure 6—source data 1](#fig6sdata1). 99% confidence intervals based on clustered standard errors are shown. Histograms show the distribution of authors who mainly publish within a given discipline, and orange and green lines and bands show the overall treatment effect on the sample and the counterfactual sample from [Figure 2—source data 1](#fig2sdata1) and [Figure 5—source data 1](#fig5sdata1). Panel **B** shows the distribution of authors per discipline in deciles of total publications over the time period. Coefficients and standard errors relative to 2019 for the four disciplines.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20

load("data/fig6_data.Rdata")

# Reference values from overall analysis
full_effect <- -0.0933
full_se <- 0.0055

full_counterfactual <- -0.0501
full_counterfactual_se <- 0.0064

t_value <- 2.576 # 99 % (for 99.9 % insert 3.291)

# Discipline analysis

ols_coef_full_discipline <- ols_model_full_discipline %>% 
  filter(year == 2020) %>% 
  ggplot(aes(x = coef, y = reorder(oecd, coef))) +
  geom_vline(xintercept = full_effect, color = orange, size = 1) +
  geom_vline(xintercept = full_counterfactual, color = olive, size = 1) +
  geom_rect(aes(
    xmin = full_effect-(full_se*t_value), xmax = full_effect+(full_se*t_value), ymin = -Inf, ymax = Inf), 
    fill = orange, alpha = 0.05)+
  geom_rect(aes(
    xmin = full_counterfactual-(full_counterfactual_se*t_value), 
    xmax = full_counterfactual+(full_counterfactual_se*t_value), ymin = -Inf, ymax = Inf), 
    fill = olive, alpha = 0.05)+
  geom_vline(xintercept = 0, size = 1, linetype = "dotted") +
  geom_linerange(aes(xmin = ll, xmax = ul), size = 1)+
  geom_point(shape = 21, fill = "white", size = 2, stroke = 1.5) +
  annotate(geom = "segment", x = full_effect-(full_se*t_value), xend = -0.16, y = 3.5, yend = 3.5, color = orange,
           arrow = arrow(length = unit(2, "mm"), type = "closed")) +
  annotate(geom = "segment", x = full_counterfactual+(full_counterfactual_se*t_value), xend = 0.05, y = 1.5, yend = 1.5, color = olive,
           arrow = arrow(length = unit(2, "mm"), type = "closed")) +
  geom_label(aes(x = -0.16, y = 3.5, label = "Overall ATT"), color = orange) +
  geom_label(aes(x = 0.05, y = 1.5, label = "Counterfactual\nATT"), color = olive) +
  scale_x_continuous(limits = c(-0.2, 0.1)) +
  labs(y = "", x = expression("Gender" %*% "2020"), title = "A. Discipline gap") +
  theme_ebm_grid() +
  theme(axis.line.x = element_blank(),
        legend.direction = "horizontal",
        legend.position = c(0.8, 0.9),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        plot.margin = margin(r = 0, unit = "cm"))

# Histogram

n_discipline_hist <- ols_model_full_discipline %>% 
  filter(year == 2020) %>% 
  ggplot(aes(y = reorder(oecd, coef), x = n/1000)) +
  geom_col() +
  scale_x_continuous(limits = c(0, 220), breaks = c(0, 100, 220), labels = c("0", "100k", "220k"), expand = c(0,0)) +
  labs(x = "Authors", y = "") +
  theme_ebm_bar()+
  theme(axis.text.y = element_blank(),
        axis.ticks.y = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.grid.major.x = element_line(colour = "grey85", linetype = "dotted"),
        plot.title = element_text(hjust = 0.5),
        plot.margin = margin(0.05, 1, 0.05, 0, unit = "cm"))

ols_coef_full_discipline_hist <- ols_coef_full_discipline + n_discipline_hist + plot_layout(widths = c(3,1))

# People per field and decile

p1 <- d_author %>% 
  group_by(oecd) %>%
  summarize(n = n())

pd <- d_author %>%
  group_by(oecd, q) %>%
  summarize(nd = n())

pd$ratio <- 0
pd$ratio[pd$oecd == "Basic medicine"] <- pd$nd[pd$oecd == "Basic medicine"] / p1$n[p1$oecd == "Basic medicine"] * 10
pd$ratio[pd$oecd == "Clinical medicine"] <- pd$nd[pd$oecd == "Clinical medicine"] / p1$n[p1$oecd == "Clinical medicine"] * 10
pd$ratio[pd$oecd == "Biology"] <- pd$nd[pd$oecd == "Biology"] / p1$n[p1$oecd == "Biology"] * 10
pd$ratio[pd$oecd == "Chemistry"] <- pd$nd[pd$oecd == "Chemistry"] / p1$n[p1$oecd == "Chemistry"] * 10

a_disc <- pd %>%
  ggplot(aes(x = as.numeric(q), y = nd, group = oecd, fill = oecd)) +
  geom_bar(position = "stack", stat = "identity") + 
  scale_x_continuous("Decile, number of total publications", breaks = 1:10) +
  scale_y_continuous("Number of authors per discipline", labels = label_number(suffix = "K", scale = 1e-3)) +
  scale_fill_manual("Discipline", values = c(orange, maroon, olive, "black")) +
  labs(title = "B. Author distribution per discipline") +
  theme_ebm_bar()+ 
  theme(
    legend.position = "bottom",
    plot.title = element_text(hjust = 0.5)
  )

# Combined figure

layout <- "
AAAAB
AAAAB
AAAAB
CCCCC
CCCCC
DDDDD
"

ols_coef_full_discipline / n_discipline_hist / a_disc + guide_area() + 
  plot_layout(design = layout, guides = "collect") & 
  theme(legend.position = "bottom",
        legend.justification = c("left","bottom"),
        legend.box.just = "right",
        legend.box.margin = margin(0,0,0,0))
```
:::
{#fig6}

## Robustness checks

We conducted two placebo tests, simulating a placebo pandemic incident between 2017–2018 and 2018–2019. [Figure 7A](#fig7) shows the difference-in-differences estimates for both full and fractionalized publication counts. In both cases, the estimates are very small in magnitude (ranging from 7% to 17% of our 2020 estimate, ${\delta }_{t=0}$), and only the 2017–2018 full-count estimate is statistically significant at the 99% level (the 2017–2018 estimate for the fractionalized count is significant at the 95% level). Taken together, there does not appear to be a substantial difference in publication counts in the immediate years prior to the onset of the pandemic.

chunk: Figure 7.
:::
### Test against hypothetical placebo pandemic in 2018/2019 (**A**) and changes in women as first authors (**B**).

(**A**) Difference in differences of publication productivity for placebo tests. Points show the difference in publication productivity for women relative to men for two placebo periods, using both full publication counts and fractionalized counts. Estimates are based on [Figure 7—source data 1](#fig7sdata1) and [Figure 7—source data 2](#fig7sdata2). Error bars are 99% confidence intervals, with accompanying p-values based on clustered standard errors. (**B**) Ratio of women’s first author share to women’s share of all authorships. Each line shows the share of women who occupy the first author position divided by women’s share of all authorships by year for each of the four disciplines. A ratio > 1 shows a greater share of women first authors relative to all women’s authorships. Authorship counts are made for a larger sample than used in the main analysis, comprising all authorships registered in the Web of Science for each discipline and year. OLS linear regression of full and fractional count as dependent variable, placebo test of 2017 vs 2018. Linear regression with author and year fixed effects. Standard errors in parentheses are HC1 and clustered at the author level. OLS linear regression of full and fractional count as dependent variable, placebo test of 2018 vs 2019. Linear regression with author and year fixed effects. Standard errors in parentheses are HC1 and clustered at the author level.

```{r message=FALSE, warning=FALSE}
#' @width 28
#' @height 20

load("data/fig7_data.Rdata")
load("data/gender_field_pos.Rdata")

# Figure of placebo DiD-estimates

figure_placebo <- placebo_estimates %>% 
  ggplot(aes(y = est, x = placebo)) +
  geom_hline(yintercept = 0, size = 1) +
  geom_linerange(aes(ymin = ll, ymax = ul), size = 1.5, position = position_dodge(width = 1), color = olive) +
  geom_point(shape = 21, fill = "white", size = 3, stroke = 2, position = position_dodge(width = 1), color = olive) +
  geom_text(aes(label = paste0("p = ", round(p, 3))), nudge_x = 0.3) +
  scale_y_continuous(n.breaks = 5) +
  facet_wrap(~ outcome_f, scales = "free_y") + 
  labs(x = "", y = expression(delta[placebo]), title = "A. Placebo tests") + 
  theme_ebm_bar()+
  theme(strip.text = element_text(face = "bold", size = 14),
        plot.title = element_text(hjust = 0.5),
        plot.margin = margin(t = 0.5, r = 0.5, b = 0.5, l = 0.5, unit = "cm"),
        axis.line.x = element_blank())



# Figure of development in first authorship

figure_ratio <- author_pos %>% 
  pivot_wider(names_from = "first_only", values_from = "n") %>% # N publications by author position
  rename(n_first = `1`, n_all = `0`) %>% 
  pivot_wider(names_from = "gender", values_from = c("n_first", "n_all")) %>% # N publications by gender
  mutate(n_total = n_all_F + n_all_M,
         n_total_first = n_first_F + n_first_M,
         ratio_first_F = n_first_F/n_total_first,
         ratio_all_F = n_all_F/n_total,
         ratio_of_ratio = ratio_first_F/ratio_all_F) %>% # Get total N publications, N total first-authored publications, and ratios of women's publications
  mutate(field = case_when(oecd_minor_code == "1.04" ~ "Chemistry",
                           oecd_minor_code == "1.06" ~ "Biology",
                           oecd_minor_code == "3.01" ~ "Basic medicine",
                           oecd_minor_code == "3.02" ~ "Clinical medicine")
  ) %>% 
  # Plot from here
  ggplot(aes(x = pub_year, y = ratio_of_ratio)) + 
  geom_vline(xintercept = 2019.5, linetype = "dotted", size = 1) +
  #geom_hline(yintercept = 1, size = 1) +
  geom_line(aes(colour = field), size = 1.5) +
  geom_point(shape = 21, fill = "white", size = 2.5, stroke = 2) +
  #scale_y_continuous(limits = c(1.125, 1.25)) +
  scale_color_manual("Discipline", values = c(orange, maroon, olive, navy)) +
  labs(x = "", y = expression(p[women]^first / p[women]^all), title = "B. Share of women first authors") +
  theme_ebm_bar()+
  theme(axis.line.x = element_blank(),
        axis.ticks.y = element_blank(),
        plot.title = element_text(hjust = 0.5),
        legend.position = "bottom")

# Combined figure

figure_placebo / figure_ratio
```
:::
{#fig7}

We also checked whether the position of women authors in the author byline changed (see [Figure 7B](#fig7)). We first observe that the share of women first authors is higher than expected given women’s share of all authorships. Some variation occurs over time, but there are no changes from 2019 to 2020 that would indicate a general shift toward women appearing less often as first authors than before the pandemic.
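For reference, the ratio plotted in [Figure 7B](#fig7) can be written out as below. The symbols are shorthand introduced here for illustration: $n_{d,t}^{\text{first},g}$ is the number of first authorships and $n_{d,t}^{\text{all},g}$ the number of all authorships held by gender $g$ (women $W$, men $M$) in discipline $d$ and publication year $t$.

$$
R_{d,t}=\frac{n_{d,t}^{\text{first},W}/\left(n_{d,t}^{\text{first},W}+n_{d,t}^{\text{first},M}\right)}{n_{d,t}^{\text{all},W}/\left(n_{d,t}^{\text{all},W}+n_{d,t}^{\text{all},M}\right)}
$$

A value of $R_{d,t}>1$ means that women hold a larger share of first authorships than of authorships overall, and a stable $R_{d,t}$ from 2019 to 2020 is what we read as the absence of a shift in author roles.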

# Discussion

In this paper, we estimated the differential impact of COVID-19 on the annual publication rates of women and men in 2020 compared to 2019. Using individual-level panel data on a global sample of 431,207 authors, we observed small but consistent average increases in the gap between women’s and men’s annual publishing rates. This finding is consistent with extant research suggesting amplified gender disparities in manuscript submissions, first and last authorships, and self-reported research activities during COVID-19. However, unlike prior studies, we find that the gendered effects of COVID-19 are salient for early-career scientists with four years of publication experience as well as for mid-career scientists with ten years of publication experience. While the numerical increase in the gender gap is largest for mid-career scientists, the relative change in the gender gap is biggest for early-career scientists. Moreover, we add to existing evidence by showing that the increase in the gender gap (in absolute terms) was most pronounced among highly productive authors and scientists working in clinical medicine and biology. Lastly, the widening gender gap appears to represent a genuine decline in publication productivity and not just a shift in author roles, as women continue to first author publications at similar rates as in prior years ([Figure 7](#fig7)).

Despite clear country variations in the observed effects, we found negligible and inconsistent associations between local COVID-19 restrictions and estimated changes in the productivity gender gap. Further, the ordering of countries in [Figure 5](#fig5) does not seem to suggest that the gender-differentiated changes in productivity rates vary systematically according to a country’s level of gender equality, welfare model, or infection rate.

Taken together, these results indicate that the publication productivity of already prolific women scientists has been affected the most by the pandemic. Those designing interventions to promote equity in academic science and medicine should strive to understand the reasons why highly prolific men appeared able to maintain their annual publication rates while highly prolific women were not. Prior research suggests that it is possible that men with the highest levels of productivity may have been more likely to have been rewarded with access to additional workplace supports, such as endowed professorships, in recognition of their achievements ([@bib18]). If so, this might have served as a cushion against the impact of the pandemic on those individuals. Moreover, if institutions prioritized protecting a few "superstar" researchers from teaching or clinical demands without clear processes for identifying which individuals received preferential treatment, the vast literature on unconscious bias suggests that such efforts might preferentially have protected outstanding men as compared to similarly outstanding women ([@bib38]). Prior research also suggests that high-achieving women scientists may be more likely than their male peers to state that their partners’ careers take priority ([@bib33]). Indeed, it is possible that high-achieving men scientists’ partners may be particularly likely to be willing to make sacrifices in their own careers to take on additional domestic labor to allow continuation of their extraordinary partners’ work. If partners of extraordinarily productive women scientists are less willing to do so, and if this difference is even more marked than any differences that may exist when a scientist is less highly productive, this could also serve as a mechanism to drive the differences observed. Further research is necessary to investigate these and other possibilities.

The amplified effect in clinical medicine may be due to the dual research and clinical roles taken on by scientists in this discipline. Early research suggested that initial funding for COVID-19 related research was biased toward applications from men ([@bib50]), supporting a hypothesis that women spent disproportionately more time on clinical work or other demands around the time of the outbreak. However, further research is required to provide conclusive evidence on this question. The consequences of a systematically biased change in the work priorities for men and women, particularly in clinical medicine, can potentially reach far beyond the individual careers of those women affected by it. Research suggests a positive association between women’s participation as leading authors in medical research and a study’s likelihood of including sex and gender as analytical variables ([@bib39]). The omission of gender and sex analysis has been widespread in COVID-19-related clinical trials ([@bib5]), despite early evidence of sex-differences in the prognosis and outcome of the disease.

The widening gender-gap in publishing may be a detectable symptom of larger setbacks on issues of gender equity in science ([@bib27]). Indeed, recent research also shows widening gender disparities in research project initiation ([@bib17]) and clinical-trial leadership ([@bib7]).

Our study demonstrates the importance of reinforcing institutional commitments to gender equity through policies that support the inclusion and retention of women researchers ([@bib2]; [@bib16]; [@bib27]; [@bib37]). While our study focuses on gender, other marginalized groups are likely to suffer from similar setbacks, potentially to an even higher degree. These groups are generally under-studied in the literature on productivity gaps, as they are much more difficult to identify quantitatively. Further research, with reliable data on ethnicity in particular and with an intersectional perspective, is needed.

Data on individual publication rates give us a better estimate of the effects of the pandemic on researcher productivity than most previously published analyses focusing on publication-level effects. Despite this, the data do not allow us to disentangle how much of the widening gender gap is due to attrition. If the relative share of women scientists opting out of an academic career is higher in 2020 compared to 2019, this may inflate the observed change in productivity. Future research should examine the potential changes in women’s and men’s attrition rates in closer detail. Further, the counterfactual analysis presented in [Figure 3—figure supplement 2](#fig3s2) suggests a consistent increase in the size of the gender productivity gap over time, with a marginal annual change in the gender difference from year four to five amounting to 53% of the treatment effect observed in our main analysis. The estimated change from a 17% lower output for women than men in 2019 to a 24% lower output for women than men in 2020 should thus be interpreted with some caution. However, both mechanisms (lower publication productivity and attrition) result in lower total publication outputs for women and lead to enlarged gender disparities. While we cannot currently estimate the relationship between the two mechanisms, the conclusions above remain the same.

Our study design has four limitations. First, our analysis focused on annual publishing rates, which may obscure some of the potential effects of, e.g., school closures on the immediate publishing rates. A more granular analysis of monthly publishing rates may reveal a more direct correlation between lockdowns and decreased publishing rates. However, information on when something is published is not available on a monthly basis for a large proportion of articles, and information on submission and review dates is even harder to obtain and is often missing entirely. Further, many of the delays occurring in the publishing process are out of the hands of authors and thus unrelated to the lockdown effect that they may be experiencing. By looking at annual data, we can estimate a more reliable effect overall. We strongly encourage publishers to make available transparent, open, machine- and human-accessible information on the dates on which a manuscript was received, reviewed, revised, accepted and published. Similarly, the weak relationship between country-level gender gaps and the severity of lockdown policies could be due to aggregation. Using survey data on self-reported time-use, [@bib12] show that, e.g., the fraction of days with at least partial primary school closures negatively affected time loss for women researchers relative to men in the period Feb. 16 - July 31, 2020. To compare our yearly publication data with lockdown severity, we aggregated day-to-day data on school closures, workplace closures, stay at home requirements, and overall lockdown severity across the entire year of 2020.

Second, the author-disambiguation approach used to establish individual-level panel data unavoidably introduces some level of uncertainty into our analysis, and errors are more likely to occur for individuals with East Asian names ([@bib40]) (see Materials and Methods). The country-specific evidence for China and South Korea ([Figure 5](#fig5)) should thus also be interpreted with caution.

Third, the gender-assignment algorithm used in this study did not infer the gender of 20% of the author sample. This introduces potential sampling bias into our analysis. Moreover, the algorithm reduces author gender to a binary category (woman or man), but not all individuals identify as women or men. Despite this clear limitation, we find the algorithm useful in quantifying COVID-19-related disparities on a large scale ([@bib14]).

Fourth, academic publishing is a slow endeavor, and article submissions may undergo many rounds of revisions before they are published ([@bib23]). This introduces two types of potential bias into our analysis: (a) some of the articles published in 2020 are based on research conducted in 2019; and (b) some of the research conducted in 2020 will not appear in print before 2021, or later. Thus, in the coming years, scientists should continue to monitor disparities in women’s and men’s publishing rates.

In science, even small negative kicks or setbacks may add up over time and become cumulative disadvantages ([@bib48]; [@bib8]). We observe a decreased growth in publications for all but the most productive men, and especially early-career researchers. This has the potential to reinforce disparities in an already heavily skewed system, if not given special attention, especially with regard to women. The widening gender gap in publishing observed in this study should thus be taken seriously by universities and funding agencies and factored into policies that allocate resources and support, as well as those that determine advancement and compensation, in order to mitigate inequities resulting from the unequal impact of the pandemic and its associated disruptions. Such inequities are deeply troubling both because they demonstrate how morally arbitrary characteristics like gender affect the opportunity to succeed in science and because they hinder the inclusion of diverse perspectives necessary to optimally advance scientific inquiry itself.

# Materials and methods

## Data on authors and their publications

Publication data were retrieved from the Web of Science (WoS) in-house implementation at CWTS, Leiden University. This version of the WoS has linked tables between authors, their publications and information on the probable gender of authors.

The CWTS WoS includes a high-quality disambiguated table of authors and links to their publications. This list is produced through an algorithmic identification of publication clusters, using author, publication, source and citation data ([@bib6]; [@bib13]). This algorithm greatly improves the likelihood of an author profile containing the correct links to a scientist’s publications, without including those of another author with the same name, and also including their own publications published under variations of their name. This algorithm so far has the highest precision and recall for this task ([@bib47]).

Author gender was inferred using a combination of Gender-API (<https://gender-api.com/>) and genderize (<https://genderize.io/>), in order to find the most likely gender of an author using their first name and country. The inferred gender is only applied in cases with >90% confidence, meaning that gender-ambiguous names, or names with very few observations for a country, are not included. This leads to an exclusion of 20% of all authors, the majority of them from China and South Korea, as first names in these countries tend to be less gendered than in most other countries.

Disciplines were inferred from the journal in which articles were published, using the translation table (<http://help.prod-incites.com/inCites2Live/filterValuesGroup/researchAreaSchema/oecdCategoryScheme.html>) between WoS Subject Categories and the OECD Fields of Science from the Frascati Manual (OECD Working [@bib41]). For each author, we summed the weighted major scientific fields and assigned the most frequent as their main discipline.
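The assignment rule can be illustrated with a minimal sketch. The table layout, column names and weights below are hypothetical and do not reflect the CWTS database schema; ties are broken by taking the first field, which is also an assumption made only for this example.

```r
# Hypothetical input: one row per (author, publication, field) with a weight.
# Column names and weights are illustrative assumptions, not the CWTS schema.
pubs <- data.frame(
  author = c("a1", "a1", "a1", "a2", "a2"),
  field  = c("Biology", "Chemistry", "Biology",
             "Clinical medicine", "Basic medicine"),
  weight = c(0.5, 0.5, 1.0, 1.0, 0.4),
  stringsAsFactors = FALSE
)

# Sum the weights per author and field
field_weights <- aggregate(weight ~ author + field, data = pubs, FUN = sum)

# For each author, keep the field with the largest summed weight
# (which.max returns the first maximum, so ties go to the first field listed)
main_discipline <- do.call(rbind, lapply(
  split(field_weights, field_weights$author),
  function(d) d[which.max(d$weight), c("author", "field")]
))
rownames(main_discipline) <- NULL

print(main_discipline)
```

In this toy input, author `a1` accumulates a weight of 1.5 in Biology against 0.5 in Chemistry and is therefore assigned to Biology.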

We queried the WoS for all authors with their first publication in either 2009 or 2010 (mid-career researchers) or 2016 or 2017 (early-career researchers). We excluded authors with fewer than three publications in total, and further limited the sample to authors with at least one publication in 2018 or 2019. The last step was done to create a sample of actively publishing scientists. We assigned main discipline codes to all authors and limited the sample to authors from _1.4 Chemical sciences_, _1.6 Biological sciences_, _3.1 Basic medicine_ and _3.2 Clinical medicine_. This sample consisted of 431,207 authors linked to 2,113,108 publications in the period 2016–2020. The counterfactual sample was constructed identically, but for authors with their first publication in 2011 or 2012, counting their publications until 2015. This sample included 276,793 authors linked to 1,060,330 publications.
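The inclusion criteria above can be expressed as a single filter. The following is a sketch on a made-up author table; the column names (`first_pub_year`, `n_pubs_total`, `n_pubs_2018_19`, `oecd_code`) are assumptions for illustration and are not the names used in the WoS tables.

```r
# Hypothetical author table; column names and values are illustrative only.
authors <- data.frame(
  author_id      = 1:6,
  first_pub_year = c(2009, 2016, 2013, 2017, 2010, 2016),
  n_pubs_total   = c(12, 2, 8, 5, 4, 7),
  n_pubs_2018_19 = c(3, 1, 2, 0, 1, 2),
  oecd_code      = c("3.2", "1.6", "1.4", "3.1", "2.1", "1.4"),
  stringsAsFactors = FALSE
)

cohort_years <- c(2009, 2010, 2016, 2017)   # mid-career and early-career cohorts
disciplines  <- c("1.4", "1.6", "3.1", "3.2")

sample_authors <- subset(
  authors,
  first_pub_year %in% cohort_years &  # first publication in a cohort year
    n_pubs_total >= 3 &               # at least three publications in total
    n_pubs_2018_19 >= 1 &             # actively publishing in 2018 or 2019
    oecd_code %in% disciplines        # one of the four disciplines
)

# Label the career stage implied by the year of first publication
sample_authors$career_stage <- ifelse(
  sample_authors$first_pub_year %in% c(2009, 2010),
  "mid-career", "early-career"
)

print(sample_authors)
```

Each excluded row in the toy table fails exactly one criterion, which makes it easy to see the effect of every step of the filter.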

## Difference-in-differences model

To estimate the differential impact of the COVID-19 pandemic on the gender gap in publication productivity, we leveraged a difference-in-differences strategy. Because of a persistent gender gap in the number of publications over time, we used the yearly data on journal article publications prior to 2020 as baselines for estimating how the pandemic impacted the scholarly productivity of men and women differently. Although this is not a randomized treatment, we treated the yearly gender difference in publication numbers (for 2016, 2017, 2018, and 2020) relative to the difference in 2019 as our key estimand. To estimate the average treatment effect on the treated ($ATT$), that is, the gender difference relative to the baseline 2019 difference, we specified the following regression model:

$$
{Y}_{it}={\alpha }_{i}+{\gamma }_{t}+\sum _{t\in \{-4,-3,-2,0\}}{\delta }_{t}{\text{Gender}}_{i}\cdot {\text{Year}}_{t}+{\epsilon }_{it}
$$

where ${Y}_{it}$ denotes the number of published articles by individual $i$ in year $t$, ${\alpha }_{i}$ are the author fixed effects, ${\gamma }_{t}$ are the year fixed effects, and ${\delta }_{t}$ are a set of parameters with $t\in \{-4,-3,-2,0\}$ estimating the difference in publication numbers between men and women each year, relative to the difference in 2019 ($t=-1$), which we left out of the estimation. The indicator $t$ is here the year relative to 2020. The $ATT$ for a given year $k$ relative to 2019 is then:

$$
\begin{aligned} A T T_{t=k} &=E\left[Y_{\mathrm{women}}^{1} \mid t=k\right]-E\left[Y_{\mathrm{women}}^{0} \mid t=k\right] \\ &+\left[E\left[Y_{\mathrm{women}}^{0} \mid t=k\right]-E\left[Y_{\mathrm{women}}^{0} \mid t=-1\right]\right] \\ &-\left[E\left[Y_{\mathrm{men}}^{0} \mid t=k\right]-E\left[Y_{\mathrm{men}}^{0} \mid t=-1\right]\right] \end{aligned}
$$

When used in the analysis, predicted values are the average partial effects at specified combinations of gender and year. We calculate the linear predicted value based on the regression model for each unit of observation (person i at year t), and average over these units for each specified subset of units (e.g. women in 2019 or men in 2018). This provides average predicted publications counts for each group at each time. Estimated differences in publication counts are the average marginal effects for each year derived from the regression model. The marginal effects are the partial derivative with respect to gender for each unit of observation, and the estimated average differences are then the mean of the unit-specific derivatives at each year.
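To make the specification concrete, the sketch below fits the same kind of two-way fixed-effects model on a small simulated panel using base R. The data, the assumed 2020 effect of -0.1 and all object names are invented for illustration; it is not the estimation code behind the reported results, and it omits the clustered standard errors used in the analysis.

```r
set.seed(1)

# Simulated balanced panel: 200 authors observed 2016-2020 (illustrative only)
n_authors <- 200
years <- 2016:2020
panel <- expand.grid(author = seq_len(n_authors), year = years)
panel$female <- as.integer(panel$author <= n_authors / 2)

author_effect <- rnorm(n_authors, mean = 2, sd = 0.5)  # plays the role of alpha_i
true_delta_2020 <- -0.1                                # assumed effect in 2020
panel$y <- author_effect[panel$author] +
  0.1 * (panel$year - 2016) +                          # common year trend
  true_delta_2020 * panel$female * (panel$year == 2020) +
  rnorm(nrow(panel), sd = 0.1)

# Gender x year indicators, leaving out the 2019 reference year (t = -1)
est_years <- setdiff(years, 2019)
for (yr in est_years) {
  panel[[paste0("female_", yr)]] <- panel$female * (panel$year == yr)
}

# Two-way fixed effects via author and year dummies
fit <- lm(
  y ~ female_2016 + female_2017 + female_2018 + female_2020 +
    factor(author) + factor(year),
  data = panel
)
delta <- coef(fit)[paste0("female_", est_years)]

# In a balanced panel, delta for 2020 equals the 2x2 difference in group means
cell_mean <- function(f, yr) mean(panel$y[panel$female == f & panel$year == yr])
did_2020 <- (cell_mean(1, 2020) - cell_mean(1, 2019)) -
  (cell_mean(0, 2020) - cell_mean(0, 2019))

# Average predicted publication counts by gender and year
panel$y_hat <- fitted(fit)
avg_pred <- aggregate(y_hat ~ female + year, data = panel, FUN = mean)

print(round(delta, 3))
print(avg_pred)
```

Because the model is linear, the average marginal effect of gender in a given year, relative to 2019, is simply the corresponding coefficient in `delta`. With author dummies the approach only scales to small examples; for a panel of several hundred thousand authors a dedicated fixed-effects estimator would be needed.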

## Parallel trends and counterfactual samples

Valid identification of the differential impact of the COVID-19 pandemic on researchers of different genders relies on a strong assumption of parallel trends of publication outcomes in pre-pandemic years. That is, identification of the average treatment effect on women essentially assumes that $\left[E\left[Y_{\text{women}}^{0} \mid t=k\right]-E\left[Y_{\text{women}}^{0} \mid t=-1\right]\right]-\left[E\left[Y_{\text{men}}^{0} \mid t=k\right]-E\left[Y_{\text{men}}^{0} \mid t=-1\right]\right]=0$. A large literature (e.g. [@bib21]; [@bib31]) has documented persistent gender gaps in publication productivity. Our dynamic difference-in-differences model confirms this. A consistent gap between men and women is present in all years prior to 2020 for our full sample ([Figure 2](#fig2)). This gap also tends to slightly increase over time, casting doubt on the assumption of similar publication trends for men and women scientists. [Figure 2—source data 1](#fig2sdata1) shows a statistically significant difference in the publication gender gap between 2016 and 2019, and 2017 and 2019. However, the difference is much smaller, and statistically non-significant, when comparing 2018 and 2019.

We also modeled the differential publication rates for a counterfactual sample of researchers who started publishing (or whose first publication was registered in the Web of Science database) in 2011, across the following five years. As shown in [Figure 3—figure supplement 1](#fig3s1), the gender gap in publication rates increased from almost parity in the first year to an average difference of 0.2 full publications five years after (0.05 fractionalized). Again, the gender gap increased by 1/20 of a full publication (full count: –0.05, 99% CI: \[–0.0665; –0.0337\], fractionalized count: –0.006, 99% CI: \[–0.0094; –0.0028\]) between four and five years after first publication, amounting to 53% of our $ATT$ from the full sample.

## Data on lockdown severity

To assess how the pandemic may entail different gender effects across countries and lockdown severity, we use data from the Oxford COVID-19 Government Response Tracker. We construct seven lockdown indicators at the country level by aggregating four measures of daily government COVID-policies across a whole year (from March 1st 2020 to December 31st 2020) in two ways. [Table 1](#table1) summarizes the seven indicators. We use four of the Oxford COVID-19 Government Response Tracker indicators ([@bib20]) related to the coordinated close-downs of schools (C1) or workplaces (C2), stay at home requirements (C6), and the combined policy stringency index. First, we sum the indicator value across the whole year to create a cumulative sum of restriction severity for all four indicators, such that a lockdown indicator ${L}_{k}$ is the summarized values across 305 days:

$$
{L}_{k}=\sum _{i=1}^{305}{I}_{i}
$$

table: Table 1.
:::
### Seven indicators of COVID-19 lockdown severity.

|                           | Sum indicator | Count of maximum values |
| ------------------------- | ------------- | ----------------------- |
| School lockdowns          | +             | +                       |
| Workplace lockdowns       | +             | +                       |
| Stay at home requirements | +             | +                       |
| Stringency index          | +             | -                       |
:::
{#table1}

Second, we count the number of days across the same period with the maximum indicator value for three indicators relating to school lockdowns, workplace lockdowns, and stay at home requirements. Each of these indicators can take the values 0, 1, 2, and 3 per day (where 3 indicates the most severe policy situation for the three indicators in question). For these three indicators we create a conditional sum across 305 days. We then let ${L}_{k}$ be the number of days an indicator ${I}_{1},\mathrm{\dots },{I}_{305}$ equals 3:

$$
{L}_{k}=\sum _{i=1}^{305}\left[{I}_{i}=3\right]
$$

Together, this gives us seven different indicators of lockdown severity at the national level. It is important to note that we use national-level policy indicators capturing only COVID-19 policy responses enacted at the country or federal level. In cases where sub-national policies supersede country-level restrictions, more or less severe policies are not reflected in the indicators.
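The two aggregation rules can be sketched as follows for a single policy indicator. The daily values are randomly generated for two made-up countries, and the data frame and column names are assumptions for this example rather than the layout of the Oxford COVID-19 Government Response Tracker files.

```r
set.seed(42)
n_days <- 305

# Hypothetical daily policy values (0-3) for two made-up countries
daily <- data.frame(
  country = rep(c("A", "B"), each = n_days),
  school_closure = sample(0:3, size = 2 * n_days, replace = TRUE)
)

# Sum indicator: cumulative restriction severity over the period
sum_indicator <- tapply(daily$school_closure, daily$country, sum)

# Count indicator: number of days at the maximum value (3)
max_days <- tapply(daily$school_closure == 3, daily$country, sum)

lockdown <- data.frame(
  country = names(sum_indicator),
  sum_indicator = as.vector(sum_indicator),
  max_days = as.vector(max_days)
)

print(lockdown)
```

The sum indicator rewards long periods of moderate restrictions as well as short periods of severe ones, whereas the count indicator only registers days at the most severe level, which is why both are reported where the indicator has a defined maximum of 3.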

## Heterogeneity in COVID-19 effects

To show the heterogeneity in possible COVID-19 induced treatment effects, we estimated our difference-in-differences model separately for each country, focusing on the 40 countries contributing 95% of all authors in our sample. We also investigated the degree to which this heterogeneity could be attributed to variations in the severity of policy restrictions across countries. Using the seven lockdown indicators described above, we compared country-level gender gaps with the measures of severity as shown in [Figure 5—figure supplement 1](#fig5s1) and [Figure 5—figure supplement 2](#fig5s2).
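As a rough illustration of this procedure, the sketch below selects the largest countries that jointly cover 95% of authors, computes a simplified 2019 to 2020 difference-in-differences per country, and relates it to a lockdown indicator. All data are simulated, the country labels and lockdown values are invented, and the per-country estimate is a 2x2 difference in mean changes rather than the full model. The rank correlation is used here only as one possible summary of the comparison.

```r
set.seed(7)

# Hypothetical author-level data: country, gender, and publication counts
# in 2019 and 2020. All names and values are made up for illustration.
countries <- c("A", "B", "C", "D", "E")
n_per_country <- c(400, 300, 200, 80, 20)
authors <- data.frame(
  country = rep(countries, times = n_per_country),
  female  = rbinom(sum(n_per_country), size = 1, prob = 0.4)
)
authors$y2019 <- rpois(nrow(authors), lambda = 3)
authors$y2020 <- rpois(nrow(authors), lambda = 3)

# Keep the largest countries that together contribute 95% of authors
counts <- sort(table(authors$country), decreasing = TRUE)
cum_share <- cumsum(counts) / sum(counts)
keep <- names(counts)[seq_len(which(cum_share >= 0.95)[1])]

# Simplified per-country estimate: women's mean change minus men's mean change
did_by_country <- sapply(keep, function(cn) {
  d <- authors[authors$country == cn, ]
  change <- d$y2020 - d$y2019
  mean(change[d$female == 1]) - mean(change[d$female == 0])
})

# Hypothetical country-level lockdown indicator (e.g. days at maximum severity)
lockdown <- c(A = 120, B = 200, C = 90, D = 150)
rho <- cor(did_by_country, lockdown[keep], method = "spearman")

print(round(did_by_country, 3))
print(rho)
```

Since the simulated data contain no gender effect, the per-country estimates scatter around zero and the correlation carries no meaning beyond showing the mechanics of the comparison.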