diff --git a/README.md b/README.md index fe5b59b..d531b91 100644 --- a/README.md +++ b/README.md @@ -1,16 +1,73 @@ -*Updated: 27 Sep 2020* +*Updated: 30 Sep 2020* Author: Marc Bevand -This repository contains code to: -* apply estimates of the age-stratified infection fatality ratio (IFR) of - COVID-19 to countries' population pyramids, to find their expected overall IFR -* calculate the age-stratified IFR from the Spanish ENE-COVID serosurvey +This project studies the age-stratified infection fatality ratio (IFR) of COVID-19: +* compare COVID-19 to seasonal influenza (flu) +* calculate the expected overall IFR based on countries' population pyramids +* calculate the age-stratified IFR of COVID-19 from the Spanish ENE-COVID serosurvey + +# Comparing COVID-19 to seasonal influenza + +![Infection Fatality Ratio of COVID-19 vs. Seasonal Influenza](covid_vs_flu.png) + +The above chart compares the IFR of COVID-19 to the IFR of seasonal influenza. We +find that COVID-19 is significantly more fatal than seasonal influenza at all +ages above 30 years. The source code to create this chart is +[covid_vs_flu.py](covid_vs_flu.py). The COVID-19 IFR curves represent various +estimates: + +1. ENE-COVID Spanish serosurvey (calculated by `calc_ifr.py`, see next section) +1. [US CDC](https://web.archive.org/web/20200911222029/https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html) (table 1) +1. [Verity et al.: Estimates of the severity of coronavirus disease 2019: a model-based analysis](https://www.thelancet.com/journals/laninf/article/PIIS1473-3099%2820%2930243-7/fulltext) (table 1) +1. [Levin et al.: Assessing the age specificity of infection fatality rates for COVID-19: systematic review, meta-analysis, and public policy implications](https://www.medrxiv.org/content/10.1101/2020.07.23.20160895v5) (table 3) +1. 
[Perez-Saez et al.: Serology-informed estimates of SARS-CoV-2 infection fatality risk in Geneva, Switzerland](https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30584-3/fulltext) +1. [Poletti et al.: Age-specific SARS-CoV-2 infection fatality ratio and associated risk factors, Italy, February to April 2020](https://www.eurosurveillance.org/content/10.2807/1560-7917.ES.2020.25.31.2001383) (table 1, column "Any time") +1. [Picon et al.: Coronavirus Disease 2019 Population-based Prevalence, Risk Factors, Hospitalization, and Fatality Rates in Southern Brazil](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7493765/) (table 2) +1. [Gudbjartsson et al.: Humoral Immune Response to SARS-CoV-2 in Iceland](https://www.nejm.org/doi/full/10.1056/NEJMoa2026116), + specifically [Supplementary Appendix 1](https://www.nejm.org/doi/suppl/10.1056/NEJMoa2026116/suppl_file/nejmoa2026116_appendix_1.pdf) (table S7) +1. [PHAS - Public Health Agency of Sweden: The infection fatality rate of COVID-19 in Stockholm – Technical report](https://www.folkhalsomyndigheten.se/contentassets/53c0dc391be54f5d959ead9131edb771/infection-fatality-rate-covid-19-stockholm-technical-report.pdf) (table B.1) +1. [O’Driscoll et al.: Age-specific mortality and immunity patterns of SARS-CoV-2 infection in 45 countries](https://www.medrxiv.org/content/10.1101/2020.08.24.20180851v1) (table S4) +1. [Ward et al.: Antibody prevalence for SARS-CoV-2 in England following first peak of the pandemic: REACT2 study in 100,000 adults](https://www.medrxiv.org/content/10.1101/2020.08.12.20173690v2), + specifically [Supplementary Appendix](https://www.medrxiv.org/highwire/filestream/93745/field_highwire_adjunct_files/0/2020.08.12.20173690-1.docx) (table S2a, column "Based on confirmed COVID-19 deaths") +1. [Yang et al.: Estimating the infection fatality risk of COVID-19 in New York City during the spring 2020 pandemic wave](https://www.medrxiv.org/content/10.1101/2020.06.27.20141689v2) (table 1) +1. 
[Molenberghs et al.: Belgian Covid-19 Mortality, Excess Deaths, Number of Deaths per Million, and Infection Fatality Rates](https://www.medrxiv.org/content/10.1101/2020.06.20.20136234v1) (table 6) + +The seasonal influenza IFR curves represent data from the US CDC on multiple seasons of flu: + +1. [2018-2019 influenza burden](https://www.cdc.gov/flu/about/burden/2018-2019.html) +1. [2017-2018 influenza burden](https://www.cdc.gov/flu/about/burden/2017-2018.htm) +1. [2016-2017 influenza burden](https://www.cdc.gov/flu/about/burden/2016-2017.html) +1. [2015-2016 influenza burden](https://www.cdc.gov/flu/about/burden/2015-2016.html) +1. [2014-2015 influenza burden](https://www.cdc.gov/flu/about/burden/2014-2015.html) + +However, these CDC statistics (e.g., table 1 in "2018-2019 influenza burden") +only give the estimated number of symptomatic illnesses. We must account for +asymptomatic ones as well to calculate the IFR. + +In [Key Facts About Influenza +(Flu)](https://www.cdc.gov/flu/about/keyfacts.htm) the CDC implies 55-60% of +illnesses are symptomatic: + +> «on average, about 8% of the U.S. population gets sick from flu each season, +> with a range of between 3% and 11%, depending on the season. +> [...] +> The commonly cited 5% to 20% estimate was based on a study that examined both +> symptomatic and asymptomatic influenza illness, which means it also looked at +> people who may have had the flu but never knew it because they didn’t have +> any symptoms. The 3% to 11% range is an estimate of the proportion of people +> who have symptomatic flu illness.» + +Thus, the CDC's figures imply that 55-60% of illnesses are symptomatic (3/5 = 60%, +and 11/20 = 55%). 
We use the mid-point, 57.5%, to infer the total number of +illnesses, symptomatic and asymptomatic: + +total_illnesses = symptomatic_illnesses / .575 # Age-stratified IFR applied to countries' population pyramids The script [apply_ifr.py](apply_ifr.py) uses a handful of age-stratified -IFR estimates and applies them to countries' population pyramids, to +IFR estimates (from the chart above) and applies them to countries' population pyramids, to find their expected overall IFR assuming equal prevalence of the disease among all age groups. @@ -18,16 +75,6 @@ Of course, the real-world overall IFR will dependent on many factors: varying prevalence among age groups, underlying health conditions, access to healthcare, socioeconomic status, ethnicity, etc. -IFR estimates come from: - -1. ENE-COVID Spanish serosurvey (calculated by `calc_ifr.py`, see next section) -1. [US CDC](https://web.archive.org/web/20200911222029/https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html) (table 1) -1. [Verity et al.: Estimates of the severity of coronavirus disease 2019: a model-based analysis](https://www.thelancet.com/journals/laninf/article/PIIS1473-3099%2820%2930243-7/fulltext) (table 1) -1. [Levin et al.: Assessing the age specificity of infection fatality rates for COVID-19: systematic review, meta-analysis, and public policy implications](https://www.medrxiv.org/content/10.1101/2020.07.23.20160895v5) (table 3) -1. [Gudbjartsson et al.: Humoral Immune Response to SARS-CoV-2 in Iceland](https://www.nejm.org/doi/full/10.1056/NEJMoa2026116), - [Supplementary Appendix 1](https://www.nejm.org/doi/suppl/10.1056/NEJMoa2026116/suppl_file/nejmoa2026116_appendix_1.pdf) (table S7) -1. 
[O’Driscoll et al.: Age-specific mortality and immunity patterns of SARS-CoV-2 infection in 45 countries](https://www.medrxiv.org/content/10.1101/2020.08.24.20180851v1) (table S4) - Data for the population pyramids comes from the [United Nations](https://population.un.org/wpp/Download/Standard/Population/), specifically the first sheet of [Population by Age Groups - Both Sexes](https://population.un.org/wpp/Download/Files/1_Indicators%20%28Standard%29/EXCEL_FILES/1_Population/WPP2019_POP_F07_1_POPULATION_BY_AGE_BOTH_SEXES.xlsx). This excel file was converted to CSV format: diff --git a/covid_vs_flu.png b/covid_vs_flu.png new file mode 100644 index 0000000..f0ae2dd Binary files /dev/null and b/covid_vs_flu.png differ diff --git a/covid_vs_flu.py b/covid_vs_flu.py new file mode 100755 index 0000000..b28a6b7 --- /dev/null +++ b/covid_vs_flu.py @@ -0,0 +1,375 @@ +#!/usr/bin/python3 +# +# Author: Marc Bevand — @zorinaq + +import sys +import matplotlib.pyplot as plt +import matplotlib.ticker as ticker +import numpy as np +import scipy.stats +from scipy.optimize import curve_fit + +maxage = 100 + +# Age-stratified IFR estimates for COVID-19 +ifrs_covid = [ + + # Calculated from Spanish ENE-COVID study + # (see calc_ifr.py) + ('ENE-COVID', { + (0,9): 0.003, + (10,19): 0.004, + (20,29): 0.015, + (30,39): 0.030, + (40,49): 0.064, + (50,59): 0.213, + (60,69): 0.718, + (70,79): 2.384, + (80,89): 8.466, + (90,maxage): 12.497, + }), + + # US CDC estimate as of 10 Sep 2020 + # https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html + # (table 1) + ('US CDC', { + (0,19): 0.003, + (20,49): 0.02, + (50,69): 0.5, + (70,maxage): 5.4, + }), + + # Verity et al. + # https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30243-7/fulltext + # (table 1) + ('Verity', { + (0,9): 0.00161, + (10,19): 0.00695, + (20,29): 0.0309, + (30,39): 0.0844, + (40,49): 0.161, + (50,59): 0.595, + (60,69): 1.93, + (70,79): 4.28, + (80,maxage): 7.80, + }), + + # Levin et al. 
+ # https://www.medrxiv.org/content/10.1101/2020.07.23.20160895v5 + # (table 3) + ('Levin', { + (0,34): 0.004, + (35,44): 0.06, + (45,54): 0.2, + (55,64): 0.7, + (65,74): 2.3, + (75,84): 7.6, + (85,maxage): 22.3, + }), + + # Perez-Saez et al. + # https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30584-3/fulltext + ('Perez-Saez', { + (5,9): 0.0016, + (10,19): 0.00032, + (20,49): 0.0092, + (50,64): 0.14, + (65,maxage): 5.6, + }), + + # Poletti et al. + # https://www.eurosurveillance.org/content/10.2807/1560-7917.ES.2020.25.31.2001383 + # (table 1, column "Any time") + ('Poletti', { + (0,19): 0, + (20,49): 0, + (50,59): 0.46, + (60,69): 1.42, + (70,79): 6.87, + (80,maxage): 18.35, + }), + + # Picon et al. + # https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7493765/ + # (table 2) + ('Picon', { + (20,39): 0.08, + (40,59): 0.24, + (60,maxage): 4.63, + }), + + # Gudbjartsson et al.: Humoral Immune Response to SARS-CoV-2 in Iceland + # https://www.nejm.org/doi/full/10.1056/NEJMoa2026116 + # Supplementary Appendix 1 + # https://www.nejm.org/doi/suppl/10.1056/NEJMoa2026116/suppl_file/nejmoa2026116_appendix_1.pdf + # (table S7) + ('Gudbjartsson', { + (0,70): 0.1, + (71,80): 2.4, + (81,maxage): 11.2, + }), + + # Public Health Agency of Sweden + # https://www.folkhalsomyndigheten.se/contentassets/53c0dc391be54f5d959ead9131edb771/infection-fatality-rate-covid-19-stockholm-technical-report.pdf + # (table B.1) + ('PHAS', { + (0,49): 0.01, + (50,59): 0.27, + (60,69): 0.45, + (70,79): 1.92, + (80,89): 7.20, + (90,maxage): 16.21, + }), + + # O’Driscoll et al.: Age-specific mortality and immunity patterns of SARS-CoV-2 infection in 45 countries + # https://www.medrxiv.org/content/10.1101/2020.08.24.20180851v1 + # (table S4) + ('O’Driscoll', { + (0,4): 0.002, + (5,9): 0, + (10,14): 0, + (15,19): 0.002, + (20,24): 0.004, + (25,29): 0.009, + (30,34): 0.017, + (35,39): 0.029, + (40,44): 0.053, + (45,49): 0.086, + (50,54): 0.154, + (55,59): 0.241, + (60,64): 0.359, + (65,69): 
0.642, + (70,74): 1.076, + (75,79): 2.276, + (80,maxage): 7.274, + }), + + # Ward et al.: Antibody prevalence for SARS-CoV-2 in England following first peak of the pandemic: REACT2 study in 100,000 adults + # https://www.medrxiv.org/content/10.1101/2020.08.12.20173690v2 + # Supplementary Appendix + # https://www.medrxiv.org/highwire/filestream/93745/field_highwire_adjunct_files/0/2020.08.12.20173690-1.docx + # (table S2a, column "Based on confirmed COVID-19 deaths") + ('REACT2', { + (15,44): 0.03, + (45,64): 0.52, + (65,74): 3.87, + (75,maxage): 18.71, + }), + + # Yang et al.: Estimating the infection fatality risk of COVID-19 in New York City during the spring 2020 pandemic wave + # https://www.medrxiv.org/content/10.1101/2020.06.27.20141689v2 + # (table 1) + ('Yang', { + (0,24): 0.0097, + (25,44): 0.12, + (45,64): 0.94, + (65,74): 4.87, + (75,maxage): 14.17, + }), + + # Molenberghs et al.: Belgian Covid-19 Mortality, Excess Deaths, Number of Deaths per Million, and Infection Fatality Rates + # https://www.medrxiv.org/content/10.1101/2020.06.20.20136234v1 + # (table 6) + ('Molenberghs', { + (0,24): 0.0005, + (25,44): 0.017, + (45,64): 0.21, + (65,74): 2.24, + (75,84): 4.29, + (85,maxage): 11.77, + }), + +] + +# In the CDC influenza burden pages (eg. table 1 in +# https://www.cdc.gov/flu/about/burden/2018-2019.html), only symptomatic +# illnesses are estimated. We must account for asymptomatic ones as well. +# +# In https://www.cdc.gov/flu/about/keyfacts.htm the CDC implies 55-60% of +# illnesses are symptomatic: +# +# «on average, about 8% of the U.S. population gets sick from flu each season, +# with a range of between 3% and 11%, depending on the season. +# [...] +# The commonly cited 5% to 20% estimate was based on a study that examined both +# symptomatic and asymptomatic influenza illness, which means it also looked at +# people who may have had the flu but never knew it because they didn’t have +# any symptoms. 
The 3% to 11% range is an estimate of the proportion of people +# who have symptomatic flu illness.» +# +# The CDC thus acknowledges that 55-60% of illnesses are symptomatic: +# 3/5 = 60% +# 11/20 = 55% +# +# We use the mid-point, 57.5%, as an estimate to account for both symptomatic +# and asymptomatic illnesses. +cdc_sympt = .575 + +# Age-stratified IFR estimates for seasonal influenza +ifrs_flu = [ + + # US CDC 2018-2019 influenza burden + # https://www.cdc.gov/flu/about/burden/2018-2019.html + ('US CDC 2018-2019', { + (0,4): 266/3_633_104 * 100 * cdc_sympt, + (5,17): 211/7_663_310 * 100 * cdc_sympt, + (18,49): 2_450/11_913_203 * 100 * cdc_sympt, + (50,64): 5_676/9_238_038 * 100 * cdc_sympt, + (65,maxage): 25_555/3_073_227 * 100 * cdc_sympt, + }), + + # US CDC 2017-2018 influenza burden + # https://www.cdc.gov/flu/about/burden/2017-2018.htm + ('US CDC 2017-2018', { + (0,4): 115/3_678_342 * 100 * cdc_sympt, + (5,17): 528/7_512_601 * 100 * cdc_sympt, + (18,49): 2_803/14_428_065 * 100 * cdc_sympt, + (50,64): 6_751/13_237_932 * 100 * cdc_sympt, + (65,maxage): 50_903/5_945_690 * 100 * cdc_sympt, + }), + + # US CDC 2016-2017 influenza burden + # https://www.cdc.gov/flu/about/burden/2016-2017.html + ('US CDC 2016-2017', { + (0,4): 126/2_381_218 * 100 * cdc_sympt, + (5,17): 125/6_452_110 * 100 * cdc_sympt, + (18,49): 1_365/9_292_804 * 100 * cdc_sympt, + (50,64): 3_780/7_448_184 * 100 * cdc_sympt, + (65,maxage): 32_833/3_646_206 * 100 * cdc_sympt, + }), + + # US CDC 2015-2016 influenza burden + # https://www.cdc.gov/flu/about/burden/2015-2016.html + ('US CDC 2015-2016', { + (0,4): 180/2_195_276 * 100 * cdc_sympt, + (5,17): 88/4_140_269 * 100 * cdc_sympt, + (18,49): 1_703/9_121_242 * 100 * cdc_sympt, + (50,64): 3_277/6_640_358 * 100 * cdc_sympt, + (65,maxage): 17_458/1_407_174 * 100 * cdc_sympt, + }), + + # US CDC 2014-2015 influenza burden + # https://www.cdc.gov/flu/about/burden/2014-2015.html + ('US CDC 2014-2015', { + (0,4): 396/3_207_314 * 100 * cdc_sympt, + (5,17): 
407/6_388_401 * 100 * cdc_sympt, + (18,49): 985/8_606_083 * 100 * cdc_sympt, + (50,64): 4_780/7_283_766 * 100 * cdc_sympt, + (65,maxage): 44_808/4_679_888 * 100 * cdc_sympt, + }), + +] + +def col(is_covid, i): + if is_covid: + return plt.cm.OrRd(255 - i * 8) + else: + return plt.cm.Blues(255 - i * 30) + +def plot(ax, ifrs, is_covid): + lstyles = ('solid', 'dashed', 'dashdot', 'dotted') + markers = ('o', 's', 'v', '^', '<', '>', 'P', '*', 'X', 'D', 'p') + i = 0 + for ifr in ifrs: + name, ifr_by_age = ifr + x, y = [], [] + for age_group, ifr_val in sorted(ifr_by_age.items()): + # place the marker at the middle (mean) of the age group + x.append(np.mean(age_group)) + y.append(ifr_val) + ax.plot(x, y, color=col(is_covid, i), label=name, lw=1, alpha=.8, + marker=markers[i % len(markers)], ms=4, + ls=lstyles[i % len(lstyles)]) + i += 1 + +def interpolate(age, x1, y1, x2, y2): + def func_exp(x, a, b): + return a * (b ** x) + popt, pcov = curve_fit(func_exp, [x1, x2], [y1, y2]) + return func_exp(age, *popt) + +def ifr_for_model(age, ifr_model): + # calculate IFR for age + m_prev = ifr_prev = None + # iterate over the age groups in order + for age_group, ifr in sorted(ifr_model[1].items()): + m = np.mean(age_group) + if m == age: + return ifr + if m > age: + if ifr_prev is None: + sys.stderr.write(f'{ifr_model[0]}: no data, age {age} too young\n') + return None + if ifr_prev == 0 or ifr == 0: + sys.stderr.write(f'{ifr_model[0]}: ignoring IFR zero for age {age}\n') + return None + return interpolate(age, m_prev, ifr_prev, m, ifr) + m_prev, ifr_prev = m, ifr + sys.stderr.write(f'{ifr_model[0]}: no data, age {age} too old\n') + return None + +def mean_ifr(age, ifr_models): + # calculate the geometric mean of the IFR estimates for age + values = [] + for ifr_model in ifr_models: + ifr = ifr_for_model(age, ifr_model) + if ifr is not None: + values.append(ifr) + return scipy.stats.gmean(values) + +def plot_comp(ax): + for age in np.arange(30, 90, 10): + y1 = mean_ifr(age, ifrs_flu) + 
y2 = mean_ifr(age, ifrs_covid) + assert not np.isnan(y1) and not np.isnan(y2) + ax.annotate('', xy=(age, y1), xytext=(age, y2), + arrowprops=dict(arrowstyle='|-|', shrinkA=0, shrinkB=0, + alpha=.7)) + ax.text(age, y1 * .6, f'{y2/y1:.1f}×', ha='center', va='top', + weight='bold', size=12, alpha=.7) + +def main(): + (fig, ax) = plt.subplots(dpi=300, figsize=(8,6)) + # plot ifrs_covid + plot(ax, ifrs_covid, True) + ax.text(3, 35, 'COVID-19:') + handles, labels = fig.gca().get_legend_handles_labels() + first_legend = ax.legend(handles=handles, labels=labels, loc='upper left', + frameon=False, fontsize='x-small', handlelength=5) + fig.gca().add_artist(first_legend) + # plot ifrs_flu + plot(ax, ifrs_flu, False) + # plot vertical comparison bars + plot_comp(ax) + ax.semilogy() + ax.grid(True, which='minor', linewidth=0.1) + ax.grid(True, which='major', linewidth=0.3) + ax.spines['top'].set_visible(False) + ax.spines['bottom'].set_visible(True) + ax.spines['left'].set_visible(True) + ax.spines['right'].set_visible(False) + ax.set_ylabel('IFR (%)') + ax.set_xlabel('Age') + ax.set_xlim(left=0) + ax.xaxis.set_minor_locator(ticker.MultipleLocator(base=5)) + ax.xaxis.set_major_locator(ticker.MultipleLocator(base=10)) + ax.yaxis.set_major_formatter(ticker.FormatStrFormatter('%g')) + ax.text(75, .0016, 'Seasonal Influenza:') + handles, labels = fig.gca().get_legend_handles_labels() + x = len(ifrs_flu) + ax.legend(handles=handles[-x:], labels=labels[-x:], loc='lower right', + frameon=False, fontsize='x-small', handlelength=5) + fig.suptitle('Infection Fatality Ratio of COVID-19 vs. 
Seasonal Influenza') + ax.text(0, -0.11, + 'Source: https://github.com/mbevand/covid19-age-stratified-ifr\n' + 'Note: the vertical lines on two COVID-19 IFR curves (Poletti and O’Driscoll) are caused by the IFR being\n' + 'estimated to be zero by Poletti for age groups 0-19 and 20-49, and by O’Driscoll for 5-9 and 10-14.\n', + transform=ax.transAxes, fontsize='small', verticalalignment='top', + ) + ax.text(1, 1, 'Created by: Marc Bevand — @zorinaq', + transform=ax.transAxes, fontsize='xx-small', va='top', ha='right') + fig.savefig('covid_vs_flu.png', bbox_inches='tight') + plt.close() + +if __name__ == '__main__': + main()
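
The three calculations this commit relies on (the symptomatic-to-total adjustment for flu, the exponential interpolation between two age-group midpoints, and the geometric mean across estimates) can be sketched without NumPy or SciPy. This is only an illustration, not the code in `covid_vs_flu.py`: the function names `flu_ifr_percent`, `interpolate_exp` and `geometric_mean` are hypothetical, and the interpolation uses the closed-form solution for a curve `a * b**x` through two points, whereas the script fits the same curve with `curve_fit`.

```python
import math

# Assumed symptomatic fraction: midpoint of the 55-60% range derived in the README.
CDC_SYMPT = 0.575


def flu_ifr_percent(deaths, symptomatic_illnesses, sympt_fraction=CDC_SYMPT):
    """Return the IFR in percent, counting asymptomatic illnesses too."""
    total_illnesses = symptomatic_illnesses / sympt_fraction
    return deaths / total_illnesses * 100


def interpolate_exp(age, x1, y1, x2, y2):
    """Evaluate at `age` the curve y = a * b**x passing through both points.

    Both y1 and y2 must be strictly positive, which is why the script
    skips estimates whose IFR is zero.
    """
    b = (y2 / y1) ** (1 / (x2 - x1))
    a = y1 / b ** x1
    return a * b ** age


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == '__main__':
    # 2018-2019 season, ages 65+: 25,555 deaths, 3,073,227 symptomatic illnesses
    print(f'{flu_ifr_percent(25_555, 3_073_227):.3f}')
    # Verity estimate at age 60, between the 50-59 and 60-69 midpoints
    print(f'{interpolate_exp(60, 54.5, 0.595, 64.5, 1.93):.3f}')
```

Because the curve is exponential, this interpolation is linear on the chart's logarithmic IFR axis, so interpolated values fall on the straight segments drawn between markers.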