Update w/results from 2nd round of Spanish survey

Marc Bevand
2020-06-10 09:40:36 -07:00
parent 3e69ef4ee7
commit 8c1eabc260
2 changed files with 75 additions and 66 deletions
+36 -26
@@ -1,40 +1,49 @@
# Calculating the age-stratified infection fatality ratio (IFR) of COVID-19
*Updated: 09 June 2020*
*Updated: 10 June 2020*
Author: Marc Bevand
The [largest serological prevalence survey][sero] of COVID-19 was conducted in
Spain on 60 897 valid samples between 27 April and 11 May. We used its results
to calculate the overall and age-stratified IFR of COVID-19 with the Python
script `calc_ifr.py`:
The largest serological prevalence survey of COVID-19 was conducted by Spain
during the second round of a study that analyzed 63 564 samples between 18 May
2020 and 01 June 2020. We used its [provisional results][sero] published on 03
June to calculate the overall and age-stratified IFR of COVID-19 with the
Python script `calc_ifr.py`:
```
$ ./calc_ifr.py
Ages 0 to 9: 109803 infected, 3 deaths, 0.003% IFR
Ages 10 to 19: 180401 infected, 7 deaths, 0.004% IFR
Ages 20 to 29: 216507 infected, 33 deaths, 0.015% IFR
Ages 30 to 39: 261550 infected, 87 deaths, 0.033% IFR
Ages 40 to 49: 436122 infected, 281 deaths, 0.065% IFR
Ages 50 to 59: 412847 infected, 864 deaths, 0.209% IFR
Ages 60 to 69: 313907 infected, 2363 deaths, 0.753% IFR
Ages 70 to 79: 256631 infected, 6470 deaths, 2.521% IFR
Ages 80 to 89: 123416 infected, 10982 deaths, 8.898% IFR
Ages 90 to 199: 33807 infected, 5654 deaths, 16.724% IFR
Ages 0 to 199: 2344992 infected, 26744 deaths, 1.140% IFR
Ages 0 to 9: 115013 infected, 4 deaths, 0.003% IFR
Ages 10 to 19: 177929 infected, 7 deaths, 0.004% IFR
Ages 20 to 29: 212099 infected, 32 deaths, 0.015% IFR
Ages 30 to 39: 281290 infected, 86 deaths, 0.030% IFR
Ages 40 to 49: 447942 infected, 287 deaths, 0.064% IFR
Ages 50 to 59: 410213 infected, 874 deaths, 0.213% IFR
Ages 60 to 69: 334709 infected, 2404 deaths, 0.718% IFR
Ages 70 to 79: 270572 infected, 6451 deaths, 2.384% IFR
Ages 80 to 89: 131703 infected, 11150 deaths, 8.466% IFR
Ages 90 to 199: 46631 infected, 5827 deaths, 12.497% IFR
Ages 0 to 199: 2428102 infected, 27121 deaths, 1.117% IFR
```
The average IFR for Spain is **1.140%**. However, the true IFR may be higher due
The average IFR for Spain is **1.117%**. However, the true IFR may be higher due
to right-censoring and under-reporting of deaths.
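As a minimal sketch of the arithmetic behind these figures (not the author's script), the IFR is the number of deaths divided by the estimated number of infections. The snippet below recomputes it from the rounded second-round values printed above. The helper name `ifr_percent` is an assumption made for illustration. Per-bracket results recomputed this way can differ in the last digit, likely because the script works with unrounded values.

```python
def ifr_percent(deaths, infected):
    """Return the infection fatality ratio as a percentage."""
    return 100.0 * deaths / infected


# Rounded values taken from the second-round output above
overall = ifr_percent(27121, 2428102)   # all ages
ages_0_9 = ifr_percent(4, 115013)       # ages 0 to 9

print(f"Overall: {overall:.3f}% IFR")
print(f"Ages 0 to 9: {ages_0_9:.3f}% IFR")
```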
The Spanish serological study was conducted between 27 April 2020 and 11 May 2020 and
remains the largest published study available to this day. The age-stratified
IFR was calculated from three sources:
The Spanish serological study remains the largest published study available to
this day. The age-stratified IFR was calculated from three sources:
1. Detailed *prevalence data for age brackets*, from the [serosurvey][sero] (page 8)
1. *Total deaths* and *deaths per age bracket* from the [Ministry of Health's daily report for 11 May][deaths] (page 1 and table 3)
1. Detailed *prevalence data for age brackets*, from the [serosurvey][sero] (table 1)
1. *Total deaths* and *deaths per age bracket* from the [Ministry of Health's daily report for 29 May][daily] (table 2 and table 3)
1. *Population pyramid* for Spain, from [worldpopulationreview.com][wpop]
To minimize right-censoring, the parameters *total deaths* and *deaths
per age bracket* should be obtained from a point in time as close as possible
to when the serosurvey was conducted (18 May to 01 June). We found only two
Ministry of Health reports in this time period that document deaths per age
bracket: [18 May][dailyalt] and [29 May][daily]. However, the Ministry of Health
made significant corrections to its death statistics on 25 May, subtracting
approximately 2 000 deaths. Therefore we trusted the statistics from 29 May over
those of 18 May.
Important detail to note: there were 26 744 total deaths, but age information
was available for only 18 722 of them and was missing for the other 8 022.
We assume that these 8 022 deaths were distributed proportionally—not equally—among age
@@ -47,18 +56,19 @@ another population pyramid, thus calculating the expected average IFR for other
countries.
In the second half of the script, edit `pyramid_target` with the demographics data.
As an example, we supply pyramid data for the United States and calculate an IFR of **0.721%**:
As an example, we supply pyramid data for the United States and calculate an IFR of **0.658%**:
```
$ ./calc_ifr.py
[...]
IFR on target country assuming disease prevalence equal among ages: 0.721%
IFR on target country assuming disease prevalence equal among ages: 0.658%
```
However, IFR depends heavily on factors other than age (availability
of healthcare, population health, etc.), so this estimate should be interpreted
with caution.
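The re-weighting idea can be sketched as follows. If prevalence is assumed equal among ages, each age bracket's share of infections equals its share of the population, so the expected overall IFR is the population-weighted average of the age-stratified IFRs. This is an illustration only, not the code in `calc_ifr.py`: the function name `reweighted_ifr` and the population counts in `pyramid_example` are hypothetical placeholders, and the IFR values are the rounded second-round figures printed earlier.

```python
# Age-stratified IFR in %, rounded second-round values from the output above
ifr_by_age = {
    (0, 9): 0.003, (10, 19): 0.004, (20, 29): 0.015, (30, 39): 0.030,
    (40, 49): 0.064, (50, 59): 0.213, (60, 69): 0.718, (70, 79): 2.384,
    (80, 89): 8.466, (90, 199): 12.497,
}


def reweighted_ifr(ifr_by_age, population_by_age):
    """Population-weighted average IFR (in %), assuming equal prevalence among ages."""
    total_population = sum(population_by_age.values())
    weighted = sum(ifr_by_age[b] * population_by_age[b] for b in ifr_by_age)
    return weighted / total_population


# Hypothetical placeholder pyramid (NOT real demographic data)
pyramid_example = {
    (0, 9): 120, (10, 19): 125, (20, 29): 135, (30, 39): 130,
    (40, 49): 125, (50, 59): 130, (60, 69): 110, (70, 79): 70,
    (80, 89): 30, (90, 199): 5,
}

print(f"Expected IFR on the example pyramid: {reweighted_ifr(ifr_by_age, pyramid_example):.3f}%")
```

A younger pyramid puts less weight on the high-IFR brackets, which is why the same age-stratified ratios can yield a lower overall IFR for another country.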
[sero]: https://www.mscbs.gob.es/gabinetePrensa/notaPrensa/pdf/13.05130520204528614.pdf
[deaths]: https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_102_COVID-19.pdf
[sero]: https://portalcne.isciii.es/enecovid19/ene_covid19_inf_pre2.pdf
[daily]: https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_120_COVID-19.pdf
[dailyalt]: https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_109_COVID-19.pdf
[wpop]: https://worldpopulationreview.com/countries/spain-population/
+39 -40
@@ -1,57 +1,56 @@
#!/usr/bin/python3
#
# Calculate the age-stratified IFR based on the Spanish serosurvey of 60897 participants.
# Calculate the age-stratified IFR based on the second round of the Spanish
# serosurvey of 63564 participants.
# Author: Marc Bevand — @zorinaq
# Prevalence of antibodies by age bracket, in % (serosurvey dates: 27-April-2020 to 11-May-2020)
# Source: https://www.mscbs.gob.es/gabinetePrensa/notaPrensa/pdf/13.05130520204528614.pdf (page 8)
# Prevalence of antibodies by age bracket, in % (serosurvey dates: 18-May-2020 to 01-June-2020)
# Source: https://portalcne.isciii.es/enecovid19/ene_covid19_inf_pre2.pdf (table 1)
prevalence_by_age = {
(0,0): 1.1,
(1,4): 2.2,
(5,9): 3.0,
(10,14): 3.9,
(0,0): 2.2,
(1,4): 2.4,
(5,9): 2.9,
(10,14): 3.8,
(15,19): 3.8,
(20,24): 4.5,
(25,29): 4.8,
(30,34): 3.8,
(35,39): 4.6,
(40,44): 5.3,
(45,49): 5.7,
(50,54): 5.8,
(55,59): 6.1,
(60,64): 5.9,
(65,69): 6.2,
(70,74): 6.9,
(75,79): 6.1,
(20,24): 4.2,
(25,29): 4.9,
(30,34): 4.4,
(35,39): 4.7,
(40,44): 5.4,
(45,49): 5.9,
(50,54): 6.1,
(55,59): 5.7,
(60,64): 6.3,
(65,69): 6.6,
(70,74): 7.3,
(75,79): 6.4,
(80,84): 5.1,
(85,89): 5.6,
(90,199): 5.8,
(85,89): 6.4,
(90,199): 8.0,
}
# Total deaths, and number of deaths by age bracket (as of 11-May-2020)
# Source: https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_102_COVID-19.pdf (page 1 and table 3)
# Total deaths (26744) differs from the total for all age brackets (18722)
# because age information is not available for 8022 deaths, as explained in
# table 3 header: «Distribución de casos hospitalizados, ingresados en UCI y
# fallecidos por grupos de edad y sexo información disponible»
total_deaths = 26744
# Total deaths, and number of deaths by age bracket (as of 29-May-2020)
# Source: https://www.mscbs.gob.es/profesionales/saludPublica/ccayes/alertasActual/nCov-China/documentos/Actualizacion_120_COVID-19.pdf (table 2 and table 3)
# Total deaths (27121) differs from the total for all age brackets (20585)
# because age information is not available for 6536 deaths
total_deaths = 27121
deaths_by_age = {
(0,9): 2,
(0,9): 3,
(10,19): 5,
(20,29): 23,
(30,39): 61,
(40,49): 197,
(50,59): 605,
(60,69): 1654,
(70,79): 4529,
(80,89): 7688,
(90,199): 3958,
(20,29): 24,
(30,39): 65,
(40,49): 218,
(50,59): 663,
(60,69): 1825,
(70,79): 4896,
(80,89): 8463,
(90,199): 4423,
}
deaths_by_age[(0,199)] = total_brackets = sum(deaths_by_age.values()) # 18722
deaths_by_age[(0,199)] = total_brackets = sum(deaths_by_age.values()) # 20585
# To properly calculate the IFR, we need to account for the extra 8022 deaths
# To properly calculate the IFR, we need to account for the extra 6536 deaths
# for which age information was not available, so we simply assume they are
# distributed proportionally among age brackets
# distributed proportionally (not equally) among age brackets
for bracket in deaths_by_age:
deaths_by_age[bracket] *= (total_deaths / total_brackets)
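The proportional redistribution performed by this loop can be checked in isolation. The sketch below is a standalone illustration under the second-round figures quoted in this diff (27121 total deaths, 20585 with known age). The names `scale` and `scaled` are introduced here for illustration and are not part of the script.

```python
# Deaths with known age, by bracket (29 May report figures from this diff)
deaths_by_age = {
    (0, 9): 3, (10, 19): 5, (20, 29): 24, (30, 39): 65, (40, 49): 218,
    (50, 59): 663, (60, 69): 1825, (70, 79): 4896, (80, 89): 8463,
    (90, 199): 4423,
}
total_deaths = 27121

known = sum(deaths_by_age.values())   # deaths with age information
missing = total_deaths - known        # deaths without age information

# Proportional (not equal) redistribution: every bracket grows by the same factor,
# so brackets with more known deaths absorb more of the missing ones.
scale = total_deaths / known
scaled = {bracket: d * scale for bracket, d in deaths_by_age.items()}

for bracket, d in scaled.items():
    print(f"Ages {bracket[0]} to {bracket[1]}: {d:.0f} deaths")
```

An equal split would instead add the same number of deaths to every bracket, which would greatly overstate deaths among the young.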