
Some graphs

This might be a bit technical and long… You have been warned…

I have been working on repeating the analysis done by the Financial Times on plotting cumulative cases and deaths from COVID-19, to illustrate what has been happening in different countries over the last month or so. The FT analysis is free to view but not to republish, and I also wanted to do some more analysis of my own.

I am using global and freely available data from Johns Hopkins University. The data are what they are – there are many caveats, reporting rates change in different countries, and there is clearly massive underreporting. This is particularly difficult for deaths, as countries use very different definitions of what a death from COVID-19 means – nobody dies directly from the virus, but from complications following the infection. Keeping this in mind, let’s look at some countries (more countries are shown in the FT graphs; if your favourite one is missing here, please let me know and I can add it).

The first plot shows the cumulative number of reported cases (i.e. all cases up to that day) since the 100-case threshold was reached. In other words, we shift the notifications so that ‘day 0’ corresponds to the actual date when the country reached 100 cases. Some countries will have reached this level earlier (China did so before 22nd January, when the records start) and some later; we are simply tracing the growth of each epidemic from the same starting point. The advantage of this is that, hopefully, it will allow us to discover a common pattern: if the UK follows Italy, but is say 2 weeks later, on our graph the two curves will be very similar.
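For the technically minded, the shift to ‘day 0’ can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind the graphs: the function name `align_to_threshold` and the numbers are made up, and I am assuming that day 0 is the first day with at least 100 reported cases.

```python
def align_to_threshold(cumulative, threshold=100):
    """Return the part of a cumulative series from 'day 0' onwards.

    Day 0 is taken to be the first day on which the count reaches
    `threshold` (an assumption made for this sketch). Returns an empty
    list if the threshold is never reached.
    """
    for day, count in enumerate(cumulative):
        if count >= threshold:
            return cumulative[day:]
    return []


# Made-up daily cumulative counts, for illustration only (not real data).
early_country = [80, 150, 300, 600, 1200]
late_country = [0, 0, 10, 60, 130, 270]

early_aligned = align_to_threshold(early_country)
late_aligned = align_to_threshold(late_country)

# Index 0 of each aligned list is now that country's own day 0,
# so the two series can be drawn on the same horizontal axis.
print(early_aligned)
print(late_aligned)
```

After the shift, the calendar date no longer matters: position 3 in each list means ‘three days after reaching 100 cases’ for both countries.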

I will do one more thing and use a logarithmic scale on the vertical axis. This has to do with the fact that when diseases like COVID-19 spread, they tend to grow a bit like compound interest in your bank account. Thus, the growth gets faster and faster: the more cases we have on a particular day, the proportionally more cases will come the next day. It is useful to think in terms of the time it takes to double the number of cases, and they can double every day, every two days, three days, a week, two weeks, etc. Of course, if the cases double every day, they grow much faster than if they double every week, for example. The advantage of a logarithmic scale is that if the cases double at a constant rate, whether every day or every week, the graph looks like a straight line (the faster the doubling, the steeper the line), and hence this behaviour is easy to spot in the data.
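To make the compound-interest picture concrete, here is a small Python sketch of growth with a fixed doubling time. The function names `cases_after` and `estimate_doubling_time` are my own inventions for this illustration, and the model assumes perfectly regular doubling, which real data never follow exactly.

```python
import math


def cases_after(days, start=100.0, doubling_days=3.0):
    """Cases after `days` days if the count doubles every `doubling_days` days."""
    return start * 2 ** (days / doubling_days)


def estimate_doubling_time(earlier, later, days_between):
    """Doubling time implied by two cumulative counts `days_between` days apart.

    Assumes steady exponential growth between the two counts.
    """
    return days_between * math.log(2) / math.log(later / earlier)


# On a logarithmic scale each day adds the same amount, log10(2) / doubling_days,
# which is why steady doubling shows up as a straight line.
for doubling_days in (1, 3, 7):
    log_values = [math.log10(cases_after(d, doubling_days=doubling_days))
                  for d in range(4)]
    daily_steps = [b - a for a, b in zip(log_values, log_values[1:])]
    print(doubling_days, [round(step, 3) for step in daily_steps])

# Going from 100 to 800 cases in 9 days is three doublings, i.e. one every 3 days.
print(estimate_doubling_time(100, 800, 9))
```

The daily steps within each row come out equal, and they shrink as the doubling time gets longer, which matches the steeper and shallower reference lines on the graphs.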

So, let’s have a look at the first set of graphs, showing the cumulative number of cases since the 100th case.

I grouped the countries roughly according to their general patterns:

Cumulative cases since the 100th case, up to 31st March 2020. Broken straight lines show exponential growth with doubling times of one day, three days and a week.

Each step line in the figure above is the record for a single country, starting on the bottom left on the day it recorded more than 100 cases, and ending on 31st March at the end of the line on the top right. Thus, China’s record (black line in the first graph) starts at over 500 cases on 22nd January and ends with 82 279 cases on 31st March. South Korea (red) recorded 100 cases later than 22nd January; its epidemic has therefore lasted for fewer days, and so the red line is shorter than China’s. Malta (light blue) only recorded 100 cases a few days ago, and so its line is very short. The end of every line corresponds to 31st March.

The first of these graphs includes countries where control seems to be working to some extent. China, shown in black, started with very steep growth, doubling the number of cases every day, but quickly managed to slow down until the cases stopped growing (this is the horizontal line extending to the right). South Korea (red) is similar, as it implemented successful control measures quite early; it differs from China in that the number of cases is still growing, although quite slowly. Japan (green), Singapore (blue) and Malta (light blue) are examples of countries where, for very different reasons, the spread is slow, with the number of cases doubling every week or more. Both South Korea and Singapore used testing to slow down the progress of the disease, and people in Japan may have been protected by the BCG vaccine.

The second group of countries is classed here as ‘Western European countries’, although I am sure there will be others that have a very similar trajectory. The number of cases roughly doubles every three days or faster, although there is some evidence of a slowdown (Italy: black, Spain: red, France: green, Germany: light blue), except in the UK (dark blue line), which still largely follows a straight line. Interestingly, Scotland (black dots) appears to be progressing more slowly than the rest of the UK.

The third graph shows some Central/Eastern European countries, mainly because I am interested in how Poland (red line) fares. The overall numbers and growth rates are somewhat smaller than in the previous graph, and the trajectories seem to slow down earlier. Maybe this is because these countries started their lockdowns earlier.

The final graph is a slightly mixed bag. There is the US (black line), where the growth is faster than for the UK and other Western European countries and does not show much slowing down. Brazil (red) initially had a similar rate of spread but is perhaps now slowing down. Finally, in India (blue, again because I am interested in how the virus spreads there) the spread is relatively slow, but is picking up now.

Cumulative deaths since the 10th death, up to 31st March 2020. Broken straight lines show exponential growth with doubling times of one day, three days and a week.

Finally, the death records (which start at 10 deaths rather than 100 cases) tell a similar story, although with little sign of the growth stopping except in China. This could be because deaths occur several days or even weeks after the case is recorded. The death record is, therefore, a bit like the record of the cases some time ago, and so it is still too early to see the slowing down of the cases reflected in the deaths. On the other hand, we believe that the death record is a more reliable reflection of the progress of the disease, as case reporting is notoriously difficult to interpret and liable to change as testing frequency changes.

So what do we conclude from these graphs? There is some cautious optimism, as some countries have managed to stop or slow down the spread, and we are starting to see a bit of slowing down in Italy. The worrying side is the US, where the growth is still fast and exponential and the number of deaths is rising very fast. I am also pleased that Scotland seems to be relatively protected – is that because of our diet of haggis and whisky?