Modeling Superspreaders
There is increasing evidence that the Covid-19 pandemic is spread primarily by superspreaders. Let’s define this in four levels of increasing technicality:
The majority of people who get sick don’t infect anybody else
The mean number of people every sick person infects is higher than the median number of people each sick person infects.
Roughly 10% of infected people are responsible for roughly 80% of new infections.
The number of secondary infections caused by any given sick person can be modeled with a negative binomial distribution with mean R0 and dispersion parameter k. The smaller k is, the more unevenly infections are spread across individuals. For Covid-19 and similar coronaviruses (such as the one behind the SARS epidemic of the early 2000s), k is estimated to be approximately 0.16.
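To see how the four definitions hang together, here is a minimal sketch that samples from that distribution and checks the first three statements against the fourth. It uses the standard gamma-Poisson construction of the negative binomial: each person gets an individual infectiousness drawn from a gamma distribution with mean R0 and shape k, and then infects a Poisson-distributed number of people. The value R0 = 2.5 is my own placeholder, not an estimate from the sources above, and the function names are made up for illustration.

```python
import math
import random
import statistics

R0 = 2.5   # ASSUMPTION: placeholder mean number of secondary infections
K = 0.16   # dispersion parameter quoted in the text


def poisson(lam, rng):
    """Draw a Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count = 0
    product = 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def secondary_infections(rng, r0=R0, k=K):
    """Negative binomial draw with mean r0 and dispersion k (gamma-Poisson mixture)."""
    # Gamma with shape k and scale r0 / k has mean r0.
    individual_rate = rng.gammavariate(k, r0 / k)
    return poisson(individual_rate, rng)


def summarize(n=20000, seed=0):
    """Simulate n infected people and report the summary statistics from the list above."""
    rng = random.Random(seed)
    counts = [secondary_infections(rng) for _ in range(n)]
    total = sum(counts)
    top_decile = sorted(counts, reverse=True)[: n // 10]
    return {
        "share_infecting_nobody": counts.count(0) / n,
        "mean": total / n,
        "median": statistics.median(counts),
        "top_10pct_share": sum(top_decile) / total,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.3f}")
```

With these inputs, most simulated people infect nobody, the mean sits well above the median, and the top tenth of spreaders accounts for the bulk of infections. The exact share depends on the R0 and k you plug in.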
Althouse et al. (2020) describe four types of superspreading events (SSEs), which I list verbatim here:
Biological. Individuals with a higher probability of transmitting per contact. These may be hard to identify a priori; for infections with most pathogens, pathogen loads may vary over many orders of magnitude across individuals [32, 33]. For SARS-CoV-2 individual-level viral loads are dependent on time since onset, and might be associated with demographics like age and disease severity [34]. The temporal profile of SARS-CoV-2 viral load peaks at or just before the onset of symptoms and decreases quickly to near the PCR detection threshold within a week [29, 34, 35]. Large heterogeneities in viral load – up to 8 log10 differences – exist between individuals [36].
Behavioral / social. Individuals causing SSEs may have a higher number of susceptible contacts per person. Numerous studies have demonstrated marked differences in individual contacts by profession and over time.
High-risk facilities and places such as meat-packing plants, workers’ dormitories, prisons, long term care facilities, or health care settings. The nature of interactions in these places seem to repeatedly place individuals at higher risk of acquiring and transmitting infection. Importantly, this driver of SSEs can lead to spillover into the larger community. Controlling a broader outbreak will be very difficult if there is a focal hotspot which is continually seeding new transmission chains, as seen in other respiratory pathogens such as tuberculosis [37, 38].
‘Opportunistic’ scenarios. The first scenario is when larger numbers of individuals temporarily cluster, and even with an average probability of transmission per contact, people are briefly far above their ‘average’ number of susceptible contacts. The second scenario is probability of transmission per contact is temporarily increased in an unusual way, such as singing or frequent loud speaking. These two opportunistic scenarios are more frequently seen in outbreaks at night clubs, cruise ships, crowded public transportation, parties, choirs, or other mass gathering events.
What we want to know for building a model is whether the variation in the number of secondary infections each person is responsible for is caused by inherent properties of the virus, or by the behavior of individuals and the social structures they operate within. In reality, it’s almost certainly a combination of the two. One person may be shedding more of the virus both because he has a higher than average viral load and because he talks very loudly. Another person may be the cause of an SSE because, despite having an average viral load and not being a particularly loud talker, he spends a lot of time at a very high-risk event (e.g. in a packed casino). There is undoubtedly more subtlety in the real world than we could possibly pack into our models. That being said, let’s try to find a happy, simplified medium that seems somewhat realistic when accounting for the causes of SSEs.
What do we know from the data that’s out there?
Social behavior clearly matters. Choir rehearsals or birthday parties seem to be very good scenarios for the virus. There have been (to my knowledge) no reports of SSEs that did not take place indoors.
However, let’s not completely discount the role of viral expression. There’s evidence that in the majority of cases where one member of a household has Covid-19, the other members of the household don’t contract it. I found this very surprising. It seems to clearly indicate that some people who are sick with this virus simply end up being much more contagious than others, and not purely as a result of their behavior or environment.
So how should we model this? Viral characteristics and behavior both seem to play a role, so let’s start by having two independent variables drive SSEs:
When each person gets sick, their transmissibility (the chance that a healthy person who comes into physical contact with them gets sick) will be set to a random number between 0 and 1, drawn from a yet-to-be-defined probability distribution. This number is meant to account for both the person’s viral load and any behavior (like loud talking, or a penchant for coughing without covering their mouth) that makes the person more likely to infect others. It will stay the same for the length of their illness. (Note: this last part is probably not realistic, since the infectivity of somebody sick with Covid-19 does seem to vary over the course of the disease. However, I’m not convinced that ignoring this makes the model any less useful at revealing the dynamics of infection we’re interested in.)
Each person, healthy or sick, will have a predetermined, and randomly assigned, sociability factor. This factor could influence behavior in a number of ways, ranging from the very simple (higher sociability means moving farther each turn), to the more complex (higher sociability means more likely to move towards other people in the simulation, creating clumps of socializers).
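As a rough sketch of how these two variables could sit on an agent, here is one possible layout. Everything specific in it is an assumption made for illustration: the `Person` class and helper names are hypothetical, the Beta(0.5, 2.0) distribution merely stands in for the yet-to-be-defined transmissibility distribution, sociability is drawn uniformly, and movement uses the simplest interpretation mentioned above (higher sociability means moving farther each turn).

```python
import math
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    sociability: float                        # assigned once at creation, healthy or sick
    transmissibility: Optional[float] = None  # drawn once, at the moment of infection
    x: float = 0.0
    y: float = 0.0

    @property
    def infected(self) -> bool:
        return self.transmissibility is not None


def make_person(rng: random.Random) -> Person:
    # ASSUMPTION: sociability is uniform on [0, 1].
    return Person(sociability=rng.random())


def infect(person: Person, rng: random.Random) -> None:
    """Fix the person's transmissibility for the whole course of the illness."""
    if not person.infected:
        # ASSUMPTION: Beta(0.5, 2.0) is a placeholder that keeps most people
        # weakly contagious and makes a few highly contagious.
        person.transmissibility = rng.betavariate(0.5, 2.0)


def contact(source: Person, target: Person, rng: random.Random) -> bool:
    """On physical contact, infect the target with probability source.transmissibility."""
    if source.infected and not target.infected:
        if rng.random() < source.transmissibility:
            infect(target, rng)
            return True
    return False


def move(person: Person, rng: random.Random, max_step: float = 1.0) -> None:
    """Step in a random direction; the step length scales with sociability."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    step = person.sociability * max_step
    person.x += step * math.cos(angle)
    person.y += step * math.sin(angle)
```

Keeping transmissibility unset until infection makes the "drawn once, then fixed" rule explicit, and it leaves room to swap in a time-varying infectivity later without touching the sociability logic.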