Multidisciplinary Task and Finish Group on Mass Testing
Consensus Statement for SAGE
Date: 31 August 2020
1. This consensus statement presents the findings and recommendations of the SAGE Task and Finish Group on
Mass Screening (TFMS). The TFMS is a multidisciplinary group established to examine, from technological,
epidemiological, and behavioural perspectives, the benefits and challenges of mass testing for SARS-CoV-2.
2. TFMS adopts the terminology ‘mass testing’, or ‘population-level case detection’ (PCD), which refers to regular
and/or large-scale testing of whole populations defined by area or sector regardless of whether or not they
have symptoms. Mass testing is a strategy and set of technologies distinct from NHS Test and Trace (NHSTT).
Rather than testing self-reported, symptomatic individuals, mass testing involves pro-active asymptomatic
testing of a defined group; either through universal provision of accessible testing to that group or as a
requirement before entering a particular setting.
Key Recommendations
3. Mass testing is a different strategy for finding infectious people from contact tracing; however, any mass
testing system should be a carefully designed counterpart to the NHSTT contact tracing system. It will be
important that the two systems are complementary and linked-up and that all infectious people found through
mass testing are reported to NHSTT.
4. Clear and specific aims for any mass testing programme should be defined, ensuring those objectives include
achieving equitable outcomes. Mass testing is primarily a tool for control of infection: to lower R in the general
population (by reducing the average infectious period); to reduce the risk of larger outbreaks in areas known
to be of concern; or to increase access to venues and settings by reducing the probability that anyone present
is infectious.
5. Testing should be driven by public health considerations and priorities. Priority groups for mass testing to
reduce transmission should be identified according to their likely contribution to reducing R and outbreak
risks, and improving health, social and economic outcomes including reducing inequalities. Mass testing is
most likely to be beneficial and feasible in cluster outbreak scenarios and well-defined higher risk settings (e.g.
health and social care settings, higher risk occupations such as food production facilities, and universities),
where it can help detect and prevent large outbreaks early, and compliance can be measured and moderated.
6. Establishing a new mass testing programme must be undertaken with a view to the entire end-to-end
system - testing technology is only one component. The effectiveness of a mass testing programme will
depend on the proportion tested, frequency of testing, ability of a test to identify true positives and negatives,
speed of results and subsequent adherence to isolation. From highly accessible fast turn-around testing to
structured financial and other support (particularly for most disadvantaged groups), a system-wide capability
must be built.
7. The cheaper, faster tests that will be useful for mass testing are likely to have lower ability to identify true
positives (lower sensitivity) and true negatives (lower specificity) than the tests currently used in NHSTT.
Problems of low sensitivity would be decreased by very frequent testing. In populations with low prevalence
of infection, mass testing that lacks extremely high specificity would result in many individuals receiving false
positive results. In such circumstances, rapid follow-up confirmatory testing will be needed to determine
whether individuals should continue to self-isolate: it is important to rapidly isolate infectious individuals, but
efforts will be needed to quickly release false positives.
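The effect of frequent testing on low sensitivity can be illustrated with a minimal sketch. It assumes, purely for illustration, that repeat tests on the same infected person miss independently of one another (in practice misses are likely to be correlated, for example when the amount of virus is low). The function name and the 70% sensitivity are hypothetical and are not taken from this statement.

```python
def cumulative_sensitivity(sensitivity: float, n_tests: int) -> float:
    """Probability that at least one of n_tests detects an infected person.

    Assumption: each test misses independently with probability
    (1 - sensitivity). Real repeat tests may be correlated.
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    return 1.0 - (1.0 - sensitivity) ** n_tests


# Hypothetical test with 70% sensitivity, applied one to three times.
for n in (1, 2, 3):
    print(f"{n} test(s): {cumulative_sensitivity(0.70, n):.1%} chance of detection")
```

Under the independence assumption the chance of a missed infection shrinks geometrically with each additional test, which is why testing frequency can partly compensate for a less sensitive assay.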
8. Careful consideration should be given to ensure that any mass testing programme provides additional
benefit over investing equivalent resources into improving (i) the speed and coverage of NHSTT for
symptomatic cases (the proportion of individuals who report Covid-consistent symptoms in England who go
on to request a test through NHSTT could be as low as 10%¹) and (ii) the rate of self-isolation and quarantine
for those that test positive (currently estimated to be <20% fully adherent²). This is relevant as targeting testing
to those with high prior probabilities of infection (e.g. people with symptoms or contact with a known case) has
a much larger per-test impact on reducing transmission. There is, therefore, a delicate balance to be struck
between investing to engage more symptomatic individuals with NHSTT and building alternative methods to
reach out to find those who would not seek testing spontaneously.
9. The use of testing as a point-of-entry requirement for particular settings and events, e.g. sporting and
cultural events, could play a role in allowing the resumption of such activities with reduced risk of
transmission. Such applications of testing would require superb organisation and logistics with rapid, highly-
sensitive tests. This is also separate from the national strategy to reduce R, for which such testing would have
only minimal effect.
10. A further application could be to provide reassurance in sensitive settings where detection of one or two
infectious individuals could be followed up by local but broad testing (e.g. of all pupils in the class or all team
members in the workplace).
Overarching considerations
11. The population prevalence of infection and relevant test performance have critical implications for
effectiveness and risks of mass testing. Mass testing could enable earlier detection of infection clusters, but
it is essential to consider the implications of both false positives and false negatives. In a population with very
low prevalence, twice-weekly tests with 99% specificity, with 10-day isolation if positive, would lead to ~3% in
isolation at any given time (as 1% are isolated every 3.5 days, each for 10 days) and 41% of the population receiving a false positive
over 6 months (i.e. the probability of getting at least one false positive in 52 tests). This example highlights
the importance of follow-up confirmatory testing in low prevalence settings, addressed in Annex B. Finding
the right combination of tests is not trivial and will require careful pilot studies.
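The arithmetic behind the figures in paragraph 11 can be reproduced with a short sketch. It assumes that prevalence is low enough for every positive to be treated as a false positive, and that successive test results are independent; the function names are illustrative only.

```python
def fraction_in_isolation(specificity: float,
                          days_between_tests: float,
                          isolation_days: float) -> float:
    """Approximate steady-state share of an uninfected population isolating.

    A fraction (1 - specificity) tests falsely positive at each round, and
    each false positive remains in isolation for isolation_days.
    """
    false_positive_rate = 1.0 - specificity
    return false_positive_rate * isolation_days / days_between_tests


def prob_any_false_positive(specificity: float, n_tests: int) -> float:
    """Probability of at least one false positive in n_tests independent tests."""
    return 1.0 - specificity ** n_tests


specificity = 0.99           # from paragraph 11
days_between_tests = 3.5     # twice-weekly testing
isolation_days = 10
n_tests = 52                 # twice weekly for 26 weeks (6 months)

isolating = fraction_in_isolation(specificity, days_between_tests, isolation_days)
any_fp = prob_any_false_positive(specificity, n_tests)
print(f"In isolation at any time: {isolating:.1%}")
print(f"At least one false positive in {n_tests} tests: {any_fp:.0%}")
```

Changing `specificity` shows how sensitive both quantities are to small differences in test performance, which is the motivation for confirmatory testing.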
12. Under mass testing, a larger proportion of positive results will be false positive than in symptomatic testing,
even when using the same test, as infection prevalence is much lower in asymptomatic populations. The
response to positive tests will therefore require careful consideration, including (i) whether rapid follow-up
confirmatory testing is used to avoid prolonged isolation of large numbers of false positives, (ii) whether
individual isolation requires household quarantine, (iii) how to communicate to the public the nature of mass
testing and lower-confidence test results to avoid potential undermining of public perception and confidence
in testing, (iv) the possible impacts on individuals (loss of earnings) and groups (closure of schools), and (v) how
to define outbreaks when including asymptomatic tests.
13. Effective mass testing will require high rates of testing and self-isolation. Current rates of (i) symptom
identification, (ii) test requests, and (iii) subsequent self-isolation are estimated to be very low (see paragraph
8), with none likely exceeding 30%. This presents a critical barrier to any effective mass testing strategy. Overcoming
it will require engagement built on trust, shared goals and perceived fairness. Messaging must be co-produced
with target communities and include transparent rationales and benefits of testing, allayment of privacy
concerns, and specified support for positive cases.
14. How any test is offered is a major predictor of uptake rates. Two key considerations for mass testing are: (i)
Accessibility of testing: rates are increased by multiple low-friction points for access, e.g. walk-in centres. It
might prove useful to have mobile laboratories for taking tests and instrumentation to communities that need
them. Tests self-administered at home can increase accessibility but depend upon distribution and effective
communication. (ii) Access that is dependent on testing (e.g. requiring testing as pre-condition for entry to
a workplace, university or event) has unknown effects on discouragement and equity.
15. Mass testing can only lead to decreased transmission if individuals with a positive test rapidly undertake
effective isolation. This needs to become a universal response to receipt of a positive test result and may
need structured financial and social support both to promote self-isolation and mitigate impacts on
inequalities. This should include (i) proactive provision of information and social and clinical support, (ii)
sufficient supplies of food, (iii) employment protection, (iv) financial assistance and (v) accommodation where
necessary. See supporting TFMS Behavioural Paper for summary of evidence regarding structured support to
improve adherence to self-isolation and quarantine guidelines.
16. Mass testing requires a systems view. Testing is a complex end-to-end process, and the test instrument or
assay is only one part of the testing system; the performance of the entire system must be evaluated. For
example, an instrument that can run 10,000 tests per hour still requires 10,000 samples to be taken,
transported to the laboratory when not in situ, labelled and recorded with patient metadata, subsamples or
extracts prepared from 10,000 primary samples, tests run, and the results captured and fed into NHSTT. The
system turn-around time and system costs are always greater than the instrument or assay turn-around time
and cost. Successfully deploying and scaling mass testing will therefore require a system-wide capability
management view.
19. It is well known that widespread use of a test with imperfect specificity in a population with low prevalence
will generate more false positives than true positives. For example, suppose one were to test 100,000 people
of whom 200 were infected and 99,800 were not infected. A test with 80% sensitivity and 96% specificity
would find 160 true positives and 3,992 false positives. The situation can be rectified with follow-up,
confirmatory testing of the 4,152 with a positive first test using a different second test that has very high
specificity (perhaps at greater expense or slower turnaround). If those 4,152 individuals were re-tested with a
test with 99% specificity, the number of false positives would fall to just 40 (see Annex B for detailed
illustrative examples). This calculation relies on the strong assumption that the two tests have independent
errors.
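The worked example in paragraph 19 can be checked with the following sketch. It uses the figures given above and, as the paragraph notes, assumes that the errors of the two tests are independent. The sensitivity of the second test is not stated in the example, so the sketch tracks only false positives at the second stage; the function name is illustrative.

```python
def two_stage_screen(n_people: int, n_infected: int,
                     sens1: float, spec1: float, spec2: float) -> dict:
    """Expected counts for a screening test followed by a confirmatory test.

    Assumption: the second test's errors are independent of the first's.
    """
    n_uninfected = n_people - n_infected
    true_pos_first = n_infected * sens1
    false_pos_first = n_uninfected * (1.0 - spec1)
    # Only people positive on the first test are re-tested.
    false_pos_second = false_pos_first * (1.0 - spec2)
    return {
        "true_positives_first": true_pos_first,
        "false_positives_first": false_pos_first,
        "positives_first": true_pos_first + false_pos_first,
        "false_positives_second": false_pos_second,
    }


result = two_stage_screen(100_000, 200, sens1=0.80, spec1=0.96, spec2=0.99)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")

share_true = result["true_positives_first"] / result["positives_first"]
print(f"Share of first-test positives that are true positives: {share_true:.1%}")
```

If the two tests shared a common cause of error (for example, cross-reaction with the same contaminant), the reduction in false positives would be smaller than this calculation suggests.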
20. Choosing the right tests from the available technologies is an essential step in building a viable mass testing
system. Those tests need to have the right properties for the specified objective (e.g. fast and highly sensitive
to allow access to a sporting or cultural event). Objectives that require combined tests should use technologies
that, as far as possible, ensure independent errors so that a confirmatory test has a good chance of revealing
errors in a first test. A panel of reference samples of known status will need to be developed to continuously
quality assure any mass testing service. For example, if a private sector lab were delivering a screening service
for a sporting or cultural event, they would routinely need to be tested using a panel of reference samples.
Annex A
The compromise between test sensitivity and specificity
A nasopharyngeal swab sample taken from someone who has very recently been infected by SARS-CoV-2 will not yet
contain any virus. Over time, as the virus replicates, the amount of virus in their swab samples will increase. It peaks
a few days after the onset of symptoms, and decreases as the person recovers. The virus is often detected for several
weeks after the person stops showing symptoms, while their body clears the remaining virus. This does not always
mean the person is still infectious. An idealised, illustrative example of the amount of virus in samples is shown in
Figure 1. If the person is sampled early during infection the swabs will be weak positive samples, as they contain very
little virus. At the onset of symptoms they will be strong positive samples, as they contain a large amount of virus.
Figure 1. Amount of virus in samples taken during SARS-CoV-2 infection. The blue curve shows an illustrative
example of the amount of virus found in swabs taken during infection, from infection at day 0 leading to symptom
onset at day 5, and an assumed infectious period of two days prior to infection until ten days after infection (adapted
from Kucirka et al., 2020).⁹
When these swab samples are tested, the amount of signal they produce in a test is proportional to the amount of
virus in the swab sample. Strong positive samples will give a strong signal, weak samples will give a weak signal.
Different types of test use different types of signal: the signal may be the detection of a PCR product, or luminescence,
or colour change. True negative samples (e.g. water, buffer or a sample taken from an uninfected person) can also
give a very low signal. When a test is implemented, a decision must be made about where to set the threshold level
that a signal must cross in order to be called a positive test result.
The performance of tests is described by their sensitivity (their probability of detecting a true positive, i.e. a sample
taken from an infected person), and their specificity (their probability of detecting a true negative, i.e. a sample taken
from a healthy, uninfected person). The sensitivity and specificity of a test are influenced by both the test technology,
and the detection threshold chosen (often referred to as the ‘limit of detection’). Figure 2 shows a simplified example.
Figure 2. Setting detection thresholds for tests. Strong positive samples (red) produce more signal in a test than
weak positive samples (green), which in turn produce more signal than negative samples (blue). The performance of
a test is decided by where the detection threshold is set.
Four examples of detection thresholds are shown in Figure 2:
Threshold 1 shows a detection threshold that is set higher than any sample ever reaches. This would be of no
practical use, as it would never detect any positive samples, so has a sensitivity of 0%. However, it would
always identify all negative samples as negative, so would show 100% specificity.
Threshold 2 shows a lower detection threshold that detects all 8 strong positive samples. It does not detect
the 8 weak positive samples. This threshold gives a 50% sensitivity (8/16). It does not detect any of the
negative samples, so it still shows 100% specificity.
Threshold 3 shows a much lower detection threshold that detects all 8 strong positive samples, and 7 out of
8 of the weak positive samples. One of the weak positive samples gives a false negative result, as it falls below
the threshold. The sensitivity is 94% (15/16). However, the test also detects one of the 8 negative samples,
giving a false positive with this sample. It shows 88% specificity (7/8).
Threshold 4 shows a detection threshold that all samples (even negative samples) will exceed. Similarly to
threshold 1, this would be of no practical use, as it would never report any negative samples correctly, so has
a specificity of 0%. However, it would always identify all positive samples as positive, so would show 100%
sensitivity.
Choosing the threshold for a test is always a compromise between sensitivity and specificity, as shown in the simplified
example in Figure 2. In this example, the best threshold to choose would be threshold 3, as this is the compromise
that would detect most true positives (high sensitivity) while producing relatively few false positives (high specificity).
Samples that test close to this threshold will give sporadic results if repeated: they may sometimes fall above or
below the threshold because of variation in sampling or processing. Repeating these samples in multiple tests allows
a consensus result to be generated, reducing the chance of false negatives.
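The trade-off between the four thresholds can be reproduced with a small sketch. The signal values and threshold levels below are hypothetical, chosen only so that the counts match the description of Figure 2 (8 strong positive, 8 weak positive and 8 negative samples); they are not read from the figure.

```python
# Hypothetical signal values in arbitrary units (not taken from Figure 2).
STRONG_POSITIVE = [82, 85, 88, 90, 93, 95, 97, 99]
WEAK_POSITIVE = [12, 22, 25, 28, 31, 35, 40, 44]
NEGATIVE = [2, 3, 4, 5, 6, 8, 9, 16]

# Hypothetical threshold levels standing in for thresholds 1 to 4.
THRESHOLDS = {1: 120, 2: 60, 3: 15, 4: 0}


def performance(threshold, positives, negatives):
    """Return (sensitivity, specificity) when signals above threshold are called positive."""
    true_positives = sum(signal > threshold for signal in positives)
    true_negatives = sum(signal <= threshold for signal in negatives)
    return true_positives / len(positives), true_negatives / len(negatives)


all_positives = STRONG_POSITIVE + WEAK_POSITIVE
for label, level in THRESHOLDS.items():
    sensitivity, specificity = performance(level, all_positives, NEGATIVE)
    print(f"Threshold {label}: sensitivity {sensitivity:.0%}, "
          f"specificity {specificity:.0%}")
```

Moving the threshold down raises sensitivity and lowers specificity, and vice versa; no single threshold in this toy data set achieves 100% on both.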
References
1. Smith LE, Mottershaw AL, Egan M, Waller J, Marteau TM, Rubin GJ. The impact of believing you have had
COVID-19 on behaviour: Cross-sectional survey. medRxiv. 2020.
https://www.medrxiv.org/content/10.1101/2020.04.30.20086223v1
2. Ibid
3. Lighthouse Laboratory EQA performance and https://www.eurosurveillance.org/content/10.2807/1560-7917.ES.2020.25.27.2001223#html_fulltext
4. Fowler et al. (2020). A reverse-transcription loop-mediated isothermal amplification (RT-LAMP) assay for the
rapid detection of SARS-CoV-2 within nasopharyngeal and oropharyngeal swabs at Hampshire Hospitals NHS
Foundation Trust. medRxiv, pre-print. doi: https://doi.org/10.1101/2020.06.30.20142935.
5. SPI-B Consensus Statement on Local Interventions. Presented to SAGE 30 July 2020.
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/909383/s0659-spi-b-consensus-statement-local-interventions-290720-sage-49.pdf
6. Grassly et al. Lancet Infect Dis, 2020. https://doi.org/10.1016/S1473-3099(20)30630-7
7. https://www.gov.scot/publications/coronavirus-covid-19-social-care-staff-support-fund-guidance/pages/fund-criteria/
8. Larremore et al, 2020, MedRxiv
9. Kucirka et al. (2020). Variation in False-Negative Rate of Reverse Transcriptase Polymerase Chain Reaction-Based
SARS-CoV-2 Tests by Time Since Exposure. Annals of Internal Medicine, https://doi.org/10.7326/M20-1495
10. Sudlow et al. (2020) Testing for coronavirus (SARS-CoV-2) infection in populations with low infection
prevalence: the largely ignored problem of false positives and the value of repeat testing. medRxiv, pre-print,
doi: https://doi.org/10.1101/2020.08.19.2017813 (for interactive tool see Ref 11 below)
11. Sudlow et al. (2020) Interactive tool on false positive tests: https://www.hdruk.ac.uk/projects/false-positives/