Research Article
Open Access
A Bootstrap Prediction Confidence Band for
QMRA Beta-Poisson Dose-Response Models
Gang Xie1,2*
1School of Medicine, Griffith University, Queensland, Australia
2Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Queensland, Australia
*Corresponding authors address: Gang Xie, Adjunct Research Fellow, School of Medicine, Griffith University, Queensland 4111, Australia, Tel.: +61-7-37357485; Fax: +61-7-37355318; E-mail:
g.xie@griffith.edu.au
Received: May 05, 2017; Accepted: May 23, 2017; Published: May 26, 2017
Citation: Gang Xie (2017) A Bootstrap Prediction Confidence Band for QMRA Beta-Poisson Dose-Response Models. Int Struct Comput Biol 1(1): 1-11.
Let
denote the probability of infection at a given mean dose
d . In the quantitative microbial risk assessment (QMRA) framework,
the beta-Poisson dose-response model
where
denotes the Kummer confluent hypergeometric
function and α,β are the model parameters, remains the most popular
plausible dose response model in practice. One commonly accepted
way of constructing the confidence band about the dose-response
curve in QMRA literature is to follow a bootstrap procedure based
on the maximum likelihood estimates of the model parameters α and
β. Here, it is shown that this bootstrap confidence bands reported in
the literature represent the confidence intervals for the mean value
of the probability of adverse effect, (i.e.,
which may represent
the probability of infection at mean dose levels), not the confidence
intervals for prediction. Therefore, the existing literature bootstrap
(95%) confidence bands normally contain far less than 95% data
points. In this study, a sample-size-dependent bootstrap (95%)
confidence band for prediction is proposed which can normally
contain 95% of the data points. Comparisons between the existing
and the newly proposed confidence band algorithms were made
through a comprehensive Monte Carlo simulation study by applying
both algorithms to four well studied experimental dose-response
datasets. The comparison results showed that the confidence band
for prediction is a better representation of the degree of uncertainty of
an estimated dose-response relation. The upper limit of the estimated
prediction confidence band could be used as a better (i.e., more
conservative but sensible) estimate for the worst case scenario in risk
assessment.
Key words: QMRA; beta-Poisson dose-response model;
bootstrap; confidence intervals for prediction; Monte Carol simulation.
Introduction
QMRA [1, 2] has provided a valuable alternative
quantitative framework for analysis of adverse health outcomes
associated with the environmental exposures to pathogenic
organisms [3, 4] and is widely used by researchers to characterise
microbial risks associated with food, water, and wastewater
use in agriculture (e.g., [1, 3, 5]). The core part of the QMRA
framework is the dose-response analysis which models the
mathematical characterization of the relationship between the
dose administered and the probability of adverse effect (typically,
the probability of infection) in the exposed population [1, 2, 6].
Among different microbial dose-response models proposed
in the literature, the beta-Poisson models remain the most
popular plausible models in practice [1, 2, 7].The beta-Poisson
dose-response model and its special case, the exponential doseresponse
model have been well studied and widely employed to
characterize infectivity of various viral, bacterial, and protozoan
pathogens since the 1980s [1, 2, 7, 8]. Because a beta-Poisson
model has two parameters, a confidence band about the doseresponse
curve cannot be calculated directly from the confidence
intervals of the parameter estimates. It is well accepted in QMRA
practice to employ a parametric bootstrap algorithm [9] to
construct a confidence band for the dose-response curve[1, 2, 8,
10].The upper limit of the estimated confidence band is used as
the worst case scenario in risk assessment.
Here, it is shown that the (95%) bootstrap confidence
bands reported in microbial risk literature normally contain far
less than 95% of the data points because they are not confidence
bands for prediction. This implies that the existing literature
confidence band may not be a good representation of the degree
of uncertainty of an estimated dose-response relation. To fill this
knowledge gap in QMRA analysis, a new, sample-size-dependent
bootstrap (95%) confidence band construction algorithm is
proposed which can normally contain 95% of the data points and
the results are validated by an extensive Monte Carlo simulation
study using four well studied experimental dose-response
datasets.
The rest of the paper is organized as follows. Section
2 begins with a description of the general setting for this study
which includes: the notation and definitions; datasets; model
specification and parameter estimation. The existing and the
new bootstrap confidence band algorithms are specified in
details in Section .The comparisons between the existing and the
new algorithms were made and the results are presented and
discussed in the last part of Section 2. The paper is completed with
a conclusion section. For a better reading flow and repeatability
of our research findings, an appendix section is added at the end
which contains: table 4, the details of the selected real doseresponse
datasets; table 5, the core parts of R code programs
[11] for random sample generation from a beta-Poisson doseresponse
model.
Data, model specification and parameter
estimation
Dataset, Notation and Definition
Well-studied dose-response experiment datasets
provide a reliable baseline for validation of our research findings.
In this study, four experimental datasets from the literature are
analysed for comparison between the single-hit beta-Poisson
model
and the generalized beta-Poisson model
with respect to parameter estimation and model
performance. The selected experiment datasets are: rotavirus
(CJN) and infection in healthy volunteers (Dataset 1) [8, 10];
Campylobacter and infection in healthy volunteers (Dataset 2)
[8, 10];
Salmonella (Nontyphoid Strains) and infection in human
volunteers (Dataset 3) (p399,[12]);
Listeria monocytogenes
and infection in mice (Dataset 4)[2]. These datasets have been
used for dose-response analyses in [2, 8, 10, 12], although these
references are not the original data sources. These four datasets
are chosen because of their variety of coverage, e.g., virus and
bacteria, few data points (e.g., ≤ 10) and many data points (> 40),
human volunteers and mice, and old (back to 1996) and recent
research sources. For the purpose of a clear and consistent model
specification and easy comparison with the literature results, we
have adopted a notation scheme as detailed in table I throughout
this paper. The term ‘organism’ is used as a short name for
pathogenic microorganism.
Model Specification
Beta-Poisson dose-response models are derived based
on a plausible conceptual process with the following assumptions
[14]: (1) only one viable organism is required to initiate the
infection process in vivo; (2) the exact number of organisms
being ingested in each dose representing a random sample from
a Poisson distribution (with a known mean number d); (3) the
survival of any organism within a single host is independent
survival of any other organism within that host.
A detailed description and derivation of the exponential
and the beta-Poisson dose-response models are given by Haas
et al..[1] For the purpose of this paper, the following model
specifications are needed.
If we assume that the actual number of organisms
ingested, D, follows a Poisson distribution, i.e.,D~ Poisson
a constant (a homogeneous susceptibility assumption); and K
min
= 1 (a single-hit assumption), the simplest QMRA dose-response
where is the mean/effective dose.
In Equation (1), if we take into account the individual
susceptibility/host sensitivity by allowing rtofollow a beta
distribution with the probability density function (pdf)
where
and parameters
and
we obtain the
well-known beta-Poisson (single-hit) dose-response model[1]:
where
is the Kummer confluent hypergeometric
function [15] defined by
Since there are no analytic solutions to
only
numeric approximation solutions or asymptotic solutions can be
obtained[15, 16]. In their original paper, Furumoto and Mickey
[17] derived the simple, attractive approximation beta-Poisson
dose-response formula:
When
and
obtained from Equation (5) is a
very good approximation to the true values of
defined by Equation (3) [1, 10]
A single-hit model reaches its maximum risk limit when
r=1 in Equation (1). This is an important property of the beta-
Poisson model (Equation (3)) which provides an upper bound
of the confidence level of the dose-response relation. It is worth
noting that this maximum risk limit property is not retained in
the approximate beta-Poisson model (Equation (5)) [10]
Model Parameter Estimation and Statistical analysis
Let denote the value of a ─2 log-likelihood ratio and
we term this the
deviance. According to [1](p285, Equation 8.33),
for dose-response data analysed in this study,
Y can be calculated
as
where
is the predicted response, typically the predicated
probability of infection;m is the number of population groups at
risk in terms of the mean dose levels as defined in Table 1. The
true response (i.e., the true probability of infection) is estimated
by
the ratio of the observed number of infected
individuals to the number of all individuals at risk at each mean/
effective dose level
Table 1: Notation and definition
Notation |
Definition |
Ds=
|
mean dose levels (i.e., the average number of organisms per dose or average concentrations); where the meaning is clear, denote
for
. |
N =
|
is the number of individuals participated in the i th exposure group (
groups, each corresponding to one mean dose level
). |
y =
|
is the number of infected individuals in the th exposure group |
r |
survival probability of each single organism ingested |
K |
number of organisms surviving to the target site inside a host body |
Kmin |
the threshold value of the surviving organisms, i.e., the minimum number of organisms required for causing an infection event |
|
probability of infection estimated by a dose-response model |
|
observed proportion of infected individuals in the i th exposure group, i.e.,
|
|
predicted binary response by a model, e.g., if response is defined by probability of infection,
. |
exp(.) and log |
exp(1) =
is the natural logarithm base such that log(e) = 1. |
|
the gamma function [13] |
Equation (6) is a measure of deviance and can be used
as an objective function for minimisation in model parameter
estimation [18].The results are the maximum likelihood estimates
(MLE) [1, 19]. Note that
defined in Table 1.Then, the
MLE of the model parameters, α ̂and β ̂, can be obtained using
Equation (3) for an exact beta-Poisson model or Equation (5)
for an approximate beta-Poisson model. The evaluation of the
Kummer confluent hypergeometric function (Equation (4)) could
be non-trivial. The statistical data analyses in this study were
conducted using the open source statistical software R [11]. The
R function ‘hyperg_1F1’ in the R package ‘gsl’ [20] is employed
for evaluation of Equation (4). The built-in optimization function
‘optim’ in R is employed to obtain α ̂ and β ̂ by minimizing the
model deviance
Y defined by Equation (6).
A new Bootstrap confidence band Algorithm
A bootstrap approach is commonly accepted for
construction of confidence bands on dose-response curves in
microbial risk literature [1, 2, 8, 10]. Following the descriptions
given in Haas et al. [1]and in Teunis et al. [8], the bootstrap
algorithm employed in the literature (the existing algorithm) is
summarized in the left column in Table 2. The new bootstrap
algorithm, which produces sample-size-dependent confidence
bands for prediction, is summarized in the right column of the
same table. The comparison of both algorithms is depicted
through figures1 to 7 for the four selected datasets.
Table 2: Parametric bootstrap algorithms for construction of 95% confidence bands about the mean dose-response curve
Existing bootstrap algorithm (confidence band for means) |
New bootstrap algorithm (confidence band for prediction) |
Step 1 |
Generate a large number of simulation samples, e.g., N*=2000, based on Ds, N,
; Determine the plotting mean dose levels (Dsim). |
Step 1 |
Determine at which mean dose levels (Dsim) and the number of people (Nsim={max(N)}) exposed to hazard for generating simulation samples; |
Step 2 |
Fit a beta-Poisson model to the generated simulation sample so that 2000 sets of
are obtained; |
Step 2 |
Generate a large number of simulation samples, e.g., N*=2000, based on Dsim, Nsim,
; |
step 3 |
Calculate the probability of infection based on Equation (3) or (5) using parameter estimates obtained in Step 2 (2000 values at each Dsim level ); |
Step 3 |
Calculate the probability of infection using the ratio ysim/Nsim, where ysimare the simulation sample values from Step 2 (2000 values at each Dsim level); |
Step 4 |
Obtain a 95 percentile (2.5% to 97.5%) confidence interval for each Dsim level using the results from Step 3; connect all lower/upper bound points to form a confidence band about the infection probability curve. |
Step 4 |
Obtain a 95 percentile (2.5% to 97.5%) confidence interval for each Dsim level using the results from Step 3; connect all lower/upper bound points to form a confidence band about the infection probability curve. |
Note that the generation of random samples of
ysim
given
Ds =
(or
Dsim) and
N=
is essential for employing a bootstrap
algorithm to construct confidence bands. For a selected beta-
Poisson dose-response model, the parametric bootstrap [9]
random samples, ysim, is generated using R code programs
provided in table 5.
The two different approaches to estimate the
probabilities of infection,
have a substantial impact on
the resulting confidence bands. With respect to each bootstrap
sample (i.e., the simulated data) ysim, the parameter estimates
and
are obtained first and the existing algorithm then
estimates
using Equation (3) (for an exact beta-Poisson
model) or Equation (5) (for an approximate beta-Poisson model),
whereas the new algorithm directly calculates the ratio ysim/N
sim
as the model estimates for
.
Although both algorithms construct the parametric
bootstrap confidence bands, it is important to realize that the
band based on the existing algorithm counts for the variation
of the probability of infection given a mean dose level while the
band based on the new algorithm counts for the variation for the
probability of infection given a specific dose.
Therefore, the difference between these two confidence
bands is an analogy of the difference between confidence bands
for a regression line and for prediction of a new data point in a
linear regression model. Furthermore, these two algorithms are
fundamentally different in evaluating the discrepancy between
Equation (3) and Equation (5): the existing algorithm treats the
discrepancy as a difference caused by two different dose-response
models (i.e., the exact versus the approximate models) while the
new algorithm treats the discrepancy as a difference caused by
two different parameter sets under the exact beta-Poisson doseresponse
model.
Because the new algorithm employs the proportion
data as the estimates for
significant data discretization
effects are expected in the resulting confidence bands when the
maximum number of
(denoted by max(
N)) is relatively small
(e.g., max(
N) < 15). Because the observed proportion of infection
where
is used as the optimization target in the
parameter estimation process
for Equations (3) and (5), the (median) curve estimated using the
new algorithm should converge to the analogous value using the
existing algorithm for large max(N), e.g., max(N) > 50. Whereas
the large sample assumption is implicitly assumed in the existing
algorithm, the new algorithm is critically dependent on the choice
of N
sim. Smaller (larger) N
sim would result in a wider (narrower)
confidence band. With respect to each mean dose level
the number of individuals exposed to a hazard,
(or N
sim) may be considered as a sample size in a binomial
process. In this sense, the new algorithm confidence bands are,
therefore, sample-size-dependent (i.e. N
sim =max(N)-dependent)
These theoretical properties are examined in the Figures
presented in next section.
Comparison Results and Discussions
Results and Discussions for dataset 1
Using the rotavirus dataset (Dataset 1) as an example,
the data and parameters used in both bootstrap algorithms for
confidence band construction are listed in Table 3.Figures1 and
2 are created based on Table II (bootstrap algorithm procedures)
and table 3 (input information).
Table 3: Data and parameters used for bootstrap confidence band
construction in Figures1 and 2
Ds |
{ 0.009, 0.09, 0.9, 9, 90, 900, 9000, 90000} |
N |
{5, 7, 7, 11, 7, 8, 7, 3} |
y |
{0, 0, 1, 8, 6, 7, 5, 3} |
Approximate B-P model: |
= 0.253;
= 0.422 |
Exact B-P model: |
= 0.167; = 0.191 |
Dsim |
{0.005, 0.009, 0.05, 0.09, 0.5, 0.9, 5, 9, 50, 90, 500, 900, 5000, 9000, 50000, 90000} |
Nsim=max(N) |
{11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11 } |
In figure1, within the observed mean dose level range,
the smooth confidence bands (dot-dashed lines in green colour,
based on the existing algorithm) are consistently and noticeably
narrower than the less smooth confidence bands (dotted lines in
black colour, based on the new algorithm) while the median lines
(solid lines in green or black colour) match well with each other.
Table 4: Selected dose- response experiment datasets from the literature
Dataset 1: Rotavirus (CJN) and infection in healthy volunteers [8, 10] |
Ds (mean dose) |
N (total) |
y(infected) |
9×10-3 |
5 |
0 |
9×10-2 |
7 |
0 |
9×10-1 |
7 |
1 |
9 |
11 |
8 |
9×101 |
7 |
6 |
9×102> |
8 |
7 |
9×103 |
7 |
5 |
9×104 |
3 |
3 |
Dataset2: Campylobacter and infection in healthy volunteers [8, 10] |
Ds(mean dose) |
N(total) |
y(infected) |
8×102 |
10 |
5 |
8×103 |
10 |
6 |
9×104 |
13 |
11 |
8×105 |
11 |
8 |
1×106 |
19 |
15 |
1×108 |
5 |
5 |
Dataset3: Salmonella (Nontyphoid Strains) and infection in human volunteers (p399,[12]) |
Ds(mean dose) |
N(total) |
y(infected) |
1.52×105 |
6 |
3 |
3.85×105 |
8 |
6 |
1.35×106 |
6 |
6 |
1.39×105 |
6 |
3 |
7.05×105 |
6 |
4 |
1.66×106 |
6 |
4 |
1.5×107 |
6 |
4 |
1.25×105 |
6 |
5 |
6.95×105 |
6 |
6 |
1.7×106 |
6 |
5 |
1.2×104 |
5 |
2 |
2.4×104 |
6 |
3 |
6.6×104 |
6 |
4 |
1.41×105 |
6 |
3 |
2.56×105 |
6 |
5 |
5.87×105 |
6 |
4 |
8.6×105 |
6 |
6 |
8.9×104 |
6 |
5 |
4.48×105 |
6 |
4 |
1.04×106 |
6 |
6 |
3.9×106 |
6 |
4 |
1×107 |
6 |
6 |
2.39×107 |
6 |
5 |
4.45×107 |
6 |
6 |
6.73×107 |
8 |
8 |
1.26×106 |
6 |
6 |
4.68×106 |
6 |
6 |
1.2×104 |
6 |
3 |
2.4×104 |
6 |
4 |
5.2×104 |
6 |
3 |
9.6×104 |
6 |
3 |
1.55×105 |
6 |
5 |
3×105 |
6 |
6 |
7.2×105 |
5 |
4 |
1.15×106 |
6 |
6 |
5.5×106 |
6 |
5 |
2.4×107 |
5 |
5 |
5×107 |
6 |
6 |
1×106 |
6 |
6 |
5.5×106 |
6 |
6 |
1×107 |
6 |
5 |
2×107 |
6 |
6 |
4.1×107 |
6 |
6 |
1.5×106 |
6 |
5 |
7.68×106 |
6 |
6 |
1×107 |
6 |
5 |
1.58×105 |
6 |
1 |
Dataset4: (292 and 295 pooling) Listeria monocytogenes and infection in mice [2] |
Ds(mean dose) |
N(total) |
y(infected) |
2 |
6 |
0 |
5 |
6 |
1 |
1.1 102 |
6 |
2 |
5.5×103 |
10 |
7 |
3.24×104 |
10 |
7 |
3.9×104 |
6 |
4 |
5.5×104 |
10 |
9 |
2.51×105 |
10 |
10 |
5.5×105 |
10 |
10 |
2.82×106 |
10 |
10 |
Ds= {D_1,D_2,…,D_m } representsm mean dose levels (i.e., the average number of organisms per dose or average concentrations) applied tom exposure groups; N
= {N_1,N_2,…,N_m }where N_i represents the number of individuals participated in the ith exposure group (corresponding to mean dose level D_i); y = {y_1,y_2,…
,y_m }where y_irepresents the number of infected individuals in the ith exposure group.
Table 5:
This is the most important feature to note in the comparison of
these two algorithms. With the new algorithm, the resulting
confidence band is critically dependent on the choice of Nsim. By
choosing Nsim= max(N) as proposed in table II and exemplified in
table 3, the confidence band based on the new algorithm provides
a lower bound of the band-width in estimation of the prediction
confidence levels (i.e., the confidence band for prediction should
be at least as wide). As expected, in all datasets we examined, the
existing algorithm band was narrower than the analogous band
based on the new algorithm within the observed mean dose
level range. This provides strong empirical evidence to confirm
the theoretical property concluded in the last section that the
confidence bands based on the algorithm currently proposed in
the literature are not confidence bands for prediction.

Figure 1: Comparison of bootstrap confidence bands (dataset 1). Plotting scheme: circles represent the data points; solid lines depict the estimatedmedian
, and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when N
sim→ ∞).

Figure 2: Comparison of bootstrap confidence bands (dataset 1, approximate beta-Poisson model). Plotting scheme: circles represent the data points; solid lines depict the estimated median
, and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when N
sim→ ∞); the thin red dashed line is the maximum risk curve in finite sample case (i.e., when N
sim is finite), a bootstrap 97.5 percentile line.
Another important feature to note in Figure1 is that,
except for the confidence band of the exact beta-Poisson model
based on the existing algorithm (the green dot-dashed lines in
figure 1(b)), all three other confidence band upper limit lines
exceed the maximum risk line (bold red dashed line) in the verylow
dose region (the bottom left corner).This exceedance issue is
further illustrated and discussed in figure2.
An approximate beta-Poisson model can perform poorly
in that its predicted P_I (d) curve may go above the maximum risk
line (the bold red dashed line); figure1(a) essentially reproduced
the result reported by Teunis in 2000[10]. Figure 2(a) is a
reproduction of figure 1(a) but with one extra curve added in –
the thin red dashed line in the farthest left part of the plot.
It is noted that, in the bottom left corner region of figure
2(a), the upper dotted line exceeds the upper dot-dashed line
which is already above the bold red dashed line. However, the
dotted line remains under the thin red dashed line. Therefore,
the confidence band based on the new algorithm (dotted
lines) does not fail here, because it is a sample-size dependent
confidence band and the thin red dashed line is the finite sample
maximum risk line (choosing the same Nsim values, as shown in
table 3, to take account of the sampling variation in estimating the
maximum risk level). On the other hand, the existing algorithm
for computing the approximate beta-Poisson confidence band
(dot-dashed lines) does violate the maximum risk limit for MLEs
are obtained under the large sample assumption and the bold
red-dashed line (Nsim→ ∞) applies as the maximum risk limit.
Figure 2(b) is drawn to verify how the new algorithm confidence
band will change as Nsim changes (Nsim = {200}, compared with
Nsim = {11} in figure 2(a)). Figure2(b) shows that the thin red
dashed line moves much closer to the bold red dashed line and
the new algorithm band (dotted lines) shrinks significantly, with
the upper dotted line being always under the thin-red dashed
line. It is noted that the (existing algorithm) upper dot-dashed
line even exceeds the thin red dashed line in figure 2(b) which
confirms the finding reported by Teunis [10]: the maximum risk
limit property is not retained in the approximate beta-Poisson
model. Noting the fact that N_i is seldom more than 20 in the real
dose-response data, it is, therefore, necessary to take into account
the sampling variation effects in determination of the maximum
risk limits using Equation (1).
As shown by figure 1(b), by ignoring the sampling
variation effects, the existing algorithm could underestimate
the maximum risk in the very low dose level range (e.g., mean/
effective dose d< 0.05or log(mean/effective dose) = -3 ) which is
of particular interest (and concern) in microbial risk assessment
practice.

Figure 3: Comparison of bootstrap confidence bands (dataset 2). Plotting scheme: circles represent the data points; solid lines depict the estimated median
, and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the
new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when N
sim→ ∞).
Results and Discussions for datasets 2, 3, and 4
Figures 3 through to 7 are produced for the analysis of
datasets 2, 3, and 4 following the same procedures as exemplified
with dataset 1. Similar to figure1, figure 3 shows that the median
PI (d) lines match very closely for both algorithms and the black
dotted lines (the new algorithm) form a wider confidence band
than the band formed by green dot-dashed lines (the existing
algorithm) over the observed mean dose level range. In contrast
toFigure1, however, data points in figure 3 only spread in the
high dose level range (log(mean dose) > 6.5). In figure 3, the new
algorithm band is much narrower than the existing algorithm
band outside the observed mean dose range (roughly where
log(mean dose) < 6.5). It is also noted that the new algorithm
bands are well under the limiting maximum risk (bold red
dashed) line. In contrast, the approximate beta-Poisson model
confidence band based on the existing algorithm (figure 3(a))
badly exceeds the maximum risk line in the lower mean dose
level range (roughly where log(mean dose) < -2). With a 95%
confidence band, we expect to observe, on average, only one out
of 20 data points falling outside the band. However, we notice
that, with six data points, the existing algorithm confidence band
clearly missed one point and another point is only marginally
included. In contrast, all six data points are well included in the
new algorithm band.
Figures 4 and 5 are drawn in order to show the impacts
of the sampling variation on the new algorithm confidence band
while the existing algorithm band should remain unchanged.
The sampling variation is determined by the sample size Nsim.
The plots in figure 4(a), (b), (c), and (d), show how the bands,
constructed based on the new algorithm, change as the sample
size Nsim changes.

Figure 4: Comparison of bootstrap confidence bands by setting N
sim to different values (dataset 2, approximate beta-Poisson model). Plotting scheme: circles represent the data points; solid lines depict the estimated median
, and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines
(the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum
risk curve (i.e., when N
sim→ ∞); the thin red dashed line is the maximum risk curve in the finite sample case (i.e., when N
sim is finite), a bootstrap 97.5
percentile line.

Figure 5: Comparison of bootstrap confidence bands (dataset 2, exact beta-Poisson model, N
sim={2} ). Plotting scheme: circles represent the data points; solid lines depict the estimated median
, and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm
in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when
N
sim→ ∞).
Note that the thin red dashed lines move away / diverge from the
bold red dashed lines as the sample size decreases. Figure5 is with
the Nsim = {2} setting, i.e., it is assumed that only two subjects are
exposed to hazard at each mean dose level. Settings in Figures4
and 5 are supposed to cover most of the real life experimental
scenarios. Figure4 examines the approximate beta-Poisson
model situation and Figure5 examines the exact beta-Poisson
model situation. All these cases have shown a consistent pattern
– the existing algorithm confidence band is possibly too wide
outside the observed mean dose level range. Since the parameter
estimation results are
(approximate beta-
Poisson model [10]) for dataset 2. This implies that the confidence
band for dataset 2, no matter whether the parameter estimates
are obtained from Equation (3) or Equation (5), should almost
always be valid. While this numeric evidence is consistent with the
new algorithm bands as shown in figures3, 4, and 5, the existing
approximate beta-Poisson model bands clearly contradict this
expectation. The result of the deviance difference analysis (not
reported in this article) also suggested that, with dataset 2, the
approximate beta-Poisson model is a very good approximation.
All these assessments strongly suggest that the confidence band
based on the existing algorithm may not be a proper estimation of
the degree of uncertainty of the dose-response relation in these
circumstances. Figures 6 and 7, which present the confidence
band comparison for datasets 3 and 4, provide more evidence
to show the comparatively poor performance of the confidence
bands based on the existing algorithm.

Figure 6: Comparison of bootstrap confidence bands (dataset 3, exact beta-Poisson model). Plotting scheme: circles represent the data points; solid lines depict the estimated median
,and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour)
or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when N
sim→ ∞)

Figure 7: Comparison of bootstrap confidence bands (dataset 4, exact beta-Poisson model). Plotting scheme: circles represent the data points; olid lines depict the estimated median
,and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour)
or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when N
sim→ ∞).
Figure 6 (a) is essentially a reproduction of figure 9-2 in
[12] (p402) and figure 6(b) shows the corresponding confidence
band based on the new algorithm. It is obvious that only a
minority (roughly 10 out 47) data points are included in the
existing algorithm band while the new algorithm band contains
45 out of 47 data points as expected. Similar comparisons can be
made for dataset 4 as shown in figure 7(a) and (b) -- 5 out of 10
data points are outside the existing algorithm band while all data
points are contained in the new algorithm band.
The literature dose-response analysis results of dataset
4 can be found in [2].
http://qmrawiki.msu.edu/index.php?title=Listeria_monocytoge
nes_%28Infection%29%3A_Dose_Response_Models
(accessed on 23/04/2017). In the analysis, the
author(s) compared the fit of the beta-Poisson and exponential
models to the data and concluded that the beta-Poisson model
was a adequate model. Figure 7 depicts the results of the doseresponse
analysis with dataset 4. Clearly, the new algorithm has
provided a much better fit.
Conclusions
A new parametric bootstrap confidence band algorithm
is proposed for the popular QMRA beta-Poisson dose-response
model in this study. The resulting confidence band is sample-size
dependent (i.e., Nsim-dependent)and it provides the lower bound
of the band-width for estimating the true confidence level for
prediction. Through an intensive Monte Carlo simulation study,
the performances of the confidence bands based on the existing
literature algorithm and the new algorithm are compared by
fitting the beta-Poisson dose-response models to four well
studied datasets. It has been shown that, asymptotically (i.e., as
Nsim→ ∞) the estimated median dose-response curve based on the
new algorithm converges to the existing algorithm median curve;
in finite Nsim case, these two curves differ only in smoothness
(the existing algorithm curve is always smoother). The new
algorithm confidence band substantially improves the existing
algorithm band in data point coverage. Due to the big sampling
variation in proportion data under small sample conditions,
the existing algorithm is likely to underestimate the maximum
risk limit of dose-response data. On the other hand, outside the
observed mean dose level range, the existing algorithm tends to
overestimate the upper bound of the confidence band. This study
has shown that the new algorithm is able to facilitate both types
of bias. Therefore, the confidence band for prediction based on
the new algorithm is a better representation of the degree of
uncertainty of an estimated dose-response relation. The upper
limit of the estimated prediction confidence band could be used
as a better estimate for the worst case scenario in risk assessment.
Acknowledgements
The early version of this article was finished in 2015
when the author worked for the pond project which was funded by
Department of Science, Information Technology and Innovation
(DSITI) of Queensland government, Australia. The author would
like to thank Professor Kerrie Mengersen (Queensland University
of Technology) for her valuable feedback on the early version of
the draft manuscript.
- Haas CN, JB Rose,CP Gerba. Quantitative Microbial Risk Assessment. second ed. New York: John Wiley & Sons;INC2014.
- Rose J. RA Wiki 23 January, 2015 [cited 2017 February]; Quantitative Microbial Risk Assessment (QMRA) Wiki] Available from: http://qmrawiki.canr.msu.edu/index.php/Quantitative_Microbial_Risk_Assessment_%28QMRA%29_Wiki
- Mara DD. Sleigh PA, Blumenthal UJ, Carr RM. Health risks in wastewater irrigation: comparing estimates from quantitative microbial risk analyses and epidemiological studies. J Water Health. 2007;5(1): 39-50.
- Karavarsamis N, AJ Hamilto. Estimators of annual probability of infection for quantitative microbial risk assessment. Journal of Water and Health, 2010;8(2): 365-373.
- Teunis P, Takumi K, and Shinagawa K. Dose response for infection by Escherichia coli O157: H7 from outbreak data. Risk Analysis. 2004;24(2): 401-407.
- Xie G, Stratton H, Lemckert C, Dunn PK, Mengersen K. A Generalized QMRA Beta‐Poisson Dose‐Response Model. Risk Analysis, 2016;36(10):1948-1958. doi:10.1111/risa.12561
- The Interagency MRA Guideline Workgroup, Microbial Risk Assessment Guideline, Pathogenic Microorganisms with Focus on Food and Water. 2012, U.S. Environment Protection Agency (EPA) and U.S. Department of Agriculture/Food Safety and Inspection Service (USDA/FSIS).
- Teunis P, O G. van der Heijden, JWB van der Giessen, A H Havelaar. The dose-response relation in human volunteers for gastro-intestinal pathogens. National Institute of Public Health and the Environment (RIVM); The Netherlands:1996.
- Efron B. RJ. Tibshirani. An introduction to the bootstrap. Vol. 57;1994: CRC press.
- Teunis PF, Havelaar AH. The Beta Poisson dose‐response model is not a single‐hit model. Risk Analysis. 2000;20(4): 513-520.
- R Development Core Team. R: A Language and Environment for Statistical Computing. 2014. Available from: http://www.R-project.org
- Haas CN, JB Rose, CP Gerba. Quantitative Microbial Risk Assessment. first ed. New York:John Wiley & Sons, INC;1999.
- Kreyszig E. Advanced engineering mathematics. 10th edition ed:John Wiley & Sons;2011.
- Haas CN. Conditional dose-response relationships for microorganisms: Development and application. Risk analysis. 2002; 22(3):455-463.
- Muller KE, Computing the confluent hypergeometric function, M (a, b, x). Numerische Mathematik. 2001;90(1): 179-196. doi:10.1007/s002110100285
- Butler RW, Andrew T. A. Wood. Laplace Approximations for Hypergeometric Functions with Matrix Argument. The Annals of Statistics. 2002;30(4):1155-1177.
- Furumoto WA, Mickey R. A mathematical model for the infectivity-dilution curve of tobacco mosaic virus: Theoretical considerations. Virology. 1967;32(2):216-223. doi:10.1016/0042-6822(67)90272-3
- McCullagh P, J Nelder. Generalized linear models. Monographs on statistics and applied probability. Second edition London; Chapman and Hall:1989.
- Morgan B.J. Analysis of quantal response data. Vol. 46. 1992: CRC Press.
- Hankin, RK. Package ‘gsl’. 2013.