Open Access

Steady state trials: another valid substitution of counterfactual ideal to measure causal effects

Environmental Health and Preventive Medicine 2012 18:312

Received: 18 July 2012

Accepted: 9 October 2012

Published: 31 October 2012



Many traditionally established medical interventions have not been examined in randomized trials, especially in emergency medicine. We examined the scientific basis for measuring the causal effects of these interventions and propose another type of trial to measure causal effects.


We deduced steady state trials from the counterfactual model and used Bayesian approaches to estimate causal effects statistically.


When the state of the observed person is fairly steady before an exposure, the ratio of the after-period to the before-period of the exposure is sufficiently small, and changes are obtained in a relatively short time, it is possible to postulate that the state of the counterfactual person to be compared is almost equal to the state of the real person before the exposure. Bayesian approaches show that the causal effect of the exposure can be estimated even in a one-person steady state trial when large changes are observed.


Steady state trials are valid methods to measure causal effects and can measure causal effects even in one-person trials. When we can measure the causal effect of interventions with steady state trials, these interventions should be regarded as scientific without the use of randomized trials.


Keywords: Cross-over trials, Counterfactual model, Steady state, Period ratio, Individual causal effect


Evidence-based medicine (EBM) appeared as a handy tool kit for clinicians who had not understood the basic thinking of epidemiology [1]. After the advocates of EBM succeeded in nominating randomized trials to be paramount [2], the so-called “Hierarchy of Strength of Evidence” towered in medical practice and many clinical guidelines prostrated themselves in front of the pyramid [3, 4]. Many traditionally established medical interventions were stripped of their rank because the evidence for them came from observational studies. Under these circumstances, Smith and Pell [5] sarcastically asked why the protagonists of EBM did not participate in a randomized trial of parachute use.

In epidemiological studies, the counterfactual or potential-outcome model has become increasingly standard for causal inference [6–8]. However, the theoretical ideal for measuring the causal effects of an exposure is unattainable. To achieve a valid substitution for the counterfactual experience, we resort to various design methods that promote comparability. One approach is a cross-over study and another is a randomized trial. Other approaches might involve choosing unexposed study subjects who have the same or similar risk-factor profiles for disease as the exposed subjects [9]. The case-crossover design was introduced for estimating a short-term, transient effect of intermittent exposures on acute-onset diseases [10, 11]. For each case, one or more predisease or postdisease time periods are selected as matched control periods for the case. The exposure status of the case at the time of the disease onset is compared with the distribution of exposure status for the same person in the control periods. The key feature of the case-crossover design is that each case serves as its own control. In this paper, we expand this key feature and propose another valid substitution of the counterfactual ideal to measure causal effects, and we show that parachute use and many interventions in emergency medicine have a scientific basis for causal inference without randomized trials.

Materials and methods

We deduce steady state trials from the counterfactual model. The scheme is presented in Fig. 1. Bayesian methods are used to estimate causal effects statistically [12, 13] (see appendix). Posterior distributions are computed with WinBUGS version 1.4.3, which reports two-sided equi-tail-area credible intervals [14]. We use these intervals for convenience, although highest posterior density intervals are preferable.
Fig. 1

Counterfactual model. We establish a hypothetical person in the counterfactual world in order to compare the outcome of the exposed person with the outcome of the unexposed person. After the exposure, both the conditions of the exposed person and the unexposed person are observed at the same time. As the only difference between the two settings is the exposure, it is possible to measure the effect of the exposure


Steady state trials

For the purpose of discussion, symbols are defined as follows:

\( T_0 \): the time when the observation starts

\( T_1 \): the time when the exposure is done

\( T_2 \): the time when the outcome is observed

\( B = T_1 - T_0 \): the period before the exposure

\( A = T_2 - T_1 \): the period after the exposure

n: the integer which gives the ratio of A to B, A:B = 1:n

S: the state of the observed person, which is a function of time

X: the state S just before the time \( T_1 \)

Y: the state S at the time \( T_2 \)

Z: the state of the counterfactual ideal of the unexposed person, which is a function of time

W: the state Z at the time \( T_2 \).

Steady state trials begin with the observation of the state of the observed person (Fig. 2). Suppose the state is almost steady during the period B (Fig. 3). That is, the derivative of the state with respect to time during the period B is
$$ \frac{\text{d}S}{\text{d}t} = k + \delta, $$
where k is a constant and δ is noise which follows the normal distribution \( N(0, \sigma^2) \). We observe the state S (n + 1) times at intervals of the period A and obtain n sample noises (\( \delta_i \); i = 1, 2, …, n) during the period B. Just before the exposure, the state is recorded as X. When we observe Y at the end of the period A after the exposure, we get the mean value of \( \frac{\text{d}S}{\text{d}t} \) during the period A:
$$ \frac{\text{d}S}{\text{d}t} = (Y - X)/A. $$
Fig. 2

Steady state trials. S: the state; X: S just before the exposure; Y: S at the end of the period A; W: the counterfactual state at the end of the period A. S is steady during the period B

Fig. 3

Derivative of the state. Period A:Period B = 1:n, where n is a positive integer. \( \frac{\text{d}S}{\text{d}t} \) is the derivative of S with respect to time; \( \frac{\text{d}S}{\text{d}t} = k + \delta \), where k is constant and δ (\( \delta_i \); i = 1, 2, …, n) is noise which follows the normal distribution. We observe the state (n + 1) times at intervals of the period A during the period B. The counterfactual derivative of the state is postulated as (k + δz), where δz follows the same distribution as the \( \delta_i \)

When the ratio of the period A to the period B is sufficiently small, i.e., n is sufficiently large, we can postulate that the derivative of the counterfactual unexposed state is k plus noise δz, which follows the same distribution as the \( \delta_i \):
$$ \frac{{{\text{d}}Z}}{{{\text{d}}t}} = k + \delta z. $$

However, we cannot actually observe \( \frac{\text{d}Z}{\text{d}t} \), so the value of (k + δz) is replaced with the observed values of \( k + \delta_i \). In order to estimate the difference between \( (Y - X)/A \) and \( k + \delta_i \), we postulate that \( (Y - X)/A \) follows the normal distribution with the same variance \( \sigma^2 \), which is estimated by the sample variance of \( k + \delta_i \). Then the difference between \( (Y - X)/A \) and \( k + \delta_i \) can be statistically estimated with the t distribution. When the outcome Y differs in kind from the state X, a nominal scale is applied.
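The comparison described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function name `steady_state_t` is hypothetical, and the pre-exposure slopes are reconstructed from the deviations listed in the adrenalin example later in the paper (each deviation plus the stated mean of 1), since the original table values are not available here.

```python
import math
import statistics


def steady_state_t(pre_slopes, post_slope):
    """Compare one post-exposure slope with n pre-exposure slopes.

    The variance of the single post-exposure slope cannot be estimated,
    so it is assumed equal to the sample variance of the pre-exposure
    slopes, as postulated in the text.
    """
    n = len(pre_slopes)
    mean_pre = statistics.mean(pre_slopes)
    var_pre = statistics.variance(pre_slopes)  # denominator n - 1
    # Standard error of the difference between one value and a mean of n
    std_err = math.sqrt(var_pre * (1 / 1 + 1 / n))
    t_stat = (post_slope - mean_pre) / std_err
    dof = 1 + n - 2
    return t_stat, dof


# Assumed slopes (mmHg/min): deviations -1, -2, 0, 2, -1, 0, 1, 0, 0, 1 plus mean 1
slopes_before = [0, -1, 1, 3, 0, 1, 2, 1, 1, 2]
t_stat, dof = steady_state_t(slopes_before, 32)
print(f"t = {t_stat:.1f} on {dof} degrees of freedom")
```

The P value is left out because the Python standard library has no t distribution; a statistics package would be needed for that step.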

Statistical inference of causal effects

Suppose that we observe an outcome Y which belongs to a category different from that of the state X and the change is of practical importance, or that the difference between \( (Y - X)/A \) and \( k + \delta_i \) is statistically significant and large enough to be of practical importance. We now discuss the causation of the incidence of such an important outcome Y, which we designate \( Y_{\text{imp}} \) in the following discussion.

The probability that \( Y_{\text{imp}} \) happens during the period A with the exposure is represented by the letter θe:
$$ \theta e = \, P(Y_{\text{imp}} \, | \, E, \, C), $$
where E is the exposure, C is the condition that the state is steady during the period B, and the vertical line represents conditioning. We postulate that \( Y_{\text{imp}} \) is a Bernoulli variable during the period A. The probability that \( Y_{\text{imp}} \) happens during the period A without the exposure (¬E) is represented by the letter θu:
$$ \theta u = \, P(Y_{\text{imp}} \, | \, \neg E, \, C). $$

As the state is steady and the period A is small relative to the period B, we can postulate that the counterfactual condition of the unexposed state in the period A is equivalent to the real condition in the period B, and that the probability that \( Y_{\text{imp}} \) happens within a time span of length A during the period B is equal to θu. The period B then consists of a sequence of n repetitions of a trial with constant probability θu.

Suppose that we observe \( Y_{\text{imp}} \) after the exposure and there is no incidence of \( Y_{\text{imp}} \) during the period B in one steady state trial. The components of the Bayesian model for steady state trials can be written as follows:
$$ \begin{aligned} \text{prior distribution} \quad & \theta e \sim \text{Beta}(\alpha_1, \beta_1) \\ & \theta u \sim \text{Beta}(\alpha_2, \beta_2) \\ \text{likelihood} \quad & p(ye \mid \theta e) = \theta e \\ & p(yu \mid \theta u) = (1 - \theta u)^{n} \\ \text{posterior distribution} \quad & \Delta = (\theta e \mid ye) - (\theta u \mid yu), \end{aligned} $$
where ye is the occurrence of \( Y_{\text{imp}} \) in one trial under the exposure, yu is the absence of \( Y_{\text{imp}} \) in n trials under the non-exposure, and Δ is the difference between θe and θu given the trial evidence. Ideally, the likelihood of no occurrence under the non-exposure should be computed from n trials in the real world and one trial in the counterfactual world. The trial in the counterfactual world cannot be observed. Thus, we approximate this likelihood with the n trials in the real world. The posterior distribution of Δ is computed with WinBUGS.
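Because the Beta prior is conjugate to this Bernoulli likelihood, the two posteriors can also be written in closed form. The following is the standard conjugacy result rather than a step stated in the paper, and it assumes that θe and θu are independent a priori:

$$ \begin{aligned} \theta e \mid ye &\sim \text{Beta}(\alpha_1 + 1, \beta_1), \\ \theta u \mid yu &\sim \text{Beta}(\alpha_2, \beta_2 + n). \end{aligned} $$

The posterior of Δ is then the distribution of the difference of two independent Beta variables, which is usually obtained by simulation.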

When we are uncertain about the prior distribution, we adopt Beta(0.5, 0.5) or Beta(1, 1) as the reference prior distribution for θe and θu. The posterior distribution of θu shifts toward zero as n becomes larger. With the reference prior Beta(0.5, 0.5) for θe and θu, the lower limit of the 95 % credible interval of Δ is above zero when n is four or more. With the reference prior Beta(1, 1) for θe and θu, the lower limit of the 95 % credible interval of Δ is above zero when n is seven or more. Classical statistical approaches also show similar results [15, 16]. The larger n is, the more credibility we gain in the inference of the causal effect. However, the lower limit of the 95 % credible interval of Δ cannot exceed 0.15 with the prior Beta(0.5, 0.5) or 0.16 with the prior Beta(1, 1), however large n becomes. This is the limitation of one-person trials. Population studies with many persons can show larger lower limits of the credible interval, if the success proportion is high.
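The interval for Δ can be approximated without WinBUGS by direct simulation. The following Python sketch is not the authors' WinBUGS model: it assumes independent Beta priors, uses the conjugate Beta posteriors for θe and θu, and draws from them with the standard library. The function name `delta_interval` and its parameters are illustrative choices.

```python
import random


def delta_interval(n, prior_e=(0.5, 0.5), prior_u=(0.5, 0.5),
                   draws=100_000, seed=0):
    """Equal-tailed 95 % interval for delta = theta_e - theta_u (Monte Carlo).

    Data assumed: one occurrence of the outcome in one exposed trial and
    no occurrence in n unexposed trials. With Beta priors, both posteriors
    are again Beta distributions.
    """
    rng = random.Random(seed)
    a_e, b_e = prior_e[0] + 1, prior_e[1]   # one success under exposure
    a_u, b_u = prior_u[0], prior_u[1] + n   # n failures without exposure
    deltas = sorted(
        rng.betavariate(a_e, b_e) - rng.betavariate(a_u, b_u)
        for _ in range(draws)
    )
    lower = deltas[int(round(0.025 * draws))]
    upper = deltas[int(round(0.975 * draws)) - 1]
    return lower, upper
```

Calling `delta_interval(7, prior_e=(1, 1), prior_u=(1, 1))` gives a simulated interval for the uniform prior with n = 7. Results vary slightly with the seed and the number of draws, so values close to a threshold should be read with that Monte Carlo error in mind.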

Steady state implies that the previous observations of the same condition showed no incidence of \( Y_{\text{imp}} \) without the exposure. When we believe the previous evidence for the no-incidence of \( Y_{\text{imp}} \) under the non-exposure, we can adopt, for example, Beta(1, 1000000) as the prior distribution for θu. Adopting the almost null prior distribution Beta(1, 1000000) for θu means that p(Δ) is practically equal to p(θe|ye).
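This near-null case also explains the ceiling on the lower limit for one-person trials. As a worked check under the Beta(1, 1) prior for θe (a derivation added for illustration, not taken from the paper), one observed \( Y_{\text{imp}} \) gives the posterior Beta(2, 1), whose distribution function is \( F(x) = x^2 \). With θu practically zero, the lower limit of the equal-tailed 95 % interval of Δ is the 2.5 % quantile of θe:

$$ F(x) = x^{2} = 0.025 \quad \Rightarrow \quad x = \sqrt{0.025} \approx 0.158, $$

which agrees with the limit of about 0.16 stated for this prior.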

Relation to cross-over trials

The simple cross-over design is outlined by Armitage et al. [17]. With two treatments, F and G, one randomly chosen group of patients receives treatments in the order FG, while the other group receives them in the order GF. The active response that is common to all subjects in a particular group and particular period with the treatment received is modeled as follows:



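$$ \begin{array}{lll} & \text{Period 1} & \text{Period 2} \\ \text{Group I (FG)} & \mu + \tau_F + \pi_1 & \mu + \tau_G + \pi_2 + \gamma_{FG} \\ \text{Group II (GF)} & \mu + \tau_G + \pi_1 & \mu + \tau_F + \pi_2 + \gamma_{GF} \end{array} $$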

Here, μ is a general mean, the τ terms represent treatment effects, the π terms represent period effects, and the γ terms represent the treatment × period interaction.

When F is no treatment, τF and γFG are null and the model for group I is as follows:



$$ \begin{array}{lll} & \text{Period 1} & \text{Period 2} \\ \text{Group I (FG)} & \mu + \pi_1 & \mu + \tau_G + \pi_2 \end{array} $$

Suppose the ratio of period 2 to period 1 is 1:n. The constancy of \( \frac{\text{d}S}{\text{d}t} \) means that the period effect π1 is constant during period 1. Under the condition of state steadiness, which is confirmed by the (n + 1) observations during period 1, when period 2 immediately follows period 1 and n is sufficiently large, we can postulate that π2 is almost equal to (π1 + π1/n) in group I. The larger n is, the more we can believe the steadiness of the state and the approximation of π2. Then the difference of the response between the two periods is (τG + π1/n), and we can measure τG with the repeated observations of group I. Thus steady state trials can be considered variants of cross-over trials.
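The last step can be written out explicitly. Using the group I model with F as no treatment and the approximation π2 ≈ π1 + π1/n, the difference between the two period responses is:

$$ (\mu + \tau_G + \pi_2) - (\mu + \pi_1) = \tau_G + (\pi_2 - \pi_1) \approx \tau_G + \frac{\pi_1}{n}, $$

so the residual period term π1/n shrinks as n grows and the contrast approaches τG.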

Prerequisite for steady state trials

How large should n be? In the above model, we postulate that \( \frac{\text{d}Z}{\text{d}t} \) is equal to k, or that π2 is equal to (π1 + π1/n). When n is infinitely large, this postulation is reasonable. However, when n is only moderately large, the postulation is open to criticism. There are many biological parameters which show cyclical or periodic variations, for example follicle-stimulating hormone or luteinizing hormone levels in female blood plasma. Another criticism is that the observed variable might reach a critical point after the steady state and change drastically without any exposure. Before executing steady state trials in medicine, we have to examine the trial condition biologically for the possibility of cyclical or drastic state change. If some period ratio is thought to be critical, we have to avoid using such an n for steady state trials.


Discussion

We have deduced steady state trials (SSTs) from the counterfactual model, from which randomized controlled trials (RCTs) were also deduced. Although RCTs are thought to be paramount trials in recent clinical research, SSTs can also offer a valid method to measure causal effects when the state before the exposure is steady and large changes are immediately observed. The smaller the ratio of the after-period to the before-period is, the better we can rely on the measurement of the causal effect. When the after-period is relatively long, the measurements of SSTs may be confounded and RCTs should be considered in such situations. RCTs are also necessary when outcomes long after the exposure are important, even if SSTs show causal effects immediately.

Individual causal effects are defined as a contrast of the counterfactual outcomes. Because only one of those values is observed, it has been proposed that individual causal effects cannot be identified in epidemiological research [18, 19]. The epidemiologic principle is that a person may be exposed to an agent and then develop disease without there being any causal connection between exposure and disease [9]. SSTs show that we can measure individual causal effects in the condition where repeated observations are performed, the state before the exposure is steady, and large changes are immediately obtained after the exposure. This approach could open the door to individual causal inference, and other conditions for individual causation should be researched in epidemiology.

One example of steady state trials is parachute use in skydiving [5]. At a height of 4000 m, we jump into the sky and, after a few seconds, we are falling at the terminal velocity of 55 m/s. Within 3 s after opening the parachute, we usually fall at the next terminal velocity of 5 m/s. Now we record acceleration values at intervals of 3 s. Once we have reached the terminal velocity of 55 m/s, an acceleration of 0 m/s² is observed about twenty times before opening the parachute, and a deceleration of 17 m/s² for 3 s is observed once after opening the parachute. The \( Y_{\text{imp}} \) is the deceleration of 17 m/s² for 3 s. After sampling one successful skydive, the 95 % credible interval of the posterior distribution Δ with prior Beta(0.5, 0.5) is computed as 0.12–0.99. In 2010, 1308 members of the United States Parachute Association (USPA) reported skydiving injuries requiring medical attention [20]. During the same year, USPA members and first-time students made roughly 3 million jumps. These data may be translated into the following sample distribution.
$$ \begin{gathered} \text{Under the exposure} \quad 2998692\ Y_{\text{imp}}\text{s out of } 3000000 \text{ trials} \hfill \\ \text{Under the non-exposure} \quad \text{No } Y_{\text{imp}} \text{ out of } 3000000 \times 20 \text{ trials} \hfill \\ \end{gathered} $$

We adopt Beta(0.5, 0.5) as the prior distributions for θe and θu. However, with such a huge sample size, the posterior distribution is practically the same as when we believe the prior probability of θu is almost null. The 95 % credible interval of Δ is computed as 0.9995–0.9996 with WinBUGS. Classical statistics show that the 95 % confidence interval of the safe skydiving proportion is 0.99954–0.99959.
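The classical interval can be checked with a few lines of Python. The paper does not say which classical method was used; the sketch below assumes the simple normal-approximation (Wald) interval, which reproduces the reported limits to five decimals. The function name `wald_interval` is illustrative.

```python
import math


def wald_interval(successes, trials, z=1.959964):
    """Normal-approximation (Wald) 95 % confidence interval for a proportion."""
    p = successes / trials
    half_width = z * math.sqrt(p * (1 - p) / trials)
    return p - half_width, p + half_width


jumps = 3_000_000   # roughly 3 million jumps in 2010
injuries = 1308     # reported injuries requiring medical attention
low, high = wald_interval(jumps - injuries, jumps)
print(f"{low:.5f} to {high:.5f}")
```

For proportions this close to 1, other interval methods such as those compared in [16] are often preferred, although with 3 million trials the choice changes little here.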

SSTs are practicable in situations where immediate clinical responses are important, such as in the emergency room, where confounders are under the control of practitioners. Many treatments in emergency medicine have a long and good history of SSTs in innumerable persons and can be regarded as scientific medical interventions without RCTs, such as intravenous injection of glucose for patients in hypoglycemic coma, injection of adrenalin (epinephrine) for patients with anaphylactic shock, a tourniquet for bleeding patients, and so on.

For example, a one-person SST is presented for the use of adrenalin injection in an imaginary adult patient with anaphylactic shock. The data of the patient are shown in Table 1. The systolic blood pressure (SBP) was recorded at intervals of 1 min. The patient had an intravenous injection of 0.1 mg adrenalin at time 10 min. We first check whether \( \frac{\text{d}S}{\text{d}t} \) in the period B follows a normal distribution. If there are any outliers, an SST is not the choice for this trial. The data of the table can be considered as following a normal distribution. The mean of \( k + \delta_i \) is 1.

Table 1

Systolic blood pressure in an imaginary adult patient with anaphylactic shock

Time (min)    S (mmHg)    dS/dt (mmHg/min)

Systolic blood pressure of the patient was recovering slowly without treatments. The patient had an intravenous injection of 0.1 mg adrenalin at time 10 min

S: systolic blood pressure

The estimated variance of \( k + \delta_i \) is

$$ [(-1)^2 + (-2)^2 + 0^2 + 2^2 + (-1)^2 + 0^2 + 1^2 + 0^2 + 0^2 + 1^2]/(10 - 1) = 1.33. $$

The standard error of \( (Y - X)/A - (k + \delta_i) \) is

$$ \sqrt{1.33\,(1/1 + 1/10)} = 1.21. $$

The two-sample t statistic is

$$ (32 - 1)/1.21 = 25.6, $$

which follows the t distribution on (1 + 10 − 2) = 9 degrees of freedom. The P value is computed as \( < 10^{-8} \). In this statistical estimation, we postulate that \( (Y - X)/A \) follows the normal distribution with the same variance \( \sigma^2 \), which is estimated by the sample variance of \( k + \delta_i \). However, we do not really have data for the estimation of the variance of \( (Y - X)/A \). We propose the above P value as an informal index to be used as a measure of discrepancy between \( (Y - X)/A \) and \( k + \delta_i \). We recommend that this P value should be smaller than 0.001 to show discrepancy.

SBP over 90 mmHg is a clinically important value in shock, and we can consider that \( \frac{\text{d}S}{\text{d}t} \) of 32 mmHg/min for 1 min after the exposure is \( Y_{\text{imp}} \). With prior Beta(0.5, 0.5) for θe and θu, the 95 % credible interval of the posterior distribution Δ is computed as 0.09–0.99. When we have the prior information that \( Y_{\text{imp}} \) has not been observed in past unexposed states for 1000 min in total, we can adopt Beta(0.5, 1000.5) as the prior distribution for θu, and n can be smaller to infer causal effects, if we confirm that the unexposed state in the period B is in the same condition as the previously reported unexposed states. After observing one occurrence of \( Y_{\text{imp}} \) after the exposure, with prior Beta(0.5, 0.5) for θe, prior Beta(0.5, 1000.5) for θu and n = 1, the 95 % credible interval of the posterior distribution Δ is computed as 0.15–1.00. Practically, n will be more than two in order to confirm the equal conditions, even when we adopt an almost null prior for θu.

When we execute SSTs in population studies, if some period ratio is thought to be critical, the participants are divided into two or three groups to which different values of n are allocated. RCTs between the groups are also possible in this setting.

SSTs are also possible in treatment trials of neurodegenerative diseases whose patients almost always show progressive deterioration. For example, although RCTs have not been performed for levodopa therapy of Parkinson’s disease [21], neurologists will admit that symptoms that have lasted for several months in patients with early Parkinson’s disease almost always improve within a few days after receiving levodopa.

In preventive medicine, oral rehydration therapy is effective against diarrhea [22]. RCTs have compared oral rehydration with intravenous hydration [23]. SSTs can offer the measurement of the effect difference between the treated and the non-treated in acute stage diarrhea. Because of the necessity of controlling confounders, SSTs may be restricted within a narrow set of research topics in preventive medicine. However, if we have sufficient past databases of disease incidence and the derivative of the incidence rate with respect to time is constant, we may use SSTs for the measurement of effects of exposures which have a latent period of several years.




Acknowledgments

We are grateful to Prof. Akio Koizumi for useful comments.

Conflict of interest

The authors declare that they have no conflict of interest.

Open Access

This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.

Authors’ Affiliations

Iwami Neurological Clinic
Kyoto Industrial Health Association


References

  1. Evidence-Based Medicine Working Group. Evidence-based medicine. A new approach to teaching the practice of medicine. JAMA. 1992;268:2420–5.
  2. Sackett DL. Chapter 6: the principles behind the tactics of performing therapeutic trials. In: Haynes RB, Sackett DL, Guyatt GH, Tugwell P, editors. Clinical epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2006. p. 173–243.
  3. Straus SE, Glasziou P, Richardson WS, Haynes RB. Evidence-based medicine. How to practice and teach it. 4th ed. London: Elsevier Churchill Livingstone; 2011.
  4. Centre for evidence based medicine. Oxford centre for evidence-based medicine 2011 levels of evidence.
  5. Smith GC, Pell JP. Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials. BMJ. 2003;327:1459–61.
  6. Little RJ, Rubin DB. Causal effects in clinical and epidemiological studies via potential outcomes: concepts and analytical approaches. Annu Rev Public Health. 2000;21:121–45.
  7. Maldonado G, Greenland S. Estimating causal effects. Int J Epidemiol. 2002;31:422–9.
  8. Höfler M. Causal inference based on counterfactuals. BMC Med Res Methodol. 2005;5:28.
  9. Rothman KJ. Chapter 3: measuring disease occurrence and causal effects. In: Rothman KJ, editor. Epidemiology: an introduction. New York: Oxford University Press; 2002. p. 24–56.
  10. Maclure M. The case-crossover design: a method for studying transient effects on the risk of acute events. Am J Epidemiol. 1991;133:144–53.
  11. Maclure M, Mittleman MA. Should we use a case-crossover design? Annu Rev Public Health. 2000;21:193–221.
  12. Spiegelhalter DJ, Abrams KR, Myles JP. Bayesian approaches to clinical trials and health-care evaluation. Chichester: Wiley; 2004.
  13. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian data analysis. 2nd ed. New York: Chapman & Hall/CRC; 2004.
  14. Medical Research Council and Imperial College of Science, Technology and Medicine. WinBUGS PACKAGE.
  15. Miettinen O, Nurminen M. Comparative analysis of two rates. Stat Med. 1985;4:213–26.
  16. Newcombe RG. Interval estimation for the difference between independent proportions: comparison of eleven methods. Stat Med. 1998;17:873–90.
  17. Armitage P, Berry G, Matthews JNS. Cross-over trials in Chapter 18. In: Armitage P, Berry G, Matthews JNS, editors. Statistical methods in medical research. 4th ed. Oxford: Blackwell Science; 2002. p. 627–36.
  18. Hernán MA. A definition of causal effect for epidemiological research. J Epidemiol Community Health. 2004;58:265–71.
  19. Suzuki E, Komatsu H, Yorifuji T, Yamamoto E, Doi H, Tsuda T. Causal inference in medicine part I - counterfactual models - an approach to clarifying discussions in research and applied public health. Nihon Eiseigaku Zasshi. 2009;64:786–95. (in Japanese).
  20. United States Parachute Association. Skydiving safety.
  21. Ad Hoc Committee on the Guidelines for the Treatment of Parkinson’s Disease, Japanese Neurological Society. Guidelines for the Treatment of Parkinson’s Disease. Clin Neurol. 2002;42:421–94. (in Japanese).
  22. Victora CG, Bryce J, Fontaine O, Monasch R. Reducing deaths from diarrhea through oral rehydration therapy. Bull World Health Organ. 2000;78:1246–55.
  23. Bellemare S, Hartling L, Wiebe N, Russell K, Craig WR, McConnell D, et al. Oral rehydration versus intravenous therapy for treating dehydration due to gastroenteritis in children: a meta-analysis of randomised controlled trials. BMC Med. 2004;2:11.


© The Author(s) 2012