Forum: open-discussion

RE: "non-significant" one-tailed test [ Reply ]
By: Chris Street on 2014-09-11 00:10
[forum:41443]
Thank you for such an in-depth response! I see this is a conceptual problem (post-hoc theorizing), one that the statistics alone are not going to be able to resolve.

I suspect one could penalize the model that runs contrary to your prediction. That seems fairer than running both tests a priori, because the better theories make directional predictions. Thanks for the input Richard, it helps a lot =]
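
One hedged sketch of what that penalty could look like, assuming the directional prediction is encoded as prior odds that are then multiplied by the Bayes factors from ttestBF (the 1/10 penalty below is purely hypothetical, and y is the simulated data from the example further down):

> # Hypothetical sketch: down-weight the unpredicted (negative-effect) model
> # with prior odds before interpreting the Bayes factors.
> library(BayesFactor)
> bf = ttestBF(y, nullInterval = c(0, Inf))  # y as in the example below
> priorOdds = c(1, 1/10)                     # penalty on the contrary-direction model (hypothetical value)
> extractBF(bf)$bf * priorOdds               # posterior odds of each directional model vs the null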

RE: "non-significant" one-tailed test [ Reply ]
By: Richard Morey on 2014-09-10 23:39
[forum:41441]
From a Bayes factor perspective, a one-tailed test is simply a way of instantiating the model that the effect size is positive. This might then be compared to the null hypothesis of no difference, or any other model that we might like. Significance testing is limited because it is not a model comparison technique; Bayes factor, however, is, and we can simply compare whatever models are appropriate for the problem at hand.
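
As a minimal sketch of that flexibility (assuming the BayesFactor package and the same simulated data used in the example further down; the particular models compared here are only illustrations):

> # The one-tailed "model" is just the restriction to positive effect sizes.
> # Because the Bayes factors below share the same denominator (the point null),
> # dividing them compares the two numerator models directly.
> library(BayesFactor)
> set.seed(1)
> y = rnorm(10, -.7)
> bfPos  = ttestBF(y, nullInterval = c(0, Inf))[1]  # positive effect sizes vs the null
> bfFull = ttestBF(y)                               # unrestricted alternative vs the null
> bfPos / bfFull                                    # positive-effect model vs unrestricted model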

What is problematic - for both Bayesian and frequentist perspectives - is testing a model that was suggested by the data themselves. If you weren't considering the possibility of a negative effect before the experiment, using the data as an excuse to test this is questionable.

However, I would add that this simply underscores the importance of planning to test several possibilities. Testing both negative and positive effect sizes is easy in the BayesFactor package, so there's no reason to restrict your planning to a single test of the positive effect sizes vs the null. Here's an example:

> library(BayesFactor)
> set.seed(1)
> y = rnorm(10,-.7)
>
> # One-tailed, nonsignificant
> t.test(y,alternative = "greater")

One Sample t-test

data: y
t = -2.3002, df = 9, p-value = 0.9765
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
-1.020288 Inf
sample estimates:
mean of x
-0.5677972

> # Two-tailed, significant
> t.test(y)

One Sample t-test

data: y
t = -2.3002, df = 9, p-value = 0.04698
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-1.126194774 -0.009399663
sample estimates:
mean of x
-0.5677972

>
> # Both one-tailed tests, against the null
> ttestBF(y,nullInterval = c(0,Inf))
Bayes factor analysis
--------------
[1] Alt., r=0.707 0<d<Inf : 0.1226331 ±0%
[2] Alt., r=0.707 !(0<d<Inf) : 3.572891 ±0%

Against denominator:
Null, mu = 0
---
Bayes factor type: BFoneSample, JZS


Notice that the Bayes factor analysis shows a moderate amount of evidence for the negative effect sizes (in the row labeled [2]) and *against* the model with the positive effect sizes (row labeled [1]).
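
As a small follow-up sketch (reusing the same call, saved to an object this time): the two numerator models share the null as a common denominator, so dividing the rows gives a direct comparison of the negative-effect model to the positive-effect model.

> bf = ttestBF(y, nullInterval = c(0, Inf))
> bf[2] / bf[1]  # evidence for negative effect sizes relative to positive ones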

"non-significant" one-tailed test [ Reply ]
By: Chris Street on 2014-07-10 11:26
[forum:41438]
Hi All,

If you saw that subject line and managed to click on it without channeling the rage monster, well done to you.

This question is more on the theory side, I hope that's ok. Let's say you expect a particular drug to enhance physical fitness. You've craftily conducted a simple experiment that requires a t-test, and as a good, non-p-hacking scientist you decide to conduct a one-tailed t-test.

Much to your surprise, the means show the effect is in the opposite direction to the one you predicted: the drug has a large effect size, but it actually decreased performance rather than enhancing it. To talk about p-values briefly, we would say this is a non-significant finding, and we would be left in the unfortunate position of having to report that a large effect is spurious.

Under NHST that is entirely fine of course, because we expect the p-value to bounce around at random if the null is true, and perhaps there really is no effect. The question I have, though, is whether the Bayes factor (or other clever Bayesian approaches) might be able to tell us whether this sort of large effect in the direction counter to the hypothesis is (a) a real effect in the wrong direction - the drug really dampens performance - or (b) a spurious finding.

Any thoughts would be appreciated!
