Learning Objectives
SOURCE: Diva Jain
Statistics is the science of collecting, organizing, summarizing, analyzing, and interpreting data. Descriptive statistics refers to summarizing data with numbers or pictures, while inferential statistics is about making conclusions or decisions based on data. Inferential statistics uses data from a sample to make inferences about a population.
For example, if we are interested in the number of bacteria in a test tube, it is not possible to count every single individual cell. Instead, we could take a subset of the population and count it with a hemocytometer.
Think about the experiment of taking samples from the population and counting cells with a hemocytometer. What experimental details would help us get a more accurate estimate of the true population size from our samples?
The goal of statistics is to use imperfect information (our data) in order to infer facts, make decisions, and make predictions about the world.
A statistic is a number or value measured within some particular context.
It is essential to understand the context:
* Which data were collected?
* How and why were the data collected?
* On which individuals or entities were the data collected?
* What questions do we hope to answer from the data?
We use statistics to (hopefully) tease apart causation from correlation using frameworks (briefly!):
1. Hypothesis testing - do the data at hand sufficiently support a particular hypothesis?
2. Estimation - not just is there an effect, but how large is the effect (point estimate) and how confident are you in it?
3. Causal inference - ascribe causal relationships to associations between variables
As biostatisticians, we have very big goals:
To a statistician the TRUTH is a population – a collection of all individuals of a circumscribed type (e.g. all microbiologists in Winnipeg), or a process for generating these individuals (e.g., the expectation from the process of flipping a fair coin). The population is characterized by its parameters (e.g. mean, variance).
It is almost always impractical or impossible to study every individual in a population. As such, we often deal with a sample – a subset of a population. We characterize a sample by taking an estimate of population parameters.
So a major goal of a statistical analysis is how to go from conclusions about a sample, which we can measure and observe, to the population(s) we care about. In doing so we must worry about random differences between a sample and a population (known as Sampling Error), as well as any systematic issues in our sampling or measuring procedure that cause estimates to reliably differ from the population (known as Sampling Bias).
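To make sampling error concrete, here is a minimal sketch in R (all values are simulated, not from any real study): we treat a large set of simulated values as the "population," then look at how much the mean estimated from small samples bounces around the true parameter.

set.seed(11)
population <- rnorm(100000, mean = 50, sd = 10)   # a simulated "population"
mean(population)                                  # the population parameter (mu), very close to 50

# take 1000 samples of 20 individuals and estimate the mean from each
sample_means <- replicate(1000, mean(sample(population, size = 20)))
sd(sample_means)   # spread of the estimates around mu: sampling error (~ 10 / sqrt(20))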
A common use of statistics (especially biostatistics) is hypothesis testing, where we use estimates from samples to ask if data come from populations with different parameters. This typically relies on statistical models.
To build and evaluate models, we need to consider the type of data and the process that generates these data. Variables are things which differ among individuals (or sampling units) of our study. So, for example, species, genotype, temperature, or the drug concentration in the media are all variables.
We often need to distinguish between explanatory variables, which we think underlie or are associated with the biological process of interest, from response variables, the outcome we aim to understand. This distinction helps us build and consider our statistical model and relate the results to our biological motivation.
In the literal meaning of the terms, a parametric statistical test is one that makes assumptions about the parameters (defining properties) of the population distribution(s) from which one’s data are drawn, while a non-parametric test is one that makes no such assumptions. In this strict sense, “non-parametric” is essentially a null category, since virtually all statistical tests assume one thing or another about the properties of the source population(s).
For practical purposes, you can think of “parametric” as referring to tests that assume the underlying source population(s) to be normally distributed; they generally also assume that one’s measures derive from an equal-interval scale. And you can think of “non-parametric” as referring to tests that do not make these particular assumptions.
If data are perfectly normally distributed, the two sides of the curve are exact mirror images of each other and the three measures of central tendency [mean (average, \(\mu\)), median (middle number), and mode (most commonly observed value)] are all exactly the same, in the middle of the distribution.
Another important consideration is the variability (or amount of spread) in the data. You can see above that different normal distributions have different amounts of spread, shown above with \(\sigma^2\), which is shorthand for variance (the square of the standard deviation). Variance is calculated by finding the difference between every data point and the mean, squaring those differences, and taking the average of those numbers. The squares weigh outliers more heavily than points that are close to the mean and prevent values above the mean from neutralizing those below. Standard deviation (the square root of variance) is used more often, because it is in the same unit of measurement as the original data. A z-score tells us how many standard deviations a specific data point lies above or below the mean.
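As a small sketch with made-up numbers (not course data), these quantities are all one-liners in base R; note that var() and sd() use the sample (n − 1) versions of the formulas described above.

x <- c(4, 7, 7, 8, 9, 12, 15)                # a toy data vector
mean(x)                                       # mean
median(x)                                     # median
names(sort(table(x), decreasing = TRUE))[1]   # mode (base R has no built-in statistical mode)
var(x)                                        # variance (average squared deviation, n - 1 denominator)
sd(x)                                         # standard deviation = sqrt(var(x))
(x - mean(x)) / sd(x)                         # z-scores: distance from the mean in SD units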
There are three other aspects of the data to consider with regard to the normal distribution.
For today's purposes you can think about non-parametric statistics as ranked versions of the corresponding parametric tests, used when assumptions such as normality are not met. We will discuss this further as we proceed below.
SOURCE: Concepts and Applications of Inferential Statistics; Towards Data Science blog posts
In general, when we are analyzing data, what we are trying to do is define the relationship between our response variable(s) and our explanatory variable(s).
The specific type of statistics required depends on the type of data, i.e., whether you have numerical (quantitative) or categorical data.
Categorical data represent characteristics and can be thought of as ways to label the data. They can be broken down into two classes: nominal and ordinal. Nominal data have no quantitative value and no inherent order (e.g., growth or no growth of a population in a given drug concentration). By contrast, ordinal data represent discrete, ordered units (e.g., level of growth of a mutant compared to wildtype, such as +1, +2, etc.).
There are also two categories of numerical data. Numerical data can be discrete, if the data can only take on certain values; this type of data can be counted but not measured (e.g., the number of heads in 100 coin flips). By contrast, continuous data can be measured (e.g., CFU counts, growth rate).
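In R these data types map naturally onto factors and numeric vectors. A minimal sketch with invented values:

growth      <- factor(c("growth", "no growth", "growth"))             # nominal: labels, no order
growth_lvl  <- factor(c("+1", "+2", "+1"),
                      levels = c("+1", "+2", "+3"), ordered = TRUE)   # ordinal: ordered categories
n_heads     <- c(47L, 52L, 55L)                                       # discrete: counted values
growth_rate <- c(0.21, 0.35, 0.28)                                    # continuous: measured values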
SOURCE: Modern Dive, Chapter 10
In a hypothesis test, we use data from a sample to help us decide between two competing hypotheses about a population. We make these hypotheses more concrete by specifying them in terms of at least one population parameter of interest. We refer to the competing claims about the population as the null hypothesis, denoted by \(H_0\), and the alternative (or research) hypothesis, denoted by \(H_a\). The roles of these two hypotheses are NOT interchangeable.
The claim for which we seek significant evidence is assigned to the alternative hypothesis. The alternative is usually what the experimenter or researcher wants to establish or find evidence for. Usually, the null hypothesis is a claim that there really is “no effect” or “no difference.” In many cases, the null hypothesis represents the status quo or that nothing interesting is happening. We assess the strength of evidence by assuming the null hypothesis is true and determining how unlikely it would be to see sample results/statistics as extreme (or more extreme) as those in the original sample.
Hypothesis testing brings about many weird and incorrect notions in the scientific community and society at large. One reason for this is that statistics has traditionally been thought of as a magic box of algorithms and procedures for getting to results, and this is readily apparent if you do a Google search of “flowchart statistics hypothesis tests.” There are so many different complex ways to determine which test is appropriate.
You’ll see that we no longer need to rely on these complicated series of assumptions and procedures to conduct a hypothesis test. These methods were introduced in a time when computers weren’t powerful. Your cellphone (in 2016) has more power than the computers that sent NASA astronauts to the moon, after all.
We can actually break down ALL hypothesis tests into the following framework given by Allen Downey here:
From Allen’s blog post “There is still only one test”
Given a dataset, you compute a test statistic that measures the size of the apparent effect. For example, if you are describing a difference between two groups, the test statistic might be the absolute difference in means. I’ll call the test statistic from the observed data \(\delta^*\).
Next, you define a null hypothesis, which is a model of the world under the assumption that the effect is not real; for example, if you think there might be a difference between two groups, the null hypothesis would assume that there is no difference.
Your model of the null hypothesis should be stochastic; that is, capable of generating random datasets similar to the original dataset.
Now, the goal of classical hypothesis testing is to compute a p-value, which is the probability of seeing an effect as big as \(\delta^*\) under the null hypothesis. You can estimate the p-value by using your model of the null hypothesis to generate many simulated datasets. For each simulated dataset, compute the same test statistic you used on the actual data.
Finally, count the fraction of times the test statistic from the simulated data exceeds \(\delta^*\). This fraction approximates the p-value. If it’s sufficiently small, you can conclude that the apparent effect is unlikely to be due to chance (if you don’t believe that sentence, please read this).
That’s it. All hypothesis tests fit into this framework. The reason there are so many names for so many supposedly different tests is that each name corresponds to:
1. A test statistic,
2. A model of a null hypothesis, and usually,
3. An analytic method that computes or approximates the p-value.
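Here is a minimal sketch of that framework in R, using simulated data for two groups (the values and group sizes are invented for illustration):

set.seed(42)
group_a <- rnorm(20, mean = 10, sd = 2)
group_b <- rnorm(20, mean = 12, sd = 2)

# 1. the test statistic on the observed data (delta*)
delta_obs <- abs(mean(group_a) - mean(group_b))

# 2. a stochastic model of the null hypothesis: if there is no difference,
#    the group labels are exchangeable, so shuffle them and recompute the statistic
pooled <- c(group_a, group_b)
delta_null <- replicate(10000, {
  shuffled <- sample(pooled)
  abs(mean(shuffled[1:20]) - mean(shuffled[21:40]))
})

# 3. p-value: the fraction of simulated statistics at least as extreme as delta*
mean(delta_null >= delta_obs)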
To break this down a different way:
SOURCE: Brandvain Chapter 16: Hypothesis testing
Our goal in null hypothesis significance testing is to see if results are easily explained by sampling error. Let’s work through a concrete example:
So, say we did an experiment: we gave the Moderna Covid vaccine to 15,000 people and a placebo to 15,000 people. This experimental design is meant to approximate the following thought experiment:
Imagine one population that got the Covid vaccine and an otherwise identical population that did not. Calculate the parameters of interest (e.g., the probability of contracting Covid, the frequency of severe Covid among those who caught Covid, the frequency of severe reactions, etc.) in the vaccinated and unvaccinated populations. Compare these parameters across the two populations. Now let’s look at the estimates from the data!
Vaccine group: 11 cases of Covid, 0 severe cases. Placebo group: 185 cases of Covid, 30 severe cases. So did the vaccine work? There are certainly fewer Covid cases in the vaccine group.
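Turning those counts into estimates is just arithmetic; a quick sketch in R (efficacy here is defined as one minus the relative risk, a standard but simplified calculation):

p_vaccine <- 11 / 15000     # estimated probability of Covid in the vaccine group (~0.0007)
p_placebo <- 185 / 15000    # estimated probability of Covid in the placebo group (~0.012)
1 - p_vaccine / p_placebo   # estimated vaccine efficacy, roughly 0.94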
But these are estimates, NOT parameters. We didn’t look at entire populations; rather, these results came from a process of sampling – we sampled from a population. So, these results reflect all the things that make sample estimates differ from population parameters, as well as true differences between populations (if there are any). So, before beginning a vaccination campaign we want to know if the results are easily explained by something other than a real effect.
What leads samples to deviate from a population?
* Sampling bias
* Nonindependent sampling
* Sampling error
Our goal in null hypothesis significance testing is to see if results are easily explained by sampling error.
Step 1: Determine level of significance
Step 2: State the hypotheses
Step 3: State the decision rule
Step 4: Calculate test statistic
Step 5: Find the critical value
Step 6: Write a conclusion
We can think of these hypothesis-testing steps in the same context as a criminal trial, in which a choice between two contradictory claims must be made:
Now let’s compare that to how we look at a hypothesis test.
This sets your \(\alpha\) value. It says how far out in the tails you consider to be extreme and unlikely to occur by random chance. It is also your Type I error rate, the probability that you will reject a claim that is true. In practice it is almost always 0.05.
We will be willing to conclude in favor of the vaccine having an effect if the p-value is less than or equal to 0.05. The analogy to “beyond a reasonable doubt” in hypothesis testing is what is known as the significance level.
Give your null and alternative hypotheses in words or in symbols.
Convention:
\(H_0\) is always stated as: parameter = number
\(H_a\) has three possible forms:
1. parameter > number
2. parameter < number
3. parameter \(\neq\) number
Forms (1) and (2) are referred to as one-tailed tests; form (3) is referred to as a two-tailed test.
We initially assume that \(H_0\) is true. The null hypothesis \(H_0\) will be rejected (in favor of \(H_a\)) only if the sample evidence strongly suggests that \(H_0\) is false. If the sample does not provide such evidence, \(H_0\) will not be rejected.
Give a precise statement of what must happen to reject \(H_0\).
Often: We will reject \(H_0\) if the p-value \(\leq\) \(\alpha\) = 0.05.
The test statistic provides a measure of the compatibility between the null hypothesis and our data. The value we calculate depends on the specifics of the test we are conducting (which is based on the question we have and the data we collected).
The p-value in a one-sided test (like this one) tells you the probability of observing a test statistic as extreme as this one, if the null hypothesis is true.
In a two-sided test, the p-value tells you the probability of observing a value as extreme in either direction.
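For example, if the test statistic happens to follow a standard normal distribution under the null, the two p-values can be computed directly (z = 1.8 is a made-up value for illustration):

z <- 1.8                                # hypothetical observed test statistic
pnorm(z, lower.tail = FALSE)            # one-sided p-value, about 0.036
2 * pnorm(abs(z), lower.tail = FALSE)   # two-sided p-value, about 0.072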
Results that lead to the rejection of a null hypothesis are said to be statistically significant.
For this reason, statistical hypothesis tests are also referred to as tests of significance.
Statistical significance is an effect so large that it would rarely occur by chance alone.
Note that we never conclude that \(H_0\) is true. All we can say is that we have insufficient evidence to reject the null hypothesis. We never write (or say) that we accept the null hypothesis.
Gut instinct says that “Fail to reject \(H_0\)” should say “Accept \(H_0\)” but this technically is not correct.
Accepting \(H_0\) is the same as saying that a person is innocent. We cannot show that a person is innocent; we can only say that there was not enough substantial evidence to find the person guilty.
When you run a hypothesis test, you are the jury of the trial. You decide whether there is enough evidence to convince yourself that \(H_a\) is true (“the person is guilty”) or that there was not enough evidence to convince yourself \(H_a\) is true (“the person is not guilty”). You must convince yourself (using statistical arguments) which hypothesis is the correct one given the sample information.
The risk of error is the price researchers pay for basing an inference about a population on a sample. With any reasonable sample-based procedure, there is some chance that a Type I error will be made and some chance that a Type II error will occur.
Image source: unbiasedresearch.blogspot.com
If we are using sample data to make inferences about a population parameter, we run the risk of making a mistake. Obviously, we want to minimize our chance of error; we want a small probability of drawing an incorrect conclusion. A type I error is the rejection of a true null hypothesis (also known as a “false positive”), while a type II error is the failure to reject a false null hypothesis (also known as a “false negative”).
The probability of a Type I error occurring is denoted by \(\alpha\) and is called the significance level of a hypothesis test. The probability of a Type II error is denoted by \(\beta\). \(\alpha\) corresponds to the probability of rejecting \(H_0\) when, in fact, \(H_0\) is true. \(\beta\) corresponds to the probability of failing to reject \(H_0\) when, in fact, \(H_0\) is false. Ideally, we want \(\alpha\) = 0 and \(\beta\) = 0, meaning that the chance of making an error does not exist. When we have to use incomplete information (sample data), it is not possible to have both \(\alpha\) = 0 and \(\beta\) = 0. We will always have the possibility of at least one error existing when we use sample data.
Usually, what is done is that \(\alpha\) is set before the hypothesis test is conducted and then the evidence is judged against that significance level. Common values for \(\alpha\) are 0.05, 0.01, and 0.10. If \(\alpha\) = 0.05, we are using a testing procedure that, used over and over with different samples, rejects a TRUE null hypothesis five percent of the time.
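We can check that interpretation with a quick simulation (toy data, not the course dataset): when the null hypothesis is true, a test at \(\alpha\) = 0.05 should reject about 5% of the time.

set.seed(1)
# both groups are drawn from the same population, so the null hypothesis is true
p_values <- replicate(10000, t.test(rnorm(20), rnorm(20))$p.value)
mean(p_values <= 0.05)   # proportion of (false) rejections; should be close to 0.05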
So if we can set \(\alpha\) to be whatever we want, why choose 0.05 instead of 0.01 or even better 0.0000000000000001?
Well, a small \(\alpha\) means the test procedure requires the evidence against \(H_0\) to be very strong before we can reject \(H_0\). This means we will almost never reject \(H_0\) if \(\alpha\) is very small. If we almost never reject \(H_0\), the probability of a Type II error – failing to reject \(H_0\) when we should – will increase! Thus, as \(\alpha\) decreases, \(\beta\) increases, and as \(\alpha\) increases, \(\beta\) decreases. We therefore need to strike a balance, and 0.05, 0.01 and 0.1 usually lead to a nice balance.
The third part of this discussion is power. Power is the probability of not making a Type II error, which we can write mathematically as power = 1 – \(\beta\). The power of a hypothesis test is between 0 and 1; if the power is close to 1, the hypothesis test is very good at detecting a false null hypothesis. \(\beta\) is commonly set at 0.2, but may be set by the researchers to be smaller. Consequently, power may be as low as 0.8, but may be higher. There are four primary factors affecting power:
* Significance level (\(\alpha\))
* Sample size
* Variability, or variance, in the measured response variable
* Magnitude of the effect of the variable
Power is increased when the sample size increases, as well as when there is a larger effect size and a higher significance level (\(\alpha\)). Power decreases when variance (\(\sigma^2\)) increases.
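Base R’s power.t.test() lets you explore these factors; the effect size, standard deviation, and sample size below are purely illustrative.

# power of a two-sample t-test with 20 per group, a 1-unit effect, and sd = 2
power.t.test(n = 20, delta = 1, sd = 2, sig.level = 0.05)

# sample size per group needed to reach 80% power for the same effect
power.t.test(power = 0.8, delta = 1, sd = 2, sig.level = 0.05)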
The idea that sample results are more extreme than we would reasonably expect to see by random chance if the null hypothesis were true is the fundamental idea behind statistical hypothesis tests. If data at least as extreme would be very unlikely if the null hypothesis were true, we say the data are statistically significant. Statistically significant data provide convincing evidence against the null hypothesis in favor of the alternative, and allow us to generalize our sample results to the claim about the population.
However, from the discussion of the tradeoff between \(\alpha\) and \(\beta\) hopefully you can see that strictly relying on p-values can be troublesome. There are many papers written about this.
The goal is to show that statistical tests and data visualization, used in tandem, can tell us the stories that lie in our data.
We’re going to start by going back to our one-dimensional visualizations, i.e., histograms.
library(tidyverse)  # assumed loaded earlier in the lesson; provides read_csv() and ggplot()
library(here)       # builds file paths relative to the project root

Calb_R <- read_csv(here("data_in", "Calb_resistance.csv"))
## Rows: 501 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): strain, site, sex
## dbl (2): MIC (ug/mL), disk (mm)
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Calb_R <- Calb_R %>%
rename(MIC = `MIC (ug/mL)`, disk = `disk (mm)`)
Calb_R
## # A tibble: 501 × 5
## strain site sex MIC disk
## <chr> <chr> <chr> <dbl> <dbl>
## 1 s498 blood f 128 6
## 2 s499 blood f 128 6
## 3 s465 blood f 32 12
## 4 s480 blood f 64 12
## 5 s481 blood f 64 12
## 6 s466 blood f 32 13
## 7 s482 blood m 64 13
## 8 s483 blood m 64 13
## 9 s484 blood f 64 13
## 10 s486 blood f 64 13
## # ℹ 491 more rows
ggplot(data = Calb_R, mapping = aes(disk)) +
geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 6 rows containing non-finite values (`stat_bin()`).
By eye, does this variable look normally distributed? Why or why not?
We can statistically test for normality using the shapiro.test() function:
shapiro.test(Calb_R$disk)
##
## Shapiro-Wilk normality test
##
## data: Calb_R$disk
## W = 0.89958, p-value < 2.2e-16
Notice that we switched syntaxes above, to base R (i.e., we used df$variable). The majority, if not all, of the common statistical tests require this syntax, because they were developed prior to the tidyverse set of commands. Although there are some workarounds, we’re going to go back and forth a little bit between these two syntaxes as required.
Now we’re going to look at multiple variables.
A reminder of what our dataset is:
glimpse(Calb_R)
## Rows: 501
## Columns: 5
## $ strain <chr> "s498", "s499", "s465", "s480", "s481", "s466", "s482", "s483",…
## $ site <chr> "blood", "blood", "blood", "blood", "blood", "blood", "blood", …
## $ sex <chr> "f", "f", "f", "f", "f", "f", "m", "m", "f", "f", "f", "f", "f"…
## $ MIC <dbl> 128, 128, 32, 64, 64, 32, 64, 64, 64, 64, 64, 64, 32, 32, 32, 6…
## $ disk <dbl> 6, 6, 12, 12, 12, 13, 13, 13, 13, 13, 13, 13, 14, 14, 14, 14, 1…
First let’s ignore the different sites and test whether disk diffusion resistance differed between strains isolated from males and females. To do that we’re going to use a two-sample t-test. A t-test assumes normally distributed data. So first let’s subset the data and test for normality.
Calb_R_f <- filter(Calb_R, sex == "f")
Calb_R_m <- filter(Calb_R, sex == "m")
shapiro.test(Calb_R_f$disk)
##
## Shapiro-Wilk normality test
##
## data: Calb_R_f$disk
## W = 0.87021, p-value = 6.053e-14
shapiro.test(Calb_R_m$disk)
##
## Shapiro-Wilk normality test
##
## data: Calb_R_m$disk
## W = 0.91646, p-value = 2.678e-10
We already saw that normality was not met on the whole dataset, so this is not surprising. Instead, we will use the non-parametric Wilcoxon rank-sum test (also called the Mann-Whitney U test), which compares data ranks.
# specify x and y
wilcox.test(Calb_R_f$disk, Calb_R_m$disk)
##
## Wilcoxon rank sum test with continuity correction
##
## data: Calb_R_f$disk and Calb_R_m$disk
## W = 30727, p-value = 0.928
## alternative hypothesis: true location shift is not equal to 0
# use equation format
wilcox.test(disk ~ sex, data = Calb_R)
##
## Wilcoxon rank sum test with continuity correction
##
## data: disk by sex
## W = 30727, p-value = 0.928
## alternative hypothesis: true location shift is not equal to 0
ggplot(data = Calb_R, mapping = aes(disk, fill = site)) +
geom_histogram(binwidth = 2, na.rm = TRUE) +
theme_bw() +
labs(x = "disk diffusion zone of inhibition (mm)" , y = "Number of strains")
We can similarly ignore sex and test the effect of site. In this case we have more than two groups, so we’re going to use an ANOVA. In reality a t-test is the same thing as an ANOVA, just with two groups instead of more than two groups. The non-parametric equivalent of an ANOVA is the Kruskal-Wallis test, but the Kruskal-Wallis test assumes that the sampled populations have identical shape and dispersion. We can see from our figure that this is not met. In this case it is actually better to use an ANOVA. Although the ANOVA is parametric, it is considered robust to violations of the normality assumption; that is, non-normal data have only a small effect on the Type I error rate.
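(For reference, if the Kruskal-Wallis assumptions had been met, the call uses the same formula syntax; this is just a sketch, and we carry on with the ANOVA below.)

kruskal.test(disk ~ site, data = Calb_R)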
aov(disk ~ site, data = Calb_R)
## Call:
## aov(formula = disk ~ site, data = Calb_R)
##
## Terms:
## site Residuals
## Sum of Squares 1106.458 24891.534
## Deg. of Freedom 2 492
##
## Residual standard error: 7.112844
## Estimated effects may be unbalanced
## 6 observations deleted due to missingness
When you run the ANOVA we don’t actually get all the information we need out of just the aov model call. We need to wrap that in a second function to pull out additional information:
anova_test <- aov(disk ~ site, data = Calb_R)
summary(anova_test)
## Df Sum Sq Mean Sq F value Pr(>F)
## site 2 1106 553.2 10.94 2.26e-05 ***
## Residuals 492 24892 50.6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 6 observations deleted due to missingness
There are the stats. We’re going to use one more package, the broom package, which will clean up this output (install it first if you haven’t already).
library(broom)
tidy(anova_test)
## # A tibble: 2 × 6
## term df sumsq meansq statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 site 2 1106. 553. 10.9 0.0000226
## 2 Residuals 492 24892. 50.6 NA NA
Using the broom function tidy() we can easily access the parameter values:
anova_test_tidy <- tidy(anova_test)
anova_test_tidy
## # A tibble: 2 × 6
## term df sumsq meansq statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 site 2 1106. 553. 10.9 0.0000226
## 2 Residuals 492 24892. 50.6 NA NA
anova_test_tidy$p.value[1]
## [1] 2.256929e-05
If we want to know which groups are different from each other, we can use the post-hoc (or “after the event”) Tukey test:
TukeyHSD(anova_test)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = disk ~ site, data = Calb_R)
##
## $site
## diff lwr upr p adj
## oral-blood 3.965051 1.886492 6.0436086 0.0000271
## skin-blood 2.130479 0.458185 3.8027737 0.0080993
## skin-oral -1.834571 -3.923196 0.2540533 0.0982914
Conduct a t-test comparing skin and oral samples. First subset the data frame as needed, then check for normality. ?t.test might be helpful
In reality, we actually know that there are two different categorical variables here that could influence the disk diffusion resistance (site and sex), and we can include them both in one test, a two-way ANOVA.
full_anova_test <- aov(disk ~ site*sex, data = Calb_R)
tidy(full_anova_test)
## # A tibble: 4 × 6
## term df sumsq meansq statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 site 2 1106. 553. 10.9 0.0000228
## 2 sex 1 13.6 13.6 0.269 0.604
## 3 site:sex 2 116. 58.0 1.15 0.319
## 4 Residuals 489 24762. 50.6 NA NA
This hopefully tells us what we already intuited from the EDA: site but not sex influences disk resistance.
Conduct a statistical test to determine whether site or sex (or their interaction) has a significant effect on MIC.
ggplot(Calb_R, aes(MIC, disk)) +
scale_x_continuous(trans="log2", breaks = unique(Calb_R$MIC)) +
scale_y_reverse(limits = c(50, 0)) +
labs(y = "disk diffusion zone of inhibition (mm)" , x = expression(MIC[50])) +
theme_bw() +
geom_point(na.rm =TRUE) +
geom_jitter(alpha = 0.5, color = "tomato", width = 0.2)
## Warning: Removed 14 rows containing missing values (`geom_point()`).
And finally a statistical test to cap it all off, we’ll look for a correlation between these two resistance variables. We’ll again turn to our non-parametric statistics, and specify that we want Spearman’s rho test (in this case it’s the same function as the parametric test, we just specify the method we want to use).
cor_test <- cor.test(Calb_R$MIC, Calb_R$disk, method = "spearman")
## Warning in cor.test.default(Calb_R$MIC, Calb_R$disk, method = "spearman"):
## Cannot compute exact p-value with ties
cor_test
##
## Spearman's rank correlation rho
##
## data: Calb_R$MIC and Calb_R$disk
## S = 30335647, p-value < 2.2e-16
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.5661987
And we can again use the broom package to make the output more tidy and easier to access.
tidy(cor_test)
## # A tibble: 1 × 5
## estimate statistic p.value method alternative
## <dbl> <dbl> <dbl> <chr> <chr>
## 1 -0.566 30335647. 1.03e-42 Spearman's rank correlation rho two.sided
ggplot(Calb_R, aes(MIC, disk)) +
scale_x_continuous(trans="log2", breaks = unique(Calb_R$MIC)) +
scale_y_reverse(limits = c(50, 0)) +
labs(y = "disk diffusion zone of inhibition (mm)" , x = expression(MIC[50])) +
theme_bw() +
geom_point(na.rm =TRUE) +
geom_jitter(alpha = 0.5, color = "tomato", width = 0.2)+
geom_smooth(method = "lm", na.rm=TRUE)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 13 rows containing missing values (`geom_point()`).
This lesson was created by Aleeza Gerstein at the University of Manitoba based on material from: Wikipedia; Applied Biostats; Modern Dive; Diva Jain; Allen Downey, “Probably Overthinking It”; Introduction to Statistical Ideas and Methods online modules; Statistics Teacher: What is power?; Concepts and Applications of Inferential Statistics; Towards Data Science blog posts; and Brandvain Chapter 1: Introduction to Statistics.
Made available under the Creative Commons Attribution license.