Generalized linear regression (#896)
* notes from glm workshop

* Flesh out explanation of continuous and unbounded, and add box on probs, odds, and logits.

* Update metadata, add quick review of linear models

* Flesh out linear models review

* Start example with code

* Finish example

* Add R code used in module and missing image file

* Put in metadata, improve quiz questions

* Add missing metadata

* Correct version number after super handy autochecker caught my mistake!

* Update data example to sepsis

* Replace example data

* Reordering sections for improved flow (hopefully!)

* Reorder generalized linear regression (#914)

* fix mathjax

* general structure/reordering

* reordering/naming sections

* Rearrange example content and add quiz on outcome variables

* Add brief explanation of terms in linear model

* Add more explanation before linear model sepsis example

* Rework "probability and odds" section as "talking about chance", reordering some headings

* improve hyperlink text

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Improve hyperlink text

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Tweaking punctuation for clarity

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Tweaking punctuation for clarity

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Tweaking wording for clarity

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Tweaking wording for clarity

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Tweaking wording for clarity

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Tweaking punctuation for clarity

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Clarifying

* Improve quiz answer

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Tweaking punctuation for clarity

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Tweaking wording for clarity

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* Tweaking punctuation for clarity

Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>

* clarify convergence

Co-authored-by: Rose M. Hartman <rosemm@users.noreply.github.com>

* update answer explanation for clarity

Co-authored-by: Rose M. Hartman <rosemm@users.noreply.github.com>

---------

Co-authored-by: drelliche <99294374+drelliche@users.noreply.github.com>
Co-authored-by: franzenr <87659159+franzenr@users.noreply.github.com>
3 people authored Apr 12, 2024
1 parent e9c62f1 commit 91a786f
Showing 4 changed files with 573 additions and 0 deletions.
39 changes: 39 additions & 0 deletions generalized_linear_regression/example_data.R
@@ -0,0 +1,39 @@
library(tidyverse)
# show the relationship between probability, odds, and logits
logit_demo <- data.frame(Logit = c(-Inf, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, Inf)) |>
  mutate(Odds = exp(Logit),
         Probability = ifelse(Odds < Inf, Odds / (1 + Odds), 1))
knitr::kable(logit_demo, digits = 3)

# set the random seed so the randomly generated data replicate exactly
set.seed(24601)

# sample size
n <- 100

# random sampling of 0 and 1 for sepsis
# sample from a normal distribution for heart_rate and temp, but use a higher mean if sepsis == 1
data <- data.frame(sepsis = sample(x = c(0,0,0,1),
                                   size = n,
                                   replace = TRUE)) |>
  mutate(heart_rate = ifelse(sepsis == 1,
                             rnorm(n, mean = 100, sd = 10),
                             rnorm(n, mean = 95, sd = 10)),
         temp = ifelse(sepsis == 1,
                       rnorm(n, mean = 102, sd = 1),
                       rnorm(n, mean = 101, sd = 1)))


base_plot <- ggplot(data, aes(y=sepsis, x=temp)) +
  geom_point() +
  theme_bw() +
  labs(y = "Sepsis", x = "Temperature")

# try plotting the data with just a linear model
base_plot +
  stat_smooth(method = "lm")
ggsave("linear_prediction.png", width = 5, height = 5, units = "in")

base_plot +
  stat_smooth(method = "glm", method.args = list(family = "binomial"))
ggsave("logit_prediction.png", width = 5, height = 5, units = "in")
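The script above only draws the logistic curve through `stat_smooth()`. As a rough companion sketch (not part of this commit), the same kind of model can be fit directly with `glm()` to see how logits, odds, and probabilities relate in the fitted output. The names `sim`, `fit`, and `new_temps` are hypothetical, and the data are re-simulated with base R only, so the random draws will not match those in `example_data.R`.

```r
# Hypothetical companion sketch; names and simulated values are assumptions,
# not taken from the commit.
set.seed(24601)
n <- 100

# Simulate sepsis status (about 25% positive) and temperature,
# with a higher mean temperature when sepsis == 1
sepsis <- sample(c(0, 0, 0, 1), size = n, replace = TRUE)
temp <- rnorm(n, mean = ifelse(sepsis == 1, 102, 101), sd = 1)
sim <- data.frame(sepsis = sepsis, temp = temp)

# Fit a logistic regression: binomial family with the default logit link
fit <- glm(sepsis ~ temp, family = binomial, data = sim)

# Coefficients are on the logit (log-odds) scale;
# exponentiating the slope gives an odds ratio per one-unit change in temp
odds_ratios <- exp(coef(fit))

# Predictions for a few temperatures, on both scales
new_temps <- data.frame(temp = c(100, 101, 102, 103))
pred_logit <- predict(fit, newdata = new_temps, type = "link")
pred_prob <- predict(fit, newdata = new_temps, type = "response")

# plogis() is the inverse logit, exp(x) / (1 + exp(x)),
# so it maps the logit-scale predictions onto the probability scale
same_scale <- all.equal(plogis(pred_logit), pred_prob)

# Round trip by hand: probability -> odds -> logit agrees with qlogis()
p <- c(0.1, 0.5, 0.9)
odds <- p / (1 - p)
round_trip <- all.equal(log(odds), qlogis(p))
```

Printing `odds_ratios`, `pred_logit`, and `pred_prob` side by side shows the same predictions expressed as odds, logits, and probabilities; only the probability scale is bounded between 0 and 1.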