Setup

  • Linear Regression
  • dependent variable … what I am trying to predict
    • continuous, quantitative data
  • independent variable … variables I am changing in the experiment
    • continuous or categorical
    • categorical … Dummy Variable
      • number of categories - 1 = number of dummy variables
        • first category is reference category
  • R² value check
    • how much spread/variation is in the dependent variable
      • what percentage of the DV's variation can be explained by my IV values
    • … higher = better … capturing more variation of dataset
      • global test: check with all IVs
        • can also indicate if the model is good/bad
      • local test: check with single/some IVs but not all of them
        • can help isolate correlated IVs
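  • R² vs adjusted R²
    • R² does not take the number of variables into account … never gets worse when more variables are added
      • best used for global test
    • adjusted R² takes the number of variables into account … only gets better if an added variable improves the fit enough
      • best used for deciding which local test is best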
  • p-value < α … α is normally 0.05 … 5%
  • simple regression (only 1 IV) vs multiple regression (multiple IVs)
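
The points above on dummy variables and on R² vs adjusted R² can be sketched in R. This is a minimal illustration on the built-in mtcars data; the choice of columns (mpg as DV, wt, hp and cyl as IVs) is an assumption made for the example, not part of the notes.

```r
# Sketch on R's built-in mtcars data; the column choices are assumptions.
cars <- mtcars
cars$cyl <- factor(cars$cyl)  # treat cylinder count (4, 6, 8) as categorical

# 3 categories -> 2 dummy variables; the first level (4) is the reference.
dummies <- model.matrix(~ cyl, data = cars)
print(colnames(dummies))  # "(Intercept)" "cyl6" "cyl8"

# Simple regression (1 IV) vs multiple regression (several IVs).
simple   <- lm(mpg ~ wt, data = cars)
multiple <- lm(mpg ~ wt + hp + cyl, data = cars)

# R-squared cannot decrease when IVs are added to a model;
# adjusted R-squared is penalized for each extra IV.
r2 <- c(simple   = summary(simple)$r.squared,
        multiple = summary(multiple)$r.squared)
adj_r2 <- c(simple   = summary(simple)$adj.r.squared,
            multiple = summary(multiple)$adj.r.squared)
print(r2)
print(adj_r2)
```

Because the simple model is nested in the multiple one, its R² can never be the larger of the two, which is why adjusted R² is the fairer number when comparing models of different size.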

Assumption

  • linear relationship between the DV and the IVs
    • linear for linear regression
    • exponential for exponential regression, etc
  • normal distribution of errors
  • homoscedasticity … constant variance of the errors
  • no multicollinearity
    • IVs are independent of one another
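
One possible way to check these assumptions in base R is sketched below. The model (mpg ~ wt + hp + disp on mtcars) is a hypothetical example, and the variance inflation factor is computed by hand as 1 / (1 - R²) of each IV regressed on the other IVs, so no extra package is assumed.

```r
# Hypothetical model on the built-in mtcars data.
ivs <- c("wt", "hp", "disp")
fit <- lm(reformulate(ivs, response = "mpg"), data = mtcars)
res <- residuals(fit)

# Linearity and homoscedasticity: residuals should scatter evenly around 0.
# Normality: points on the Q-Q plot should follow the line.
# Plots are only drawn in an interactive session.
if (interactive()) {
  plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)
  qqnorm(res)
  qqline(res)
}

# Normality of errors: Shapiro-Wilk test (small p-value = evidence against normality).
sw <- shapiro.test(res)
print(sw$p.value)

# Multicollinearity: variance inflation factor of each IV, computed by
# regressing that IV on the remaining IVs.
vif <- sapply(ivs, function(v) {
  aux <- lm(reformulate(setdiff(ivs, v), response = v), data = mtcars)
  1 / (1 - summary(aux)$r.squared)
})
print(vif)
```

A VIF of 1 means an IV is uncorrelated with the others; larger values mean more overlap. Any cutoff for "too large" is a rule of thumb rather than a fixed threshold.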

Interpret Simple Regression Analysis

todo get a sample regression summary
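
Until a real summary is added here, a simple regression summary can be produced as in the following sketch. The model (mpg ~ wt on the built-in mtcars data) is an assumption chosen only for illustration.

```r
# Hypothetical simple regression: one IV (wt) predicting the DV (mpg).
fit_simple <- lm(mpg ~ wt, data = mtcars)
s_simple <- summary(fit_simple)
print(s_simple)

# The coefficient table has one row per term and the columns
# "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
coefs <- coef(s_simple)
slope   <- coefs["wt", "Estimate"]   # change in mpg per unit of wt
p_value <- coefs["wt", "Pr(>|t|)"]   # local test for this IV
r2      <- s_simple$r.squared        # share of DV variation explained

print(c(slope = slope, p_value = p_value, r2 = r2))
```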

Interpret Multiple Regression Analysis

todo get a multiple regression summary
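
Until a real summary is added here, the sketch below shows how a multiple regression summary can be produced and where the global and local tests sit in it. The model (mpg ~ wt + hp on mtcars) is an assumption made for the example.

```r
# Hypothetical multiple regression: two IVs predicting mpg.
fit_multi <- lm(mpg ~ wt + hp, data = mtcars)
s_multi <- summary(fit_multi)
print(s_multi)

# Global test: F-statistic over all IVs at once.
# s_multi$fstatistic is a named vector: value, numdf, dendf.
f <- s_multi$fstatistic
global_p <- unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))

# Local tests: one t-test p-value per coefficient.
local_p <- coef(s_multi)[, "Pr(>|t|)"]

print(global_p)
print(local_p)
print(s_multi$adj.r.squared)
```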

Z-Transformation

  • doing a z-transformation rescales every variable to a mean of 0 and a standard deviation of 1
  • puts all coefficients on the same scale, so we can compare the coefficients of the regression analysis across IVs instead of relying on just the p-value
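
A minimal sketch of the z-transformation with base R's scale(), again on mtcars with columns chosen only for illustration:

```r
# z-transform the DV and the IVs: (x - mean(x)) / sd(x) for each column.
z <- as.data.frame(scale(mtcars[, c("mpg", "wt", "hp")]))

print(colMeans(z))     # all approximately 0
print(sapply(z, sd))   # all 1

# Regression on the standardized data: coefficients are in units of
# standard deviations, so their magnitudes can be compared across IVs.
fit_z <- lm(mpg ~ wt + hp, data = z)
print(coef(fit_z))
```

With every variable standardized the intercept is 0 up to rounding, and the IV with the largest absolute coefficient has the strongest association with the DV per standard deviation of change.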

Adding another IV / Predictor

  • some coefficients (the estimated relationships between an IV and the DV) will change
  • some coefficients might flip signs
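
The sign flip can be reproduced with simulated data. Everything in this sketch (the variable names x1, x2, y, the seed, and the true effects of -1 and +2) is invented for the illustration: x1 and x2 are strongly correlated, and x1 has a negative effect on y once x2 is held constant.

```r
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.5)            # x2 is strongly correlated with x1
y  <- -1 * x1 + 2 * x2 + rnorm(n, sd = 0.5)

# With x1 alone, x1 also picks up the effect of the omitted x2.
one_iv <- lm(y ~ x1)
# With both IVs, the coefficient of x1 is its effect with x2 held constant.
two_iv <- lm(y ~ x1 + x2)

b_alone  <- coef(one_iv)[["x1"]]
b_joint  <- coef(two_iv)[["x1"]]
print(c(alone = b_alone, with_x2 = b_joint))
```

The coefficient of x1 is positive in the first model and negative in the second, which is the kind of change to expect when a newly added predictor is correlated with one already in the model.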

Model Assumptions

Predict using Regression Analysis

  • predict(mreg4, newdata=new)
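
The notes do not show how mreg4 and new are built, so the following sketch fills them with hypothetical stand-ins: a model fitted on mtcars and a data frame whose column names match the model's IVs.

```r
# Hypothetical stand-ins for the objects named in the call above.
mreg4 <- lm(mpg ~ wt + hp, data = mtcars)

# newdata must contain one column per IV, named as in the model formula.
new <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))

# Point predictions for the new observations.
pred <- predict(mreg4, newdata = new)
print(pred)

# Optionally, add a 95% prediction interval (columns: fit, lwr, upr).
pred_int <- predict(mreg4, newdata = new, interval = "prediction")
print(pred_int)
```

Predictions are only trustworthy for IV values inside the range the model was fitted on.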