Gentle Intro

  • Hypothesis
    • after Popper's logic of falsification
    • every hypothesis must be falsifiable, i.e. it must be possible to disprove it
  • null-Hypothesis
    • default value (status quo) for a parameter (until proven false)
    • like a defendant in court … innocent until proven guilty
    • denoted as $H_0$
  • alternative Hypothesis
    • deviation from current knowledge
    • must be proven to be valid
    • denoted as $H_1$ (also written $H_a$)

Example Machine

  • machine must produce parts with a mean diameter of 0.5 inch

Testing

  • possible outcome of any Test
    • reject the null hypothesis
      • finding a non-white swan (significant result)
    • fail to reject the null hypothesis
      • even though I fail to reject, I still do not accept the null-Hypothesis
        • there can still be a black swan out there
      • in practice, once the sample size is large enough, further testing may be considered “impractical”
        • no further research is done
  • rejecting the null hypothesis is the “positive” (significant) result

Errors

  • Errors
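    • standard deviation … describes the data: how spread out the data points are
    • standard error … describes the estimate: how far the sample mean is expected to lie from the population mean ($s / \sqrt{n}$), and therefore how much weight the conclusions can carry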
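
The difference between the two quantities can be made concrete with a small sketch. The diameters below are made-up values for illustration only, and `standard_error` is a hypothetical helper name, not something from the course material.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    return s / math.sqrt(len(sample))


# Hypothetical diameters in inches, invented for this illustration.
diameters = [0.49, 0.51, 0.50, 0.52, 0.48, 0.50, 0.51, 0.49]

sd = statistics.stdev(diameters)  # spread of the individual data points
se = standard_error(diameters)    # uncertainty of the sample mean

print(f"standard deviation: {sd:.4f}")
print(f"standard error:     {se:.4f}")
```

The standard error shrinks as the sample grows, while the standard deviation settles towards the spread of the population.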

Ingredients

  • confidence level
    • e.g. 99%
  • rejection region
    • defines when $H_0$ is rejected in favor of $H_1$
    • e.g. when the result of some experiment is greater than 5
    • rejection region is always outside of confidence interval
  • test statistic
    • depends on the problem at hand
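
As a rough sketch of how a confidence level turns into a rejection region, the snippet below assumes a two-sided test with a standard-normal test statistic; both function names are hypothetical. It relies only on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist


def two_sided_critical_value(confidence_level):
    """End point of the confidence interval on the z-scale (assumes a normal test statistic)."""
    alpha = 1.0 - confidence_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def in_rejection_region(z, confidence_level):
    """True if the test statistic z falls outside the confidence interval."""
    return abs(z) > two_sided_critical_value(confidence_level)


for level in (0.90, 0.95, 0.99):
    print(f"confidence {level:.0%}: reject when |z| > {two_sided_critical_value(level):.3f}")
```

A higher confidence level widens the interval and shrinks the rejection region, so the same result may be significant at 95% but not at 99%.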

Interpretation

  • when result is inside confidence interval
    • i.e. outside the rejection region
    • we know that we cannot reject $H_0$, but we still do not accept it
    • at the current confidence level

Tests

Population Mean

One-Tailed

  • upper ($H_1: \theta > \theta_0$) or lower ($H_1: \theta < \theta_0$)
  • $\theta$ (measured) and $\theta_0$ (expected) are placeholders for the corresponding values being compared
  • not covered in this course, but easy to grasp; the formulas only need minor adjustment

Two-Tailed

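  • two-tailed: $H_0: \mu = \mu_0$ versus $H_1: \mu \neq \mu_0$
  • then we collect sample data and get $\bar{x}$ and $\sigma$
    • or sample standard deviation $s$ if $\sigma$ is not known
  • therefore for large samples: $z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
    • for small samples: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
  • choose significance level $\alpha$
    • reminder: $\alpha$ = chance of Type I error
    • region within confidence interval: do not reject $H_0$
    • region outside confidence interval: reject $H_0$
    • confidence interval can be constructed without data!
      • only distribution type, sample size and $\alpha$ needed
  • critical value (end points of rejection region)
    • if $n$ is large (> 30): CLT applies, use $\pm z_{\alpha/2}$
    • if $n$ is small and population is normally distributed: use $\pm t_{\alpha/2,\,n-1}$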
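
The large-sample procedure can be sketched for the machine example (target mean diameter 0.5 inch). The sample mean, standard deviation and sample size below are invented for illustration, and `two_tailed_z_test` is a hypothetical helper. The small-sample t-test is left out because the Python standard library has no t-distribution.

```python
import math
from statistics import NormalDist


def two_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Return (z, critical value, reject?) for H0: mu = mu0 vs H1: mu != mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))  # test statistic
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # end point of the rejection region
    return z, z_crit, abs(z) > z_crit


# Hypothetical sample: 100 parts, mean 0.504 inch, standard deviation 0.02 inch.
z, z_crit, reject = two_tailed_z_test(0.504, 0.5, 0.02, 100, alpha=0.05)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

With these made-up numbers the test statistic lands just outside the interval at $\alpha = 0.05$ but inside it at $\alpha = 0.01$, which shows how the decision depends on the chosen significance level.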

Population Proportion

p-Values

  • the probability of obtaining a sample “more extreme” than the one observed in the data set, assuming that $H_0$ is true
  • basically reversing the calculation
    • finding $\alpha$ for the given $z$ (two-sided CI)
  • leaving it up to the reader to interpret the result
  • p-value = $2 \cdot P(Z \geq |z|)$
    • for $\bar{x} < \mu_0$, $z$ will be negative
    • for $\bar{x} > \mu_0$, $z$ will be positive
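
A minimal sketch of the reversed calculation, assuming a standard-normal test statistic and taking the two-sided p-value as twice the upper-tail probability beyond $|z|$; `two_sided_p_value` is a hypothetical helper name.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """Probability of a result at least as extreme as z in either tail, under H0."""
    upper_tail = 1.0 - NormalDist().cdf(abs(z))  # abs() handles negative and positive z alike
    return 2.0 * upper_tail


# The sign of z does not matter for the two-sided p-value.
for z in (-2.0, 2.0):
    print(f"z = {z:+.1f} -> p-value = {two_sided_p_value(z):.4f}")
```

The reader can then compare the reported p-value with whatever significance level they consider appropriate, instead of the author fixing it in advance.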