An Introduction to Data Science
How to use this website
This website
- is intended for middle and high school students learning to apply statistics for the first time
- covers the basic statistical concepts and tests taught in an introductory statistics course
- was developed to accompany the learning materials of Wisconsin Fast Plants, but the knowledge applies to any introductory statistics course
- has code you can follow in RStudio to generate figures and graphs similar to those on the website. Find the code here.
Each lesson
- gives a basic overview of the concepts and aims to deliver the theoretical takeaways without getting bogged down in the math
- has resources for students to learn more about the math behind each concept, but the main purpose of the website is to teach how to think about statistics.
Overall learning objectives
This website serves as an introduction to Data Science for middle and high school students and their teachers.
Among the main topics, this website focuses on:
- Why Statistics is useful in the real world
- Statistics to understand genetic diversity
- Means, variances, correlations, and other summary measures of data
- Probability distributions
- Hypothesis tests (t-tests, ANOVA, chi-square tests)
Topics and specific learning objectives
- Introduction: Why data science?
- By the end of this lesson, students will be able to:
- Identify why statistics is helpful in the real world for assessing differences and patterns
- Describe the challenges of determining differences and patterns
- Discuss a comparison model of genetic diversity (dogs) for reference when thinking through the following lessons
- Basics of DNA:
- By the end of this lesson, students will be able to:
- Restate what genes and DNA are with analogies to cooking
- Define genetic diversity, genetic variation, and gene expression
- Review core concepts in plant and animal genetic diversity
- Finding average ground: Computing means and medians
- By the end of this lesson, students will be able to:
- Calculate the mean and median
- Illustrate why the mean and median are useful when thinking critically about data
- Describe and recognize how the median and mean can be skewed
- Variance and distributions:
- By the end of this lesson, students will be able to:
- Restate why the mean is helpful but needs the variance alongside it to think critically about a data distribution and its skew
- Calculate the standard deviation of a sample and describe what it measures
- Explain what distributions are and why the normal distribution is used frequently
- Probability and z-scores:
- By the end of this lesson, students will be able to:
- Discuss the basics of events happening in terms of probability
- Describe how the z-score of an individual data point indicates how common that data point is
- Interpret how z-scores relate to distributions and how we obtain p-values from distributions and tables
- Hypothesis testing:
- By the end of this lesson, students will be able to:
- Describe and recognize a scientific hypothesis
- Describe and form null and alternative hypotheses
- Discuss why, because it is impossible to know everything about all variables in space and time, we test a hypothesis by trying to reject the null hypothesis rather than by proving the alternative
- Comparing 2 groups:
- By the end of this lesson, students will be able to:
- Demonstrate the value of testing two groups statistically
- Explain when and how to apply t-tests
- Demonstrate how to interpret a t-test, and critique an interpretation of one, including:
- Test results
- t statistic
- How to reject the null hypothesis and its meaning
- Comparing 2+ groups:
- By the end of this lesson, students will be able to:
- Discuss the value of testing multiple groups statistically
- Explain when and how to apply ANOVA
- Demonstrate how to interpret an ANOVA test, and critique an interpretation of one, including:
- Test results
- F statistic
- How to reject the null hypothesis, and why rejecting it indicates an overall difference among the groups without showing which groups differ
- Define a post-hoc analysis and explain how a post-hoc analysis yields additional results about multiple group comparisons
- Comparing frequencies:
- By the end of this lesson, students will be able to:
- Discuss the value of testing frequencies statistically for both goodness of fit and tests of independence
- Explain when and how to apply chi-square tests
- Demonstrate how to interpret a chi-square test, and critique an interpretation of one, including:
- Test results
- Chi-square statistic
- How to reject the null hypothesis, and why rejecting it indicates an overall difference without showing which groups differ
- Correlations:
- By the end of this lesson, students will be able to:
- Discuss the value of testing two continuous variables statistically
- Explain the difference between correlation and causation
- Demonstrate how to interpret a Pearson correlation and a multiple regression, and critique an interpretation of each, including:
- How to interpret an r-value
- How to interpret the results of a multiple regression
- How to reject the null hypothesis, and why rejecting it for a correlational test still does not establish causation
- Statistics in the real world:
- By the end of this lesson, students will be able to:
- Discuss sample size and power
- Identify when to use non-parametric tests and explain their value
- Describe effect size and why it is important in statistics
- Underline the importance of data cleaning and how lengthy the process can be
- Illustrate real world applications of statistics that have been used to improve the world
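Several of the objectives above (calculating the mean and median, the sample standard deviation, and a z-score) come down to a few lines of arithmetic. The sketch below is an illustration only: the plant heights are made-up numbers, the variable names are hypothetical, and it uses Python's standard `statistics` module rather than the RStudio code that accompanies the website.

```python
import statistics

# Hypothetical plant heights in centimeters (made-up numbers for illustration).
heights = [8, 10, 12, 14, 16]

mean_height = statistics.mean(heights)      # sum of values divided by how many there are
median_height = statistics.median(heights)  # middle value once the data are sorted
sd_height = statistics.stdev(heights)       # sample standard deviation (divides by n - 1)

# z-score: how many standard deviations a data point lies from the mean.
z_tallest = (max(heights) - mean_height) / sd_height

# One unusually tall plant pulls the mean upward much more than the median.
with_outlier = heights + [60]
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print("mean:", mean_height, "median:", median_height)
print("sample standard deviation:", round(sd_height, 2))
print("z-score of tallest plant:", round(z_tallest, 2))
print("with outlier -> mean:", mean_outlier, "median:", median_outlier)
```

In this made-up sample, adding one 60 cm plant moves the mean from 12 to 20 while the median only moves from 12 to 13, which is the kind of skew the lesson on means and medians asks students to recognize.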
If you want more statistical information in a free textbook, check out the OpenStax digital library and its free textbook Introductory Business Statistics.
AI-generated artwork within the website
Images on this website are a mix of original and AI-generated images. Images that have a row of multicolored squares are from DALL·E 2. There are many more, and newer, generative AI platforms out there!
