Thursday, April 9, 2020

Visualizations, Linear Regression and Correlation

Comic of the Day
Tip: Beware the temptation of the procrastination side


Announcements and Deadlines
Labs 6, 7, 8 and 10 due Thu Apr 16
Excel Final Exam on Thu Apr 16
Chapter 6-8 and Visualizations/Chapter 12 assignments (Moodle) due Thu Apr 23
Theory Exam (Chapter 6-8) is combined with Chapter 12 exam and will be on Thu Apr 23



This Week

1. Visualizations

To supplement the lecture notes, read the excerpts from Tufte's "Visual Display of Quantitative Information", the article "Charting Safety Performance" from the American Society of Safety Professionals, and "Leading Indicators for Workplace Health and Safety" from Work Safe Alberta.




2. Linear Regression

Tip: Some relationships are linear, some can't be explained...

Guess the regression line!
https://www.geogebra.org/m/JsFmFEg6
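
After guessing a line by eye, you can compare it with the least-squares line. The sketch below computes the slope and intercept from first principles; the five data points are made up purely for illustration. In Excel, the SLOPE and INTERCEPT functions should give the same values.

```python
# Least-squares regression line y = intercept + slope * x.
# The data points are hypothetical, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of cross-products and squares about the means
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(f"y = {intercept:.2f} + {slope:.2f}x")
```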

3. Correlation

Spurious Correlations
Have too much time on your hands?  Try this!
http://guessthecorrelation.com/
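
To check a guess numerically, here is a minimal sketch of the Pearson correlation coefficient r. The two series are invented: both simply rise over time, which is enough to produce a high r even though neither causes the other (the idea behind spurious correlations). Excel's CORREL function performs the same calculation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical series that both trend upward over ten years
years = range(10)
series_a = [10 + 3 * t for t in years]
series_b = [5 + 2 * t + (-1) ** t for t in years]

r = pearson_r(series_a, series_b)
print(f"r = {r:.3f}")
```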
Notes on Moodle

For More Information

Next Week - Exams
Look at Exam Prep Sheet and other study resources

Saturday, March 21, 2020

Sampling Distributions, Central Limit Theorem and Confidence Intervals

Comic of the Day
Tip: The things themselves are all right, so who cares?


Announcements and Deadlines
Labs 6, 7, 8 and 10 due Thu Apr 16
Excel Final Exam on Thu Apr 16
Chapter 6-8 assignment (Moodle) due Thu Apr 23
Theory Exam (Chapter 6-8) is combined with Chapter 12 exam and will be on Thu Apr 23


This Week

1. Sampling Distributions and Central Limit Theorem

Central Limit Theorem Simulator

Set n=25 and run the simulator several hundred thousand times. 
For any population distribution (normal, uniform, skewed, custom), the distribution of the sample mean is approximately normal, with the following properties:


  • the mean of the sample means approaches the population mean
  • the standard deviation of the sample means is less than the population standard deviation (it follows the formula σ_x̄ = σ/√n)
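
If you would rather not click the simulator thousands of times, the same experiment can be sketched in a few lines. This assumes a skewed (exponential) population with mean 1 and standard deviation 1, so with n=25 the sample means should centre near 1 with a standard deviation near 1/√25 = 0.2. The seed and the number of samples are arbitrary choices.

```python
import random
import statistics

random.seed(1)          # arbitrary seed so the run is repeatable
n = 25                  # sample size, as in the simulator exercise
num_samples = 20000     # number of samples drawn (an arbitrary choice)

# Skewed population: exponential with mean 1 and standard deviation 1
sample_means = []
for _ in range(num_samples):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f}")
```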

2. Lab 8 - Sampling Distributions
    Lab 8 shows how a sample mean, x̄ (n=16 or 100) is usually close to a population mean, µ (N=10,000), and gets closer as n increases.  You will use a VLOOKUP function to randomly sample from your payroll data.
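
The same idea as a rough sketch outside Excel: build a population of N=10,000 values, draw random samples of n=16 and n=100, and compare how far x̄ typically lands from µ. The payroll figures here are invented (normally distributed around 60,000), and `random.choice` stands in for the random VLOOKUP.

```python
import random
import statistics

random.seed(8)  # arbitrary seed so the run is repeatable

# Hypothetical payroll population, N = 10,000
population = [random.gauss(60000, 15000) for _ in range(10_000)]
mu = statistics.fmean(population)

def typical_miss(n, repeats=2000):
    """Average distance between a sample mean (size n) and mu."""
    misses = []
    for _ in range(repeats):
        sample = [random.choice(population) for _ in range(n)]
        misses.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(misses)

miss_16 = typical_miss(16)
miss_100 = typical_miss(100)

print(f"typical miss with n=16:  {miss_16:.0f}")
print(f"typical miss with n=100: {miss_100:.0f}")
```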

    3. Confidence Intervals
    "A Mainstreet Technologies poll ... suggests Albertans are highly resistant to any tax hikes. Only 15 per cent of survey respondents say they’d favour increased taxes. Spending cuts are the preferred solution of 43 per cent."
    "...automated phone survey of 3,184 Albertans."
    "The Mainstreet Technologies survey is considered accurate within plus or minus 1.7 percentage points, 19 times out of 20."

    Confidence Interval for Population Proportion, p
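
As a check on the poll quoted above, the sketch below uses the usual large-sample margin of error for a proportion, z·√(p̂(1−p̂)/n). It assumes the pollster reported the worst case (p̂ = 0.5) at 95% confidence ("19 times out of 20"); that assumption reproduces the quoted ±1.7 points. The function name is just for illustration.

```python
import math
from statistics import NormalDist

n = 3184                 # respondents in the poll
confidence = 0.95        # "19 times out of 20"
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96

def margin_of_error(p_hat, n, z):
    """Half-width of the large-sample interval for a proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Worst case p_hat = 0.5 (assumed to be what the pollster reported)
worst_case = margin_of_error(0.5, n, z)

# Interval for the 15% who favour increased taxes
tax_margin = margin_of_error(0.15, n, z)
tax_interval = (0.15 - tax_margin, 0.15 + tax_margin)

print(f"worst-case margin: {worst_case * 100:.1f} points")
print(f"15% interval: {tax_interval[0]:.3f} to {tax_interval[1]:.3f}")
```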

    Confidence Interval for Population Mean, µ (σ known)

    Confidence Interval for Population Mean (σ unknown)
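
For reference, the two intervals for the mean in their standard textbook form (the notation in the course notes may differ slightly). When σ is known the critical value comes from the z table; when σ is unknown, the sample standard deviation s replaces σ and the critical value comes from the t distribution with n−1 degrees of freedom.

```latex
% sigma known
\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}

% sigma unknown
\bar{x} \pm t_{\alpha/2,\;n-1}\,\frac{s}{\sqrt{n}}
```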

    For More Information
    Confidence Intervals (videos go beyond the content of this course)
    Confidence Intervals (small sample size)

    Next Week - Linear Equations and Graphs, Lab 10 - Linear Regression
    Pre-read notes on Moodle

    Thursday, March 19, 2020

    Lab 7 - Normal Distributions

    Comic of the Day
    Tip: We might all be a little out of practice, so go easy on us?

    Announcements and Deadlines

    Production Assignment 2 assessments and peer/self evaluation due Sun Mar 22
    Chapter 6-8 assignment (Moodle) due Mon Mar 30
    Theory Exam (Chapter 6-8) postponed until further notice

    This Week - Lab 7 - Normal Distribution
    Lab instruction video on Moodle
    Mobius questions


    Next Week - Lab 8 - Sampling Distributions
    Lab instruction video and Mobius questions are already posted

    Thursday, March 12, 2020

    Lab 6 - Probability

    Comic of the Day
    Tip: Watch your syntax when using the =FRIENDIF() function

    Applications of Visualizations
    Dashboard



    Understand statistics, don't BE a statistic


    Prevention
    To help protect you and your family against all respiratory illnesses, including flu and COVID-19, you should:

    • Wash your hands often and well
    • Avoid touching your face, nose, or mouth with unwashed hands.
    • Avoid close contact with people who are sick
    • Clean and disinfect surfaces that are frequently touched
    • Stay at home and away from others if you are feeling ill
    • Contact your primary health provider or Health Link 811 if you have questions or concerns about your health
    • When sick, cover your cough and sneezes and then wash your hands
    Announcements and Deadlines
    Production Assignment 2 assessments and peer/self evaluation due Sun Mar 15
    Chapter 6-8 assignment (Moodle) due Mon Mar 30
    Theory Exam (Chapter 6-8) in class Mon Mar 30

    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - Lab 6 - Probability
    See videos on Moodle
    Activity 1 is related to injury rates in your individual data sets
    In Activity 2 you will create a binomial probability distribution related to injury rates, and then also create a model for workplace injuries using random number generation in Excel.  Use Excel's built-in binomial distribution function, which has the following syntax:
    BINOM.DIST(x,n,p,false)
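
To see what the function is doing, here is a small sketch of the same calculation: the probability of exactly x successes in n trials, which is what the non-cumulative (FALSE) form returns. The values n=10 and p=0.2 are hypothetical, not taken from the lab data.

```python
import math

def binom_pmf(x, n, p):
    """P(exactly x successes in n trials), like BINOM.DIST(x, n, p, FALSE)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical example: 10 workers, each with a 20% chance of injury
n, p = 10, 0.2
distribution = [binom_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(distribution):
    print(f"P(X = {x:2d}) = {prob:.4f}")
```

The probabilities over x = 0 to n should add up to 1, which is a handy check on your Excel table as well.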


    Next Lecture - 7.1-7.3 Sampling Distributions and Central Limit Theorem
    Pre-read 7.1-7.3 in the textbook
    Watch this 10:41 primer video on the Central Limit Theorem
    https://www.khanacademy.org/math/ap-statistics/sampling-distribution-ap/sampling-distribution-mean/v/sampling-distribution-of-the-sample-mean

    Wednesday, March 11, 2020

    6.2 Applications of Normal Distributions

    Comic of the Day
    Tip: Always make the diagram match one on our z table

    Unit 2 Exam (out of 25 marks)
    µ=64.3%
    σ=20.5%
    median=66%
    60% within µ±1σ


    Unit 1 Exam (was changed to out of 25 marks instead of 30)
    µ=69.7% (was 58.1% when marked out of 30)
    σ=22.1% (was 18.4% when marked out of 30)
    median=72% (was 60% when marked out of 30)
    70.0% within µ±1σ

    For students who scored 100% on Mobius chapter 1-4 assignments:
    µ=75.8% on Unit 1&2 Exams

    For students who scored <100% on Mobius chapter 1-4 assignments:
    µ=51.8% on Unit 1&2 Exams

    Announcements and Deadlines
    Production Assignment 2 assessments and peer/self evaluation due Sun Mar 15
    Chapter 6-8 assignment (Moodle) due Mon Mar 30
    Theory Exam (Chapter 6-8) in class Mon Mar 30

    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - 6.2 Applications of Normal Distributions
    Notes on Moodle
    z-score calculator
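
The calculator's two steps (convert to z, then look up the area) can be sketched as follows. It borrows the Unit 2 exam figures above (µ=64.3, σ=20.5) and a hypothetical score of 85, and it assumes the marks are roughly normal, which the observed 60% within µ±1σ suggests is only approximately true.

```python
from statistics import NormalDist

mu, sigma = 64.3, 20.5   # Unit 2 exam mean and standard deviation
score = 85.0             # hypothetical exam score

# Step 1: convert the score to a z-score
z = (score - mu) / sigma

# Step 2: area to the left of z under the standard normal curve
p_below = NormalDist().cdf(z)

# For comparison: share of a normal distribution within one sd of the mean
within_one_sd = NormalDist().cdf(1) - NormalDist().cdf(-1)

print(f"z = {z:.2f}, area to the left = {p_below:.4f}")
print(f"normal share within 1 sd = {within_one_sd:.4f}")
```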



    Next Lecture - Lab 6 - Probability
    Watch the video tutorial and try it on your own

    Monday, March 9, 2020

    6.1 Standard Normal Distribution

    Comic of the Day
    Tip: It's not that scary if you practice

    Explanation of Grading for Production Assignment 1
    Submission Grade (ave 29.9/40) - instructor graded

    Individual Contribution (ave 8.9/10) - based on the Peer and Self Evaluation ratings, comments, and my observations

    Assessment Grade (ave 8.0*/10) - based on how closely your grading of the group matched mine, and also on the quality of your comments (specific and insightful)

    * students who did not assess other groups were assigned a grade of zero and were not included in this average

    Announcements and Deadlines
    Production Assignment 2 assessments and peer/self evaluation due Sun Mar 15
    Chapter 6-8 assignment (Moodle) due Mon Mar 30
    Theory Exam (Chapter 6-8) in class Mon Mar 30

    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - 6.1 Standard Normal Distribution
    Notes on Moodle
    z-score calculator

    For More Information
    Normal distribution


    Next Lecture - 6.2 Applications of Normal Distributions
    Pre-read notes on Moodle

    Thursday, March 5, 2020

    Feedback from PA1

    Feedback from PA1
    - read the rubric carefully and make sure you are meeting all the criteria (e.g. whether it's a table or a chart, what the rows/columns should be, how much/little information to include, what Excel functions/features should be enabled like a Filter)
    - who is your intended audience?  An email to a front line supervisor should be short and to the point.  For PA2, a briefing report for senior management should have a little more substance but be careful not to include redundant sentences just to "flower things up"
    - the analysis has to be correct!  In PA1 you needed to correctly count the number of employees in each category (never trained, expired, will expire and by each training type).  For PA2 you will need to use correct data for all of your charts.

    Suggestions for PA2
    - double-check that you haven't missed any data or counted anything twice
    - each chart should have a clear message and a reason why you have included it
    - three mandatory charts (raw data, histogram, ogive)
    - two or more additional charts (e.g. pie, stacked column, boxplot) that focus on overexposures, other contributing factors or changes over time
    - do not include your names anywhere.  If you need a name you are Saif T. Guy.  In the workplace you will have to be careful not to violate employee privacy and accidentally include their name on an injury report!

    Remember:
    Under no circumstances should you be sharing your files (including the raw data files) with other groups, and under no circumstances should you open files from other groups.

    Wednesday, February 26, 2020

    Injury rates

    Comic of the Day

    Tip: So what percent of the vote won't you get?

    Deadlines
    Chapter 3&4 assignment (Moodle) due Wed Mar 4
    Theory Exam (Chapter 3&4) in class Wed Mar 4
    Production Assignment 2 (due Sun Mar 8): instructions and groups have been sent out; rubric is posted on Moodle
    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)


    Today - Injury Rates
    Rates allow us to compare populations better than just frequency
    http://www.iflscience.com/health-and-medicine/fetal-deaths-rocketed-in-flint-following-lead-poisoning-of-water/
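
A quick illustration of why rates beat raw counts, using two invented worksites. Site A has more injuries, but Site B has the higher rate once workforce size is taken into account. The function name and the "per 100 workers" base are choices made for this sketch.

```python
def injury_rate(injuries, workers, per=100):
    """Injuries per `per` workers."""
    return injuries * per / workers

# Hypothetical worksites
site_a = injury_rate(12, 400)   # 12 injuries among 400 workers
site_b = injury_rate(5, 50)     # 5 injuries among 50 workers

print(f"Site A: {site_a:.1f} per 100 workers")
print(f"Site B: {site_b:.1f} per 100 workers")
```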

    Notes on Moodle

    Next Lecture - Lab 5 - Descriptive Statistics
    Pre-read the lecture notes on Moodle

    Monday, February 24, 2020

    4.1-4.3 Expected Value and Binomial Distribution

    Comic of the Day

    Tip: Some components of group assignments should be done together, then assign individuals to finalize separate parts later.  Allow time to discuss/evaluate each group member's contributions and make changes if necessary.

    Warmup
    1. Which of the two probabilities below would be easier to calculate? (Hint: Think about whether AND or OR would be required)
    a) What is the probability that at least two people in our class of 30 students share a birthday?
    b) What is the probability that there are no shared birthdays?

    2. What do we call the relationship between these probabilities?

    3. Solve using Excel
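
Once you have tried it in Excel, you can compare against this sketch. It assumes 365 equally likely birthdays (no leap years), computes the probability of no shared birthdays by multiplying the chance that each new person avoids everyone before them, and then takes the complement.

```python
def prob_no_shared_birthday(people, days=365):
    """Probability that `people` birthdays are all different."""
    p = 1.0
    for k in range(people):
        # The (k+1)-th person must avoid the k birthdays already taken
        p *= (days - k) / days
    return p

p_none = prob_no_shared_birthday(30)
p_shared = 1 - p_none   # complement: at least one shared birthday

print(f"P(no shared birthdays)    = {p_none:.4f}")
print(f"P(at least one shared)    = {p_shared:.4f}")
```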


    Deadlines
    Chapter 3&4 assignment (Moodle) due Wed Mar 4

    Theory Exam (Chapter 3&4) in class Wed Mar 4
    Production Assignment 2 (due Sun Mar 8): instructions and groups have been sent out; rubric is posted on Moodle
    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)


    Today - 4.1-4.3 Expected Value and Binomial Distribution
    a) If the token is always dropped in the middle, do all of the outcomes (prize values) have an equal likelihood?
    b) What would happen if the Plinko board wasn't perfectly level (i.e. it was tilted slightly sideways)?

    Simulator - Set rows = 26 and binary probability p = 0.10 to simulate a company with 26 employees and an injury rate of 10.0 (%).  How many injuries are possible?  How many injuries are expected?
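
The simulator setup can also be mimicked with random numbers, as a rough sketch: each of the 26 rows is one employee who is "injured" with probability 0.10, and the token's final slot is the number of injuries. The long-run average should sit near the expected value n·p = 2.6. The seed and number of trials are arbitrary.

```python
import random

random.seed(26)          # arbitrary seed so the run is repeatable
rows, p = 26, 0.10       # 26 employees, 10% injury probability each

def drop_token():
    """One pass down the board: count how many rows result in an injury."""
    return sum(1 for _ in range(rows) if random.random() < p)

trials = 20000
results = [drop_token() for _ in range(trials)]

average_injuries = sum(results) / trials
expected_injuries = rows * p

print(f"possible outcomes: 0 to {rows}")
print(f"simulated average: {average_injuries:.2f}")
print(f"expected value:    {expected_injuries:.2f}")
```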

    Notes on Moodle

    Next Lecture - Injury Rates
    Pre-read the lecture notes on Moodle

    Wednesday, February 12, 2020

    3.4-3.5 Complements, Probability in Sampling

    Comic of the Day
    Tip: OSH shells sea shells with proper PPE

    Warmup
    There are 2 red cards and 18 black cards.  If the goal is to keep guessing until you have found a red card:
    a) What is the minimum number of guesses you might need to win?  What is the probability of this?
    b) What is the maximum number of guesses you could make and still lose?  What is the probability of this?

    Let's play as a class!  Each person gets 13 guesses.
    1) Write down your guesses in advance (cards are numbered 1 through 20)
    2) Everybody stands up
    3) I will choose somebody standing to make one of their guesses
    4) If their guessed card is red, then they and everybody else who guessed that card will sit down
    5) Continue until everybody has won or made 13 guesses

    How many people in a class of 30 would we expect to be left standing after 13 guesses?
    Hint:
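
One way to reason about it, sketched below: a student is still standing only if all 13 of their guesses are black cards. This assumes each student writes down 13 different cards and that guesses are independent between students; the variable names are just for illustration.

```python
import math

red, black = 2, 18
guesses = 13
students = 30
total = red + black

# Ways to pick 13 cards that are all black, out of all ways to pick 13 cards
p_all_black = math.comb(black, guesses) / math.comb(total, guesses)

# Expected number of students who never hit a red card
expected_standing = students * p_all_black

print(f"P(13 guesses, no red card) = {p_all_black:.4f}")
print(f"expected students standing = {expected_standing:.2f}")
```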

    Announcements & Deadlines
    Production Assignment 2 instructions and groups will be sent out this week (due Sun Mar 8)
    Data Analysis Exam #1 in class Thu Feb 13
    - Excel keys for Labs 0-4 are available on Moodle
    Lab 4 Mobius questions due Fri Feb 14
    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - 3.4-3.5 Complements, Probability in Sampling
    Notes on Moodle

    Next Lab - Data Analysis Exam #1
    • Arrive early!
    • Review the prep sheet and other resources on Moodle
    • It is open-book and you will have access to your own files on Moodle, but you should be organized so you know where to find things quickly.
    • You may have your personal laptop open, but you must make your Excel file on the desktop computer in the lab.
    • Save multiple copies of your file!

    Monday, February 10, 2020

    3.3 Addition and Multiplication Rules, Risk Analysis

    Comic of the Day
    Tip: Always make the best of a baaaad situation

    Warmup
    Monty Hall Problem
    http://www.youtube.com/watch?v=mhlc7peGlGg

    Is it better to swap doors, keep the same door, or does it not matter?

    Hint: Draw a sample space, where the first event is your initial guess (car, goat or goat) and the second event is whether you swap or stay (win or lose).

    Don't believe me?  Check out the empirical probability:
    http://www.stayorswitch.com/
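
You can also generate the empirical probability yourself. This is a minimal simulation under the standard rules: the host always opens a goat door that is not your pick and always offers the swap. The seed and number of trials are arbitrary.

```python
import random

random.seed(3)  # arbitrary seed so the run is repeatable

def play(switch):
    """Play one game; return True if the player wins the car."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # Host opens a goat door that is not the player's pick
    opened = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one remaining unopened door
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

trials = 20000
switch_rate = sum(play(True) for _ in range(trials)) / trials
stay_rate = sum(play(False) for _ in range(trials)) / trials

print(f"win rate when switching: {switch_rate:.3f}")
print(f"win rate when staying:   {stay_rate:.3f}")
```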

    Still don't believe me?  Read this:


    Announcements & Deadlines
    Production Assignment 2 instructions and groups will be sent out this week (due Sun Mar 8)
    Data Analysis Exam #1 in class Thu Feb 13
    - Excel keys for Labs 0-4 are available on Moodle
    Lab 4 Mobius questions due Fri Feb 14
    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - 3.3 Addition and Multiplication Rules
    Notes on Moodle

    Next Lecture - 3.4-3.5 Complements, Probability in Sampling
    Pre-read notes on Moodle

    Unit 1 Exam
    µ=58.1%
    σ=18.4%
    median=60%
    70.0% within ±1σ


    If you want to ask about the marking of your exam I would be happy to discuss with you, but please wait 24 hours before emailing or asking me.  Take this time to review the answer key, lecture notes, assignments and sample exam from last year.  You can review your Chapter 1 and 2 Mobius assignments using the Mobius Gradebook link at the top of Moodle.

    Thursday, February 6, 2020

    Lab 4 - Relative Standing and Boxplots

    Comic of the Day
    Tip: Get pumped up about boxplots

    Data Analysis Exam #1 (Thu Feb 13)
    Covers Labs 0-4
    See the "Excel Midterm Exam Prep Sheet", "Old Excel Midterm Exam Files" and "Old Excel Midterm Exam Walkthrough"
    • It is open-book and you will have access to your own files and the keys to the labs on Moodle, but you should be organized so you know where to find things quickly.

    Deadlines
    Read article on Moodle "What are the Odds?  The Probability of an Accident"
    Lab 3 Mobius questions due Fri Feb 7
    Production Assignment 1 assessments (inter-group and intra-group) due Sun Feb 9
    Data Analysis Exam #1 in class Thu Feb 13
    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - Lab 4 - Relative Standing and Boxplots
    Instructions on Moodle


    Next Lecture - 3.3 Addition and Multiplication Rules, Risk Analysis
    Pre-read §3.3 in the textbook

    Wednesday, February 5, 2020

    3.1-3.2 Basic Concepts of Probability

    Comic of the Day
    Tip: Ignorance can be bliss, but statistically it's still a bad choice

    Deadlines
    Lab 3 Mobius questions due Fri Feb 7
    Production Assignment 1 assessments (inter-group and intra-group) due Sun Feb 9
    Data Analysis Exam #1 in class Thu Feb 13
    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - 3.1-3.2 Basic Concepts of Probability
    Mathematicians of Catan
    a) When rolling one die, what are all the possibilities?  Are they each equally likely?
    b) When rolling two dice, what are all the possibilities?  Are they each equally likely?
    c) When you SUM the rolls on two dice, what are all the possibilities?  Are they each equally likely?
    (Show with a table in Excel and conditional formatting)
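
The Excel table for part c) can be cross-checked with a short sketch that lists all 36 equally likely rolls of two fair dice and counts how often each sum appears.

```python
from collections import Counter

# Count each possible sum over all 36 equally likely (die 1, die 2) pairs
sum_counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

for total in range(2, 13):
    ways = sum_counts[total]
    print(f"sum {total:2d}: {ways} way(s), probability {ways / 36:.3f}")
```

The sums are not equally likely: 7 can be made six ways, while 2 and 12 can each be made only one way.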

    Notes on Moodle

    Next Lecture - Lab 4 - Relative Standing and Boxplots
    Preview the video tutorial

    Wednesday, January 29, 2020

    2.7 Measures of Variation

    Comic of the Day
    Tip: If at first you don't succeed, fudge the data


    Deadlines & Announcements
    Lab 2 Mobius questions due Fri Jan 31
    Production Assignment 1 due Sun Feb 2 (will need to submit draft to WriteOn several days in advance)
    Chapter 1&2 assignments (Moodle) due Mon Feb 3
    Watch your rounding for the assignments.  Keep extra decimals for intermediate answers, then round your final answer only.
    Theory Exam (Chapter 1&2) in class Mon Feb 3
    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - 2.7 Measures of Variation
    Notes on Moodle
    Standard deviation using a dotplot
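
The step-by-step calculation can be sketched as follows, using a small made-up data set. It shows both the population version (divide by n) and the sample version (divide by n−1), which correspond to Excel's STDEV.P and STDEV.S.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data set

# Step 1: mean
mean = sum(data) / len(data)

# Step 2: squared deviations from the mean
squared_devs = [(x - mean) ** 2 for x in data]

# Step 3: average the squared deviations, then take the square root
population_sd = math.sqrt(sum(squared_devs) / len(data))
sample_sd = math.sqrt(sum(squared_devs) / (len(data) - 1))

print(f"mean = {mean}")
print(f"population sd = {population_sd:.3f}")
print(f"sample sd     = {sample_sd:.3f}")
```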


    Next Lab - Lab 3 - Charts and PivotTables
    Watch the video tutorial and refer to the sample pdf

    Monday, January 27, 2020

    2.5-2.6 Measures of Central Tendency

    Comic of the Day


    Tip: Laughter is the best medicine.  It makes the math hurt less.

    Canadian music to study to
    Tom Cochrane - "median"
    Tragically Hip - "meridian"


    Deadlines & Announcements
    Lab 2 Mobius questions due Fri Jan 31
    Production Assignment 1 due Sun Feb 2 (will need to submit draft to WriteOn several days in advance)
    Chapter 1&2 assignments (Moodle) due Mon Feb 3
    Watch your rounding for the assignments.  Keep extra decimals for intermediate answers, then round your final answer only.
    Theory Exam (Chapter 1&2) in class Mon Feb 3
    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - 2.5-2.6 Measures of Central Tendency
    Notes on Moodle
    How are mean, median, and mode affected by an outlier?
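
A small experiment, with invented numbers, to test your answer: compute all three measures for a data set, add one extreme value, and compute them again.

```python
import statistics

scores = [3, 4, 4, 5, 6]          # hypothetical data
with_outlier = scores + [40]      # same data plus one extreme value

mean_before = statistics.mean(scores)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(scores)
median_after = statistics.median(with_outlier)
mode_before = statistics.mode(scores)
mode_after = statistics.mode(with_outlier)

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
print(f"mode:   {mode_before} -> {mode_after}")
```

In this example the mean more than doubles, the median barely moves, and the mode does not change at all.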

    Remember: A meridian is a line on a map, not a statistical term or what you crash your car into!



    Next Lecture - 2.7 Measures of Variation
    Pre-read §2.7
    Practice this online activity to calculate standard deviation
    https://www.khanacademy.org/math/probability/data-distributions-a1/summarizing-spread-distributions/a/calculating-standard-deviation-step-by-step

    Thursday, January 23, 2020

    Lab 2 - Frequency Distribution and Histograms

    Comic of the Day

    Tip: Always label your axes.

    Warmup
    Fill in the blanks with the words FUNCTION or FORMULA
    A __________ uses mathematical operators to calculate a value
    A __________ is built in and has a specific syntax for its inputs
    An example of a __________ is "=SUM(A1:A5)"
    An example of a __________ is "=A1+A2+A3+A4+A5"
    Avoid "=SUM(A1+A2+A3+A4+A5)"

    Deadlines & Announcements
    Lab 1 Mobius questions due Fri Jan 24
    Production Assignment 1 due Sun Feb 2 (will need to submit draft to WriteOn several days in advance)
    Chapter 1&2 assignments (Moodle) due Mon Feb 3
    Watch your rounding for the assignments.  Keep extra decimals for intermediate answers, then round your final answer only.
    Theory Exam (Chapter 1&2) in class Mon Feb 3
    Drop-in tutorials will run every Tue 12-12:50 in CAT286 (computer lab)

    Today - Lab 2 - Frequency Distribution and Histograms
    Video tutorial and Mobius questions on Moodle
    Check your email for other data files from me

    Tips:
    • Formulas or functions always start with =
    • Press Enter to accept a formula, or Esc to exit without saving changes
    • Start typing =COU in the formula bar in Excel and it will give you a choice of which function to use (e.g. COUNT, COUNTA, COUNTIF...) then the syntax for that function

      Next Lecture - 2.5-2.6 Measures of Central Tendency
      Pre-read §2.5-2.6
      Watch this 3:54 primer video on mean, median and mode

      https://www.khanacademy.org/math/probability/data-distributions-a1/summarizing-center-distributions/v/mean-median-and-mode