# MATH5510 Topics in Mathematics for Teachers Summer 2016

Add: ignoring the baseline. Gelman and Nolan p. 149

Example of not taking dependence among observations into account, i.e. not using a hierarchical model (http://bmcbiol.biomedcentral.com/articles/10.1186/s12915-016-0227-8?utm_campaign=BMC30439C&utm_medium=BMCemail&utm_source=Teradata)

## Calendar

1. May 10 -- Introduction: Fun (cool?) things to think about in Stats.
2. May 12
3. May 17
4. May 19
5. May 24
6. May 26
7. May 31
8. June 2
9. June 7
10. June 9
11. June 14
12. June 16

## Course Outline

• General ideas: Just-in-time background: e.g. weighted means before Agresti Diagrams

### Does X 'cause' Y?: Causality and Statistics

• The importance of asking how?, not just what? Many statistics courses condition us to operate on and transform information. We easily forget the critical importance of asking HOW the information was generated. The whole approach to information should be completely different depending on the answer to HOW. We explore the details. Anything statisticians do with data is a set of complex rituals devoid of meaning unless we first understand how the data were generated.
• Agresti (find better name) diagrams: with data and speculative (find a better word maybe)
• Gelman (xxxx) "When Confounding Variables are Out of Control" (http://blogs.discovermagazine.com/neuroskeptic/2016/04/02/confounding-variables/#.Vx5ZHfkrIUF)
• Westfall J and Yarkoni T (2016) Statistically Controlling for Confounding Constructs Is Harder than You Think. (http://www.ncbi.nlm.nih.gov/pubmed/27031707)
• Gelman (2016) on Confounding (http://andrewgelman.com/2016/02/22/its-too-hard-to-publish-criticisms-and-obtain-data-for-replication/)

### Visualizing Correlation

• Kahneman's pilots: link with causality: How two seemingly contradictory views can both be right
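
The regression-to-the-mean effect behind the pilots story can be sketched with a small simulation. This is only an illustration under assumptions that are not in the notes: each landing score is a fixed skill plus independent noise, feedback has no causal effect at all, and the names `simulate_pilots` and `mean_change` are hypothetical.

```python
import random

def simulate_pilots(n_pilots=10000, skill_sd=1.0, noise_sd=1.0, seed=1):
    """Return (first, second) landing scores for each simulated pilot.

    Assumption: score = stable skill + independent noise, and nothing the
    instructor says between the two landings changes the second score.
    """
    rng = random.Random(seed)
    flights = []
    for _ in range(n_pilots):
        skill = rng.gauss(0.0, skill_sd)
        first = skill + rng.gauss(0.0, noise_sd)
        second = skill + rng.gauss(0.0, noise_sd)
        flights.append((first, second))
    return flights

def mean_change(flights, keep):
    """Average (second - first) over pilots whose first score satisfies keep()."""
    changes = [second - first for first, second in flights if keep(first)]
    return sum(changes) / len(changes)

flights = simulate_pilots()
firsts = sorted(first for first, _ in flights)
low_cut = firsts[len(firsts) // 10]        # roughly the worst 10% of first landings
high_cut = firsts[-(len(firsts) // 10)]    # roughly the best 10% of first landings

# Pilots who would have been criticized (bad first landing)
change_after_bad = mean_change(flights, lambda f: f <= low_cut)
# Pilots who would have been praised (good first landing)
change_after_good = mean_change(flights, lambda f: f >= high_cut)

print("mean change after a bad landing: ", round(change_after_bad, 2))
print("mean change after a good landing:", round(change_after_good, 2))
```

In this sketch the worst performers improve on average and the best performers get worse, even though feedback does nothing. The pattern is useful for prediction but says nothing about the causal effect of praise or criticism.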

### Probability, Bayes and p-values

• Basic probability (avoid random variables -- delay until probability)
• Bayes: Trees and 2x2
• Monty Hall and the importance of modelling information (again: asking HOW, not just WHAT).
• Finite models (finite parameter space and finite sample space) (assumes some knowledge of matrices and linear algebra)
• Gigerenzer: Did UK doctors use Bayesian nomograms? Does your doctor use a Bayesian nomogram? Building a nomogram app.
• Crisis of reproducibility: asking How: The effects of selection. Funnel graphs?
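
One way to make the Monty Hall point about HOW the information was generated concrete is to simulate two different host models. This is a sketch, not a definitive treatment: the function name `switch_win_rate` and the "host opens a door at random" variant are assumptions chosen for illustration. In both models the contestant sees the same thing (a goat behind the opened door), but the value of switching differs.

```python
import random

def switch_win_rate(host_knows, n_games=100000, seed=1):
    """Estimate P(win by switching | host revealed a goat).

    host_knows=True : the host knows where the car is and always opens a goat door.
    host_knows=False: the host opens one of the two other doors at random; games in
                      which the car is revealed are discarded, because we condition
                      on having seen a goat.
    """
    rng = random.Random(seed)
    wins = 0
    played = 0
    for _ in range(n_games):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        others = [d for d in range(3) if d != pick]
        if host_knows:
            opened = rng.choice([d for d in others if d != car])
        else:
            opened = rng.choice(others)
            if opened == car:
                continue  # car revealed: not the situation being analyzed
        played += 1
        switched_to = next(d for d in others if d != opened)
        if switched_to == car:
            wins += 1
    return wins / played

print("host knows and avoids the car:", round(switch_win_rate(True), 3))
print("host opens a door at random:  ", round(switch_win_rate(False), 3))
```

With a knowing host the estimate should be close to 2/3, and with a random host close to 1/2. The observed data are identical; only the model of how they arose changes the answer.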

### Visualizing Multiple regression

If time permits, which seems unlikely.

### Course Work

There are X components:

#### 1. Short daily quiz [20%]

The quiz is assigned at the end of each lecture and is due by 11:59 pm of the next day. It should be submitted through Moodle.

#### 2. Project and presentation [30%]

In teams of 2 assigned semirandomly. Prepare an essay and a lesson suitable for students in senior high school or in an introductory university statistics course on one of the following topics:

1. Bicycle helmets: What does the evidence say about their effectiveness?
2. The Gardasil Scare: Last year the Toronto Star published an article claiming that Gardasil, a vaccine against HPV, had disastrous side effects. What does the evidence say?
3. If you enjoy programming, you can improve a collection of interactive 'shiny' apps by rewriting them to include graphical interaction.
4. ... MORE

????

#### 1. Wiki contributions [10%]

Contribute to a course blog at least once a week. Your contribution can take many forms: a link to an interesting article or piece of news together with a comment, a question about some aspect of the course, an answer to someone else's question, a link to information relevant to the course, etc. If you are an outlier in the quality of your contributions, there's an opportunity for a few bonus marks here.

#### 2. Assignments [25%]

Four assignments on material covered in the course. Some of the assignments are individual assignments and some are team assignments.

| Assignment | Due | Description | Team or Individual |
|---|---|---|---|
| Assignment 1 | Tuesday Jan 5 and Thursday Jan 7, 11:59pm | Install software (R, RStudio, Git) and get a free account on Github.com. Complete this course survey (Template:Math4939-survey). Bring your laptop, well charged, to class on Wednesday, January 6 and Friday, January 8. | Individual |
| Assignment 2 | Sunday January 31 (changed from January 24), 11:59pm | Analyze and interpret the Arrests data set in the effects package in R | Teams using R Markdown and Github |
| Assignment 3 | Thursday February 11, 11:59pm | Prepare and post on the wiki discussions of selected questions. | Teams posting on the course wiki |
| Assignment 4 | Thursday March 3 (changed from February 24), 11:59pm | Comment on discussion on selected questions. | Teams posting on the course wiki |

Assignment 3 questions by team:

• Bayes: Qs: 7 9 18 20 3
• Fisher: Qs: 1 8 10 12 21
• Neyman: Qs: 4 11 14 19
• Pearson: Qs: 2 6 15 16
• Rao: Qs: 17 22 5 13
• Note: these questions are not absolutely final. Check with me before you start in earnest. [GM]

Assignment 4 questions by team:

• Bayes: Qs: 1 2 13 21
• Fisher: Qs: 11 6 7 5
• Neyman: Qs: 15 18 10 3 22
• Pearson: Qs: 20 19 9 17 4
• Rao: Qs: 8 14 16 12

#### 3. Odd jobs [5%]

There's always a host of interesting small questions and problems that come up in a course like this one. You get grades for performing an 'odd job' when you take on the responsibility to research and answer one of these questions and post the results on the wiki, perhaps in the form of a short tutorial helping others solve similar problems in their work. Insert a link to your work in the Odd Jobs page.

#### 4. Team project due March 21, 2016 [30%]

Change in deadline: Sunday, March 27, 2016
You will work on a team project in which you solve a real problem involving real data and prepare a report including analyses, graphical displays and a careful interpretation of your findings. The report has three parts:

1. A '.R' script using rmarkdown that produces a detailed analysis and presentation of your work, including diagnostics, etc. This output can be quite long.
2. A '.R' script using rmarkdown that produces an attractive and readable report with your main findings. You need to include all relevant references, data sources, etc. Aim for a maximum of 30 pages.
3. Slides for a 10-minute presentation discussed below. The slides can also be prepared with R-markdown using the ioslides format (http://rmarkdown.rstudio.com/ioslides_presentation_format.html).

You will collaborate using R, RStudio, R Markdown, git and Github. The grade is based on the overall quality of the project (10%), on your personal contribution to it (10%), and on your understanding of the issues and concepts in the project as shown in the final presentation and in project meetings with the instructor (10%).

You will prepare a brief summary of your project for a 10-minute presentation on Wednesday, March 23, and Friday, March 25. The 10-minute limit is strict. Be mindful that it takes careful preparation and rehearsing to give a good presentation in such a short time. You must rehearse as a group ahead of time. The presentation will be followed by a 5-minute question and discussion period.

Here are the four projects you can choose from. They are current and recent case studies used in the Statistical Society of Canada's annual case study competition. I chose these because the data for them are freely available to you, although it might take some initiative to obtain them.

Projects: Recent -- and one current -- case studies from the SSC Case Study competition:

Each team should choose a different one. I hope that you will agree on which one each team chooses. If not, we will have a draw at the class on Monday, February 8.

I am not sure if you forgot to post a link for the projects or if there is a place where we can find them. [AA] Sorry, I must have made an error in cutting and pasting. It's corrected now. [GM]

#### 5. Individual or self-selected team project [10%]

This project provides you with a chance to pursue your own interest. If you plan to apply to more technical jobs, you might want to do an R package on Github, a good addition to your Github portfolio that technical employers and startups look at to assess job applicants. If you plan to apply to jobs at large institutions such as banks, you can work on SAS by, for example, implementing your project in SAS as far as possible. Submit a project proposal to the instructor by March 1. The project is due on April 4, 2016.

#### 6. A final 2-hour exam [20%]

The exam will be held in the regular exam period. A major component will consist of questions probing your understanding of statistical concepts reflected in the list of 22 questions.

## Possible Topics

### Syllabus

What is Statistics?
The precise boundaries of a discipline should never be made rigid. In fact, our organization of knowledge into separate disciplines is a curious thing in itself. It seems expedient to link some groups of ideas together because awareness of them seems to promote development in the investigation of questions that arise from these ideas.

• Inference under uncertainty?
• Extracting meaning from data?

#### Latest ideas

• Use R course for 1st six halves
• Incorporate R Markdown (http://rmarkdown.rstudio.com/), Shiny examples (http://shiny.rstudio.com/) and Tufte style (http://rmarkdown.rstudio.com/tufte_handout_format.html)
• Invite Heather
• Include regression ellipses, regression to the mean and Kahneman's pilots
• How to include conditioning?? Multiple regression? Stratification?? See how it's done.
• Multiple regression?? and ellipses??
1. A chance to play and learn neat stuff:
2. Some basic probability and statistics culminating with likelihood and Bayes with prior and posterior odds and likelihood ratio.
3. Together with some R using Markdown
4. An assignment every day to hand in the next?
5. Introductory: Why! Something about current statistics, the emerging crises and where's the resolution. Our current (conflicting) approaches are adopted by various groups as if they were absolutely correct recipes. Sometimes they work, but often they don't, and most users don't see the problem. Two fundamental problems:
1. Interpretations of probability in application to the interpretation of scientific evidence.
2. The nature of causality and its inference from empirical observation. (mention briefly as motivation but don't belabor until ready later)
1. Bayes
• Inversion of conditionality
• Interpreting health tests
• p-values and Sally Clark
• Monty Hall: Using the model that generated information
2. Overview of issues: maybe history of probability and statistics: Bayesian talk -- but need probability review first -- background for Gigerenzer et al. Helping Doctors and Patients make Sense of Health Statistics
3. Causality:
• The critical role of randomization
• The usual impossibility of randomization and the limitations of experimentation (e.g. assessing rare but important effects. Use Vioxx as an example)
4. Probability for Patients
1. Causality
2. Probability -- cognitive errors

### Topics

1. Import http://capstone.stats.yorku.ca/index.php/2016/Statistics#The_Crisis_of_Reproducibility_and_the_ASA
2. Causality: what do statistical analyses mean?
• Include implicit conditioning on colliders as creating 'spurious' correlation, e.g. among people admitted to hospital. Emphasize possibility of implicit conditioning, e.g. implicit selection.
• Climate change
• Racial profiling
• Drug side effects
3. Bayesian vs frequentist: why and when it matters?
• Sally Clark and Lucia de Berk
• Better Bayes: focus on prior odds, posterior odds and Likelihood ratio. Or on relative prob:

posterior relative prob = prior relative prob. x likelihood.

• Emphasizing Bayes Formula would be like teaching the formula for the probability of the union in terms of odds.
4. Regression to the mean: Kahneman's pilots: predictive vs causal
5. Reproducibility: a crisis in statistics, or a crisis in science?
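
The "Better Bayes" idea in the list above (posterior odds = prior odds x likelihood ratio, or posterior relative prob = prior relative prob x likelihood) can be sketched in a few lines. The numbers below (1% prevalence, 95% sensitivity and specificity) and the function names are hypothetical, chosen only to show the arithmetic.

```python
def prob_to_odds(p):
    """Convert a probability to odds in favour."""
    return p / (1.0 - p)

def odds_to_prob(odds):
    """Convert odds in favour back to a probability."""
    return odds / (1.0 + odds)

def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio

def relative_posterior(priors, likelihoods):
    """Relative-probability form: multiply prior by likelihood, then normalize."""
    weights = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical health test (illustrative numbers, not from the course notes)
prevalence = 0.01
sensitivity = 0.95
specificity = 0.95

# Likelihood ratio of a positive result: P(Pos | D) / P(Pos | not D)
lr_positive = sensitivity / (1.0 - specificity)

post_odds = posterior_odds(prob_to_odds(prevalence), lr_positive)
print("posterior probability of disease:", round(odds_to_prob(post_odds), 3))  # about 0.16

# Same answer from the relative-probability form over the two hypotheses (D, not D)
print(relative_posterior([prevalence, 1 - prevalence],
                         [sensitivity, 1 - specificity]))
```

Both forms give the same posterior. The odds form keeps the prior and the strength of the evidence (the likelihood ratio) visibly separate, which is arguably the teaching advantage the notes are after.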

In parallel?:

• Gelman's Teaching Statistics: A Bag of Tricks

+ a little bit of real programming

Over the same period, but especially since the 1990s, there has been an increasing disconnect between the traditional Fisher-Neyman-Pearson (FNP) math statistics course and the demands for complex analysis in many application areas. The failure of classical maximum likelihood methods to deal effectively with complex models and the success of MCMC-based methods has led to a similar situation: The undergraduate FNP course does not prepare students for these models, and Bayesian MCMC retraining courses are needed to prepare graduates for these applications. -- Murray Aitkin in Amstat News, March 2014, p. 28 (http://magazine.amstat.org/wp-content/uploads/2014/03/March2014.pdf)

MATH 5510: Topics in Mathematics for Teachers

Tentative Topic:

A challenge Statistics was late to face: The role of Prediction and Causation in understanding the world.

Understanding Bayes vs Frequentist inference: see https://youtu.be/BcvLAw-JRss from Walter Whiteley

Fridell (2004) "By the Numbers" (http://www.policeforum.org/assets/docs/Free_Online_Documents/Racially-Biased_Policing/by%20the%20numbers%20-%20a%20guide%20for%20analyzing%20race%20data%20from%20vehicle%20stops%202004.pdf) on statistical analyses of racial profiling.

## The Crisis of Reproducibility

Why do so many statistically significant results fail to replicate? Is something fundamentally wrong with 'statistical significance'? Or is it more how we use it or interpret it that is the problem? There is a psychology journal that is banning hypothesis testing entirely (http://www.nature.com/news/psychology-journal-bans-p-values-1.17001).

The American Statistical Association has jumped into the fray by issuing a report on p-values that will have, I believe, seismic effects in the statistical profession and in research in general.

## Causality: A scientific problem or a statistical problem?

• Have a look at work by Tina Grotzer.

## Taxonomy of Fallacies

• Base rate fallacies (https://en.wikipedia.org/wiki/Base_rate_fallacy): ignoring the prior when you shouldn't
Using the probability of the evidence given the hypothesis as a measure of strength of evidence (e.g. p-value, sensitivity, specificity) while ignoring the base rate, i.e. the prior probability of the hypothesis. Some notable examples:
• Using observed likelihood instead of a posterior to form a judgment: You see an unkempt person with dirty glasses walking on campus. Is it more likely to be a grad student in mathematics or a grad student in business? The likelihood (in the sense of P(data|hypothesis)) is higher for mathematics than it is for business. But if you take the base rate (the number of grad students in each discipline) into account, the posterior probability favours business.
• Representativeness heuristic (https://en.wikipedia.org/wiki/Representativeness_heuristic) and representation bias: one of the cognitive biases identified by Kahneman and Tversky. A mode of thought used by 'System 1'. It's easier to visualize P(Data|Hypothesis) and, having assessed it for various hypotheses, one is inclined to use that and not follow with the much more complex visualization that considers the base rate (prior) to compute a posterior. The relative probabilities for the posterior are very easily calculated by taking the product of the prior with the likelihood. Absolute probabilities are harder because they require normalization. The progress on solving the normalization problem, e.g. with MCMC, is one of the major recent advances that make Bayesian analysis much more computationally feasible than it used to be.
• Prosecutor's fallacy (https://en.wikipedia.org/wiki/Prosecutor%27s_fallacy): Using the equivalent of a p-value as a measure of evidence in a trial, ignoring the prior. See R v. Sally Clark -- Convicted on Statistics? (http://understandinguncertainty.org/node/545). Consider the parallel case of Susan Nelles (https://en.wikipedia.org/wiki/Toronto_hospital_baby_deaths) in Toronto.
• Misinterpreting the probability of disease from medical tests
See Gigerenzer. The correctness paradox: If a test for disease D has specificity = sensitivity = 95%, this means that Pr(Correct|D) = Pr(Correct|not D) = 0.95. So, necessarily, Pr(Correct) = 0.95 regardless of the base rate. If you get a positive result, is it true that Pr(D|Pos) = Pr(Correct|Pos) = 0.95? Not necessarily. The paradox lies in the fact that, whatever the base rate, average(Pr(Correct|Result)) = Pr(Correct|Pos) x Pr(Pos) + Pr(Correct|Neg) x Pr(Neg) = 0.95. However, this does not imply that the components of the average, Pr(Correct|Pos) and Pr(Correct|Neg), are themselves equal to 0.95. Depending on the base rate, one could be much lower as long as the other is compensatingly higher.
• Strength of evidence is not strength of effect
• Well documented by Ziliak and McCloskey in The Cult of Statistical Significance.
• Absence of evidence is not evidence of absence
• This is the fallacy that consists in concluding that there is no effect if evidence against the null hypothesis does not reach a conventional threshold, e.g. p < 0.05. See, for example, the Vioxx case discussed in Ziliak and McCloskey. Statistical hypothesis testing is not set up to protect against wrongly failing to reject the null. It is set up to control the probability of wrongly rejecting it.
• Has an interesting history. Donald Rumsfeld refers to this (in?)famously but uses it to justify a different conclusion from that for which it is generally used. The usual interpretation is that we cannot prove, and thus must maintain uncertainty about, the null hypothesis in the absence of evidence against it. Note that this is generally used in the context of frequentist evidence. Rumsfeld uses it to justify the second Iraq war in the absence of evidence that Iraq had weapons of mass destruction. He does not apply the principle to avoid concluding that the null is valid but to justify concluding that it is not.
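
Returning to the correctness paradox in the medical-test item above, the arithmetic can be checked numerically. This is a minimal sketch: the function name `accuracy_by_result` and the particular base rates tried are assumptions for illustration, while the 95% sensitivity and specificity come from the text.

```python
def accuracy_by_result(base_rate, sensitivity=0.95, specificity=0.95):
    """Return (Pr(Correct|Pos), Pr(Correct|Neg), Pr(Correct)) for a diagnostic test.

    Pr(Correct|Pos) is Pr(D|Pos) and Pr(Correct|Neg) is Pr(not D|Neg).
    """
    p_pos = base_rate * sensitivity + (1 - base_rate) * (1 - specificity)
    p_neg = 1 - p_pos
    correct_given_pos = base_rate * sensitivity / p_pos
    correct_given_neg = (1 - base_rate) * specificity / p_neg
    # Weighted average of the two conditional accuracies
    overall = correct_given_pos * p_pos + correct_given_neg * p_neg
    return correct_given_pos, correct_given_neg, overall

# Illustrative base rates (hypothetical values)
for base_rate in (0.001, 0.01, 0.1, 0.5):
    c_pos, c_neg, overall = accuracy_by_result(base_rate)
    print(f"base rate {base_rate:>5}: "
          f"Pr(Correct|Pos) = {c_pos:.3f}, "
          f"Pr(Correct|Neg) = {c_neg:.3f}, "
          f"Pr(Correct) = {overall:.3f}")
```

The overall accuracy stays at 0.95 for every base rate, while Pr(Correct|Pos) falls well below 0.95 when the disease is rare and Pr(Correct|Neg) rises to compensate. Only at a base rate of 0.5 do both components equal 0.95.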

## Teaching topics

• Is buying lottery tickets a good idea?
Should go deeper than merely considering expectation.
In relation to the test of significance, we may say that a phenomenon is experimentally demonstrable when we know how to conduct an experiment which will rarely fail to give us a statistically significant result. -- Fisher 1947

## How to teach ...

### R with Markdown

• R packages from Computerworld (http://www.computerworld.com/article/2921176/business-intelligence/great-r-packages-for-data-import-wrangling-visualization.html)