# Talk:MATH 1532 2009-10

## Draft

Using Rcmdr:

• Categorical variables:
  • One-way: Statistics | Summaries | Frequency distributions to get counts and percentages
  • Two-way: Statistics | Contingency tables | Two-way table to get counts or proportions and a test of association
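
For anyone who wants to see what these two menu items amount to at the command line, here is a minimal sketch. It assumes the built-in `mtcars` data with `cyl` and `am` recoded as factors (chosen only for illustration, not a course data set), and the calls are plain base R rather than the exact code Rcmdr generates.

```r
# Illustrative data: mtcars ships with R; treating cyl and am as
# categorical variables is an assumption made for this example only.
cars <- transform(mtcars,
                  cyl = factor(cyl),
                  am  = factor(am, labels = c("automatic", "manual")))

# One-way: counts and percentages for a single categorical variable
counts   <- table(cars$cyl)
percents <- round(100 * prop.table(counts), 1)
print(counts)
print(percents)

# Two-way: counts, row proportions, and a chi-square test of association
two_way <- xtabs(~ cyl + am, data = cars)
print(two_way)
print(round(prop.table(two_way, margin = 1), 2))

# Some expected counts are small here, so R warns that the chi-square
# approximation may be inaccurate; the warning is suppressed for the demo.
assoc <- suppressWarnings(chisq.test(two_way))
print(assoc)
```

With small expected counts, as here, Fisher's exact test (`fisher.test(two_way)`) is a common alternative to the chi-square test.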

Graphs:

Graphs | Pie chart
Graphs | Bar graph

NOTE:

• Index plot
• Histogram
• Stem-and-leaf display
• Boxplot
• Quantile-comparison plot
• Scatterplot
• Scatterplot matrix
• Line graph
• XY conditioning plot
• Plot of means
• Strip chart
• Bar graph
• Pie chart
• 3D graph
  • 3D scatterplot
  • Identify observations with mouse
  • Save graph to file
• Save graph to file
  • as bitmap
  • as PDF/Postscript/EPS
  • 3D RGL graph

#### Graphs with Rcmdr

Getting graphs with Rcmdr

| Variables | Menu path or command | Notes |
|---|---|---|
| one categorical variable | Graphs \| Bar graph; Graphs \| Pie chart | |
| two categorical variables | `library(lattice); with(Dataset, barchart(table(Xcat, Ycat), stack = FALSE, auto.key = TRUE))` | needs the command line |
| X cat. var. and Y num. var. | Graphs \| Boxplot | click on Plot by groups |
| one numeric variable | Graphs \| Histogram; Graphs \| Boxplot | |
| two numeric variables | Graphs \| Scatterplot | prompts for x and y variables |
| X, Y num. vars. & Z cat. var. | Graphs \| Scatterplot | click on Plot by groups to choose Z |
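
The menu items listed above can be approximated at the command line with base R graphics. This is only a sketch: it assumes the built-in `mtcars` data with `cyl` and `am` recoded as factors (my choice for illustration), and the calls are not the ones Rcmdr itself generates. The plots go to a temporary PDF so the script also runs non-interactively.

```r
# Illustrative data; the variable choices are assumptions for this example.
cars <- transform(mtcars,
                  cyl = factor(cyl),
                  am  = factor(am, labels = c("automatic", "manual")))

# Send all plots to a temporary PDF file (one plot per page).
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)

# One categorical variable: bar graph and pie chart
barplot(table(cars$cyl), xlab = "cyl", ylab = "Frequency")
pie(table(cars$cyl))

# Two categorical variables: side-by-side bars
barplot(table(cars$am, cars$cyl), beside = TRUE, legend.text = TRUE,
        xlab = "cyl", ylab = "Frequency")

# X categorical, Y numeric: boxplots by group
boxplot(mpg ~ cyl, data = cars)

# One numeric variable: histogram and boxplot
hist(cars$mpg)
boxplot(cars$mpg)

# Two numeric variables: scatterplot
plot(mpg ~ wt, data = cars)

# Two numeric variables plus a categorical one: colour points by group
plot(mpg ~ wt, data = cars, col = as.integer(cars$am), pch = 19)
legend("topright", legend = levels(cars$am), col = 1:2, pch = 19)

dev.off()
```

In an interactive session you can drop the `pdf()` and `dev.off()` lines to see each plot on screen.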

#### Statistics with Rcmdr

Getting numerical statistics with Rcmdr

| Variables | Menu path | Notes |
|---|---|---|
| all variables | Statistics \| Summaries \| Active data set | |
| one categorical variable | Statistics \| Summaries \| Frequency distributions | |
| two categorical variables | Statistics \| Contingency tables \| Two-way table | Choose X as Row variable and Y as Column variable; request multiple tables by selecting No percentages and Column percentages |
| X cat. var. and Y num. var. | Statistics \| Means \| One-way ANOVA; Statistics \| Summaries \| Numerical summaries | use X as the grouping variable |
| one numeric variable | Statistics \| Means \| Single-sample t-test; Statistics \| Summaries \| Numerical summaries | |
| two numeric variables | Statistics \| Fit models | |
| X, Y num. vars. & Z cat. var. | Statistics \| Fit models | |
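
As a rough command-line counterpart to the summaries listed above, here is a sketch in base R. It again assumes the built-in `mtcars` data with `cyl` and `am` treated as categorical, and the null value `mu = 20` in the t-test is an arbitrary number picked for the example.

```r
# Illustrative data; the variable choices are assumptions for this example.
cars <- transform(mtcars,
                  cyl = factor(cyl),
                  am  = factor(am, labels = c("automatic", "manual")))

# All variables: a summary of the whole data set
print(summary(cars))

# Two categorical variables: counts and column percentages
tab     <- xtabs(~ cyl + am, data = cars)
col_pct <- round(100 * prop.table(tab, margin = 2), 1)
print(tab)
print(col_pct)

# X categorical, Y numeric: group means and one-way ANOVA
group_means <- tapply(cars$mpg, cars$cyl, mean)
print(group_means)
anova_fit <- aov(mpg ~ cyl, data = cars)
print(summary(anova_fit))

# One numeric variable: numerical summaries and a single-sample t-test
print(summary(cars$mpg))
t_res <- t.test(cars$mpg, mu = 20)  # mu = 20 is an arbitrary null value
print(t_res)
```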

## Week 7, February 24

### Material covered

Chapter 5 and the beginning of Chapter 6

#### Regression with two variables with Rcmdr

Relationships between two quantitative variables with Rcmdr

| Task | Menu path | Notes |
|---|---|---|
| Explore num. vars. + possibly 1 cat. var. | Graphs \| Scatterplot matrix; Graphs \| 3D graph \| 3D scatterplot; Statistics \| Summaries \| Numerical summaries | You can also include one categorical variable by selecting "Summarize by groups" or "Plot by groups" |
| Scatterplot | Graphs \| Scatterplot | |
| Correlation | Statistics \| Summaries \| Correlation matrix | Use correlation test for p-values |
| Fitting the least-squares line, i.e. the estimated linear regression equation | Statistics \| Fit models \| Linear regression; Models \| Summarize model / Confidence intervals / Add observation statistics to data / etc. | After adding observation statistics to data, you can plot the residuals in various ways to see whether any patterns remain in them |
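
The same steps can be sketched at the command line. This assumes the built-in `mtcars` data, with `wt` as the explanatory variable and `mpg` as the response (chosen only for illustration). The column names `fitted_mpg` and `resid_mpg` are hypothetical; Rcmdr uses its own names when it adds observation statistics.

```r
cars <- mtcars  # illustrative data, not a course data set

# Correlation, with a test that supplies the p-value
r <- cor(cars$wt, cars$mpg)
print(r)
print(cor.test(cars$wt, cars$mpg))

# Least-squares line: mpg = b0 + b1 * wt
fit <- lm(mpg ~ wt, data = cars)
print(summary(fit))   # coefficients, R-squared, residual standard error
print(confint(fit))   # 95% confidence intervals for b0 and b1

# Add fitted values and residuals to the data (hypothetical column names)
cars$fitted_mpg <- fitted(fit)
cars$resid_mpg  <- residuals(fit)

# Residuals against fitted values: look for curvature or changing spread
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
plot(resid_mpg ~ fitted_mpg, data = cars,
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
dev.off()
```

A residual plot with no visible pattern suggests that a straight line is a reasonable description of the relationship; a curve or a funnel shape suggests otherwise.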

### Assignment 2

Assignment 2 will be done in the same groups as Assignment 1, except that groups that have become too small may be combined with others. Assignment 2 consists of the problems assigned week by week over the next three weeks. The assignment is due on March 24.

Each current group should send me (mailto:georges+nats1500@yorku.ca) one email message giving me the name of the group and the names of its members. I'll address issues concerning reconstitution of groups on Sunday, Feb. 28.

### Project

More details to come. The general idea is to analyze some data that you find interesting, using the statistical tools and critical insights that you have developed in the course. To help you find a topic and data, have a look at Statistics: Pedagogical resources on this wiki.

### Exercises

Notes:

1. Numbers in 'bold' need to be done for Assignment 2.
2. The numbers shown in the text all have the form '5.x' where 'x' is the number of the question within chapter 5. In the following lists I only show 'x'.

Chapter 5, pp. 161--168:

Looking for Patterns with Scatterplots:
1, 2, 3, 7
Describing Linear Pattern with a Regression Line:
11, 14
Measuring Strength and Direction with Correlation:
24 (important -- likely to be on exam), 27 (also a good candidate for the exam)
Why the Answers May Not Make Sense & Correlation Does Not Prove Causation:
36 (refers to 7), 39, 40
Chapter Exercises:
46, 48, 49, 59, 60, 61, 62

Chapter 6, pp. 193--201:

Displaying Relationships Between Categorical Variables:
3, 4, 6, 7
Risk, Relative Risk, Odds Ratio and Increased Risk & Misleading Statistics About Risk:
10 (nice exam question), 11-14 (ditto), 20 (refers to 6), 22
The Effect of a Third Variable and Simpson's Paradox:
27, 29, 31
Assessing the Statistical Significance of a 2 x 2 Table:
33, 34, 43
Chapter Exercises:
56, 57, 58, 62