STA 104 Exam I Project
This Computational Statistics project uses the given data to produce histograms, standard deviations, and related summary statistics for data analysis.
Read the following instructions carefully:
• You may work in a group of two, or by yourself.
• You are not allowed to discuss the questions with anyone other than the instructor or TA and your group mate.
• Any outside help beyond that from the instructor or TA is considered plagiarism. This includes asking a tutor or your classmates (for example, comparing answers), posting the questions to homework help sites, etc. Should we believe you have sought outside help, you will be reported to the Student Judicial Affairs office.
• You are allowed to use or modify your previous functions, or the instructor's functions that are posted online.
• Do not share answers, or specific values for calculations, particularly on Piazza.
• You may ask clarifying questions about code and general approach on Piazza, but do not give away any numerical answers. If you are concerned you may be giving something away, email me or the TAs directly.
• The group (of one or two people) will select one question from each topic, for a total of two questions.
The data used for this question is Toys.csv, and it has a column Broken. What has been measured is how many toys a particularly agile cat breed (a Bengal cat) “destroys” in a week. A frustrated owner believes that they have to buy over 3 toys per week in order to keep up with their cat. Assess this claim using the median.
The data used for this question is Play.csv, with a column Return. Border collies are dogs known for their obsessive behavior, and 16 border collie owners measured how many times in a row their dogs returned a ball to the owner after it was thrown. They claim that there is a 50% chance that the dogs will return the ball 12 times or more. Assess this claim using the median.
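One common non-parametric way to assess a claim about a median is the sign test. The following is a minimal sketch only, not the required method: the vector `broken` holds made-up values for illustration, and the commented `read.csv` line is an assumption about how the real file would be loaded. For the toy claim the alternative is one-sided (median greater than 3); for a claim such as "the median is 12" a two-sided alternative would be the natural choice.

```r
# Hypothetical weekly counts, for illustration only.
# With the real data, something like: broken <- read.csv("Toys.csv")$Broken
broken <- c(2, 5, 4, 3, 6, 1, 4, 5, 3, 7, 4, 2)
m0 <- 3  # hypothesized median taken from the claim

# Sign test: observations equal to m0 are dropped, the rest are counted
above <- sum(broken > m0)
below <- sum(broken < m0)
n_eff <- above + below

# H0: median = 3 versus HA: median > 3.
# Under H0, the count above m0 follows Binomial(n_eff, 0.5).
sign_test <- binom.test(above, n_eff, p = 0.5, alternative = "greater")
sign_test$p.value
```

A small p-value would support the owner's claim that the median exceeds 3; the result should still be written up in sentences rather than pasted as R output.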
The data is found in the file Drug.csv, with the following columns:
Column 1: Relief: The number of hours of relief provided.
Column 2: Groups: Either DrugA or DrugB.
The study was comparing the hours of pain relief for two common over-the-counter pain medications. Compare the two groups, being as specific as you can about your outcome.
The data is found in the file soil.csv, with the following columns:
Column 1: condition: gap (the soil was under an opening in the forest canopy), and growth (the soil was taken under heavy tree growth).
Column 2: respiration: The amount of carbon dioxide given off by each soil core (in mol CO2/g soil/hr).
Soil respiration is a measure of microbial activity in soil, which affects plant growth. Compare the two groups, being as specific as you can about your outcome.
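For a two-group comparison, one possible non-parametric technique is the Wilcoxon rank-sum (Mann-Whitney) test. The sketch below assumes a data frame with the Drug.csv column names `Relief` and `Groups`; the values are invented for illustration and are not the course data. Whether this test or another (for example, a permutation test) is appropriate depends on what the plots and summaries show.

```r
# Hypothetical data, for illustration only.
# With the real data, something like: drug <- read.csv("Drug.csv")
drug <- data.frame(
  Relief = c(4.1, 5.3, 3.8, 6.0, 4.7, 5.9, 6.8, 7.2, 5.5, 6.4),
  Groups = rep(c("DrugA", "DrugB"), each = 5)
)

# H0: the two relief distributions are the same
# HA: one distribution is shifted relative to the other (two-sided)
rank_sum <- wilcox.test(Relief ~ Groups, data = drug, conf.int = TRUE)

rank_sum$statistic  # W, computed for the first group level (DrugA)
rank_sum$p.value    # exact p-value, since there are no ties here
rank_sum$conf.int   # interval for the location shift (DrugA minus DrugB)
```

The same call pattern would apply to soil.csv with `respiration ~ condition`. With tied values, R falls back to a normal approximation and issues a warning, which is worth noting in the report.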
3. The Report Format
Each question should be a short report. This means you write in full sentences, and have the following sections for each question, while being as specific as you can about your results. There should not be any “copy and pasted” R code in this report. You must format the results you get from R.
I. Introduction. State the question you are trying to answer, why it is a question of interest (why might we be interested in the answer), and what statistical technique you are going to use. This must be a non-parametric technique.
II. Summary of your data (and only the data you are using for the question). This should include things like plots (histograms, boxplots) including the interpretation of the plots, and summary values such as sample means and standard deviations. This is where you should justify which non-parametric technique you are using. An R handout is available online for graphing and summaries of various data types.
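As a rough sketch of the kind of summary described in section II, the base R calls below produce the usual numerical summaries and plots. The vector `x` contains made-up values for illustration, and the commented `read.csv` line is an assumption about how a real column would be loaded.

```r
# Hypothetical values, for illustration only.
# With the real data, something like: x <- read.csv("Toys.csv")$Broken
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Numerical summaries
x_mean   <- mean(x)
x_sd     <- sd(x)       # sample standard deviation (n - 1 denominator)
x_median <- median(x)
x_iqr    <- IQR(x)

# Plots: shape and skew from the histogram, outliers from the boxplot
hist(x, main = "Toys broken per week", xlab = "Toys broken")
boxplot(x, horizontal = TRUE, xlab = "Toys broken")
```

A clearly skewed histogram or visible outliers in the boxplot is the sort of evidence that can be cited when justifying a median-based, non-parametric technique.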
III. Analysis. Report back confidence intervals, test statistics, p-values, nulls and alternatives, etc. You may use tables here, but be sure that you organize your work. Remember to write your results in full sentences where possible.
IV. Interpretation. State your conclusion, and any inference that you may draw from your corresponding tests or confidence intervals. These should all be in terms of your problem.
V. Conclusion. Summarize briefly your findings. Here you do not have to reiterate your numeric values, but summarize all relevant conclusions.
Your report should be in the following format:
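For a median question, one distribution-free confidence interval uses order statistics with binomial probabilities. The sketch below illustrates that construction under assumptions: `x` is made-up data, and the 95% level is chosen only for the example. The interval is conservative, so its actual coverage is at least the nominal level.

```r
# Hypothetical values, for illustration only
x <- c(2, 5, 4, 3, 6, 1, 4, 5, 3, 7, 4, 2)
n <- length(x)
alpha <- 0.05

# qbinom returns the smallest d with P(B <= d) >= alpha / 2, B ~ Binomial(n, 0.5),
# so P(B <= d - 1) < alpha / 2 and the interval below covers at least 1 - alpha
d <- qbinom(alpha / 2, n, 0.5)
stopifnot(d >= 1)  # otherwise n is too small for this confidence level

# Interval from the d-th smallest to the d-th largest observation
x_sorted <- sort(x)
ci <- c(lower = x_sorted[d], upper = x_sorted[n + 1 - d])

# Actual coverage of this interval
coverage <- 1 - 2 * pbinom(d - 1, n, 0.5)
ci
coverage
```

Reporting the actual coverage alongside the nominal level makes the conservativeness explicit.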
ii. A title page including your name/s, the name of the class, and the name of your instructor (me).
iii. Treat each question as a small, standalone report. Then staple them together at the end.
iv. Double-sided pages.
v. An appendix of your R code used to produce the results. Do not include R code in the body of your report. For example, your project should be put together in the following order (stapled):
• Cover page
• Parts I-V for first question
• Parts I-V for second question
• Code appendix
Feel free to make your cover page “unique” so that it is easy to find when I hand them back.
Notice: your project will be graded as a group effort (if you have two people). This means that you are responsible for your own work and your partner's work. I will not assign two different grades to one project.