Probability, Statistics, and Data: A Fresh Approach Using R (Chapman & Hall/CRC Texts in Statistical Science Series)
Authors: Darrin Speegle, Bryan Clair
This book is a fresh approach to a calculus-based first course in probability and statistics, using R throughout to give a central role to data and simulation.
The book introduces probability with Monte Carlo simulation as an essential tool. Simulation makes challenging probability questions quickly accessible and easily understandable. Mathematical approaches are included, using calculus when appropriate, but are always connected to experimental computations.
Using R and simulation gives a nuanced understanding of statistical inference. The impact of departure from assumptions in statistical tests is emphasized, quantified using simulations, and demonstrated with real data. The book compares parametric and non-parametric methods through simulation, allowing for a thorough investigation of testing error and power. The text builds R skills from the outset, allowing modern methods of resampling and cross validation to be introduced along with traditional statistical techniques.
Fifty-two data sets are included in the companion R package fosdata. Most of these data sets are from recently published papers, so you work with current, real data, which are often large and messy. Two central chapters use powerful tidyverse tools (dplyr, ggplot2, tidyr, stringr) to wrangle data and produce meaningful visualizations. Preliminary versions of the book have been used for five semesters at Saint Louis University, and the majority of the more than 400 exercises have been classroom tested.
The exercises in the book have been added to the free and open online homework system MyOpenMath (https://www.myopenmath.com/), which may be useful to instructors.
1. Data in R. 2. Probability. 3. Discrete Random Variables. 4. Continuous Random Variables. 5. Simulation of Random Variables. 6. Data Manipulation. 7. Data Visualization with ggplot. 8. Inference on the Mean. 9. Rank Based Tests. 10. Tabular Data. 11. Simple Linear Regression. 12. Analysis of Variance and Comparison of Multiple Groups. 13. Multiple Regression.
Darrin Speegle has 25 years of experience teaching probability and statistics at Saint Louis University, where he is a Professor and the Director of Data Science. He served as the program committee chair on the organizing team for useR! 2020 in St. Louis. His research has been supported by the National Science Foundation and the Simons Foundation.
Bryan Clair is the Chair of the Mathematics and Statistics Department at Saint Louis University. His research is in topology and combinatorics. His work writing mathematics for general audiences has appeared in the New York Times, Washington Post, Math Horizons, and the SF magazine Strange Horizons.
Publication date: 11-2021
17.8x25.4 cm
Keywords:
Data Set; CSV File; calculus based theory; Data Frame; data science; Random Variables; mathematical statistics; Binomial Random Variable; data wrangling; Standard Normal Random Variable; data visualization; Cumulative Distribution Function; simulations; Null Hypothesis; rstats; Prediction Interval; tidyverse; Normal Random Variable; Exponential Random Variables; Independent Uniform Random Variables; Geometric Random Variables; Young Men; Free Throw; Poisson Process; Wilcoxon Signed Rank Test; Wilcoxon Rank Sum Test; Q-Q Plot; Bill Depths; High Leverage Outliers; Chinstrap Penguins; ROC Curve; Benford's Law; Uniform Random Variable