Quick-R: Probability Plots



Probability Plots

This section describes creating probability plots in R, both for didactic purposes and for data analysis.

Probability Plots for Teaching and Demonstration

When I was a college professor teaching statistics, I used to have to draw normal distributions by hand. They always came out looking like bunny rabbits. What can I say?


R makes it easy to draw probability distributions and demonstrate statistical concepts.


Some of the more common probability distributions available in R are given below.



distribution         R name
Cauchy               cauchy
Gamma                gamma
Hypergeometric       hyper
Lognormal            lnorm
Negative Binomial    nbinom
Normal               norm
Student t            t
Tukey                tukey





For a comprehensive list, see Statistical Distributions on the R wiki. The functions available for each distribution follow this format:



dname( )    density or probability function
pname( )    cumulative distribution function
qname( )    quantile function
rname( )    random deviates

For example, pnorm(0) = 0.5 (the area under the standard normal curve to the left of zero). qnorm(0.9) = 1.28 (1.28 is approximately the 90th percentile of the standard normal distribution). rnorm(100) generates 100 random deviates from a standard normal distribution.

Each function has parameters specific to that distribution. For example, rnorm(100, mean=50, sd=10) generates 100 random deviates from a normal distribution with mean 50 and standard deviation 10.

You can use these functions to demonstrate various aspects of probability distributions. Two common examples are given below.
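As a quick check of the d/p/q/r naming convention before the larger plotting examples, the following sketch exercises each prefix in turn. The seed value and the choice of a Binomial(10, 0.5) example are illustrative assumptions, not part of the original tutorial.

```r
# d: density at a point; for the standard normal, dnorm(0) equals 1/sqrt(2*pi)
d0 <- dnorm(0)

# p and q are inverses of each other: qnorm(pnorm(z)) returns z
z <- 1.5
z_back <- qnorm(pnorm(z))

# For a discrete distribution, d gives a probability mass:
# P(X = 3) for X ~ Binomial(size = 10, prob = 0.5)
p3 <- dbinom(3, size = 10, prob = 0.5)

# p accumulates those masses: P(X <= 3)
p_le3 <- pbinom(3, size = 10, prob = 0.5)

# r: random deviates (the seed is arbitrary, set only for reproducibility)
set.seed(42)
draws <- rnorm(1000, mean = 50, sd = 10)
summary(draws)
```

The same four prefixes apply to every distribution name in the table above, for example dgamma( ), pgamma( ), qgamma( ), and rgamma( ).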

# Display the Student's t distributions with various
# degrees of freedom and compare to the normal distribution

x <- seq(-4, 4, length=100)
hx <- dnorm(x)

degf <- c(1, 3, 8, 30)
colors <- c("red", "blue", "darkgreen", "gold", "black")
labels <- c("df=1", "df=3", "df=8", "df=30", "normal")

# draw the standard normal curve as a dashed line
plot(x, hx, type="l", lty=2, xlab="x value",
  ylab="Density", main="Comparison of t Distributions")

# overlay one t density per degrees-of-freedom value
for (i in 1:4){
  lines(x, dt(x,degf[i]), lwd=2, col=colors[i])
}

legend("topright", inset=.05, title="Distributions",
  labels, lwd=2, lty=c(1, 1, 1, 1, 2), col=colors)


# Children's IQ scores are normally distributed with a
# mean of 100 and a standard deviation of 15. What
# proportion of children are expected to have an IQ between
# 80 and 120?

mean <- 100; sd <- 15
lb <- 80; ub <- 120

x <- seq(-4,4,length=100)*sd + mean
hx <- dnorm(x,mean,sd)

plot(x, hx, type="n", xlab="IQ Values", ylab="",
  main="Normal Distribution", axes=FALSE)

# shade the region between the lower and upper bounds
i <- x >= lb & x <= ub
lines(x, hx)
polygon(c(lb,x[i],ub), c(0,hx[i],0), col="red")

# the shaded area is the difference of two cumulative probabilities
area <- pnorm(ub, mean, sd) - pnorm(lb, mean, sd)
result <- paste("P(",lb,"< IQ <",ub,") =",
   signif(area, digits=3))
mtext(result,3)
axis(1, at=seq(40, 160, 20), pos=0)
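The shaded area in the IQ example can be cross-checked without plotting. The sketch below computes the same probability three ways: from the cumulative distribution function, by numerically integrating the density, and by standardizing to z-scores. The variable names mu and sigma are illustrative choices, used here to avoid masking the mean( ) and sd( ) functions.

```r
mu <- 100; sigma <- 15
lb <- 80; ub <- 120

# 1. Difference of cumulative probabilities, as in the plot above
area_cdf <- pnorm(ub, mu, sigma) - pnorm(lb, mu, sigma)

# 2. Numerical integration of the density between the bounds;
#    integrate() passes the extra arguments (mean, sd) on to dnorm
area_num <- integrate(dnorm, lower = lb, upper = ub,
                      mean = mu, sd = sigma)$value

# 3. Standardize the bounds and use the standard normal
area_z <- pnorm((ub - mu) / sigma) - pnorm((lb - mu) / sigma)

c(cdf = area_cdf, integrate = area_num, z = area_z)
```

All three values should agree up to numerical error, which makes this a convenient way to show students why the area under the density curve and the cumulative distribution function carry the same information.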

For a comprehensive view of probability plotting in R, see Vincent Zoonekynd's Probability Distributions.

Fitting Distributions

There are several methods of fitting distributions in R. Here are some options.

You can use the qqnorm( ) function to create a Quantile-Quantile plot evaluating the fit of sample data to the normal distribution. More generally, the qqplot( ) function plots the quantiles of two data sets against each other, so supplying a sample (or a set of quantiles) from any theoretical distribution gives a Quantile-Quantile plot against that distribution.

# Q-Q plots
par(mfrow=c(1,2))

# create sample data
x <- rt(100, df=3)

# normal fit
qqnorm(x); qqline(x)

# t(3Df) fit
qqplot(rt(1000,df=3), x, main="t(3) Q-Q Plot",
   ylab="Sample Quantiles")
abline(0,1)
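The t(3) panel above compares the sample with a second random sample, so the reference quantiles vary from run to run. A variation, sketched here as one possible alternative rather than the tutorial's own method, is to use exact theoretical quantiles obtained from qt( ) evaluated at the plotting positions returned by ppoints( ). The seed is an arbitrary assumption for reproducibility.

```r
set.seed(1)
x <- rt(100, df = 3)
n <- length(x)

# theoretical t(3) quantiles at n evenly spaced probability points
theo <- qt(ppoints(n), df = 3)

# qqplot sorts both arguments and plots them against each other
qqplot(theo, x, main = "t(3) Q-Q Plot (theoretical quantiles)",
       xlab = "Theoretical Quantiles", ylab = "Sample Quantiles")
abline(0, 1)  # reference line: points near it indicate a good fit
```

Swapping qt( ) for another quantile function (for example qgamma( ) or qlnorm( ) with suitable parameters) gives the corresponding Q-Q plot for that distribution.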

The fitdistr( ) function in the MASS package provides maximum-likelihood fitting of univariate distributions. The format is fitdistr(x, densityfunction) where x is the sample data and densityfunction is one of the following: "beta", "cauchy", "chi-squared", "exponential", "f", "gamma", "geometric", "log-normal", "lognormal", "logistic", "negative binomial", "normal", "Poisson", "t" or "weibull".

# Estimate parameters assuming log-Normal distribution

# create some sample data
x <- rlnorm(100)

# estimate parameters
library(MASS)
fitdistr(x, "lognormal")
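The object returned by fitdistr( ) can also be stored and inspected. The sketch below, which assumes the MASS package is installed, extracts the estimates and their standard errors and overlays the fitted density on a histogram of the data. The seed, the number of histogram breaks, and the plot title are illustrative choices.

```r
library(MASS)

set.seed(123)
x <- rlnorm(100, meanlog = 0, sdlog = 1)

fit <- fitdistr(x, "lognormal")
fit$estimate   # named vector with elements meanlog and sdlog
fit$sd         # estimated standard errors of those parameters

# histogram on the density scale, with the fitted lognormal curve on top
hist(x, freq = FALSE, breaks = 20, main = "Fitted Lognormal", xlab = "x")
curve(dlnorm(x, meanlog = fit$estimate["meanlog"],
             sdlog = fit$estimate["sdlog"]),
      add = TRUE, lwd = 2)
```

With the defaults used to simulate the data, the estimates should land near meanlog = 0 and sdlog = 1, though the exact values depend on the random sample.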

Finally, R has a wide range of goodness-of-fit tests for evaluating whether it is reasonable to assume that a random sample comes from a specified theoretical distribution. These include chi-square, Kolmogorov-Smirnov, and Anderson-Darling.

For more details on fitting distributions, see Vito Ricci's Fitting Distributions with R. For general (non-R) advice, see Bill Huber's Fitting Distributions to Data.
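As a minimal sketch of one such test, the Kolmogorov-Smirnov test is available as ks.test( ) in base R. The example below tests a simulated sample against a correctly specified normal distribution and against a deliberately wrong exponential one. The seed, sample size, and parameter values are assumptions chosen for illustration.

```r
set.seed(2024)
x <- rnorm(200, mean = 50, sd = 10)

# null hypothesis: x comes from Normal(mean = 50, sd = 10)
ks_norm <- ks.test(x, "pnorm", mean = 50, sd = 10)

# null hypothesis: x comes from an exponential distribution with mean 50
ks_exp <- ks.test(x, "pexp", rate = 1/50)

ks_norm$p.value   # no systematic evidence against the normal here
ks_exp$p.value    # very small: the exponential is a poor description
```

Note that the reference parameters here are specified in advance. When they are estimated from the same data being tested, the standard Kolmogorov-Smirnov p-value is no longer valid and tends to be too large.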

To Practice

Try this interactive course on exploratory data analysis.

Copyright © 2017 Robert I. Kabacoff, Ph.D.

