Title: | Demonstrations for Teaching and Learning |
---|---|
Description: | Demonstration functions that can be used in a classroom to demonstrate statistical concepts, or on your own to better understand the concepts or the programming. |
Authors: | Greg Snow |
Maintainer: | Greg Snow <[email protected]> |
License: | Artistic-2.0 |
Version: | 2.13 |
Built: | 2024-11-13 03:43:12 UTC |
Source: | https://github.com/glsnow/teachingdemos |
This package provides various demonstrations that can be used in classes or by individuals to better learn statistical concepts and usage of R. Various utility functions are also included
Package: | TeachingDemos |
Type: | Package |
License: | Artistic-2.0 |
Demonstration functions in this package include:
ci.examp, run.ci.examp | Confidence Interval Examples |
clt.examp | Central Limit Theorem Example |
dice, plot.dice | Roll and Plot dice (possibly loaded) |
faces, faces2 | Chernoff face plots |
fagan.plot | Fagan plot for screening designs |
lattice.demo | The 3d slicing idea behind lattice/trellis graphics |
loess.demo | Interactive demo to show ideas of loess smooths |
mle.demo | Interactive demo of Maximum Likelihood Estimation |
plot.rgl.coin, plot.rgl.die | Animate flipping a coin or rolling a die |
power.examp | Demonstrate concepts of Power. |
put.points.demo | Add/move points on a plot and see the effect on correlation and regression. |
roc.demo | Interactive demo of ROC curves. |
rotate.cloud | Interactively rotate 3d plots. |
run.cor.examp | Show plots representing different correlations. |
run.hist.demo | Interactively change parameters for histograms. |
SensSpec.demo | Show relationship between Sensitivity, Specificity, Prevalence and PPV and NPV. |
TkApprox | Interactive linear interpolations of data. |
tkBrush | Brush points in a scatterplot matrix. |
TkSpline | Interactive spline interpolations of data. |
tree.demo | Interactively Recursive partition data (create trees). |
vis.binom | Plot various probability distributions and interactively change parameters. |
vis.boxcox | Interactively change lambda for Box Cox Transforms. |
z.test | Z-test similar to t.test for students who have not learned t tests yet. |
Pvalue.norm.sim | |
Pvalue.binom.sim | Simulate P-values to see how they are distributed. |
run.Pvalue.norm.sim | GUI for above. |
run.Pvalue.binom.sim | |
HWidentify | |
HTKidentify | Identify the point Hovered over with the mouse. |
vis.test | test a null hypothesis by comparing graphs. |
Utility functions include:
bct | Box-Cox Transforms. |
char2seed | set or create the random number seed using a character string |
clipplot | clip a plot to a rectangular region within the plot |
col2grey | convert colors to greyscale |
cnvrt.coords | Convert between the different coordinate systems |
dynIdentify | Scatterplot with point labels that can be dragged to a new position |
TkIdentify | Scatterplot with lables that can be dragged to new positions |
gp.plot gp.splot | send commands to gnuplot |
hpd | Highest Posterior Density intervals |
my.symbols | Create plots using user defined symbols. |
panel.my.symbols | Create lattice plots using user defined symbols. |
plot2script | Create a script file that recreates the current plot. |
shadowtext | plot text with contrasting shadow for better readability. |
squishplot | Set the margins so that a plot has a specific aspect ratio without large white space inside. |
spread.labs | Spread out coordinates so that labels do not overlap. |
subplot | create a plot inside of an existing plot. |
tkexamp | create plots that can have parameters adjusted interactively. |
triplot | Trilinear plot for 3 proportions. |
txtStart/etxtStart/wdtxtStart | Save commands and output to a text file (possibly for post processing with enscript). |
zoomplot | recreate the current plot with different x/y limits (zoom in out). |
Greg Snow [email protected]
The tkrplot package
ci.examp() clt.examp() clt.examp(5) plot.dice( expand.grid(1:6,1:6), layout=c(6,6) ) faces(rbind(1:3,5:3,3:5,5:7)) plot(1:10, 1:10) my.symbols( 1:10, 1:10, ms.polygram, n=1:10, inches=0.3 ) x <- seq(1,100) y <- rnorm(100) plot(x,y, type='b', col='blue') clipplot( lines(x,y, type='b', col='red'), ylim=c(par('usr')[3],0)) power.examp() power.examp(n=25) power.examp(alpha=0.1)
ci.examp() clt.examp() clt.examp(5) plot.dice( expand.grid(1:6,1:6), layout=c(6,6) ) faces(rbind(1:3,5:3,3:5,5:7)) plot(1:10, 1:10) my.symbols( 1:10, 1:10, ms.polygram, n=1:10, inches=0.3 ) x <- seq(1,100) y <- rnorm(100) plot(x,y, type='b', col='blue') clipplot( lines(x,y, type='b', col='red'), ylim=c(par('usr')[3],0)) power.examp() power.examp(n=25) power.examp(alpha=0.1)
Comparison operators that can be chained together into something like 0 %<% x %<% 1 instead of 0 < x && x < 1.
x %<% y x %<=% y
x %<% y x %<=% y
x , y
|
Values to compare |
These functions/operators allow chained inequalities. To specify that
you want the values between two values (say 0 and 1) you can use 0
%<% x %<% 1
rather than 0 < x \&\& x < 1
.
A logical vector is returned that can be used for subsetting like
<
, but the original values are included as attributes to be used
in additional comparisons.
This operator is not fully associative and has different precedence than
<
and <=
, so be careful with parentheses.
See the examples.
Greg Snow, [email protected]
x <- -3:3 -2 %<% x %<% 2 c( -2 %<% x %<% 2 ) x[ -2 %<% x %<% 2 ] x[ -2 %<=% x %<=% 2 ] x <- rnorm(100) y <- rnorm(100) x[ -1 %<% x %<% 1 ] range( x[ -1 %<% x %<% 1 ] ) cbind(x,y)[ -1 %<% x %<% y %<% 1, ] cbind(x,y)[ (-1 %<% x) %<% (y %<% 1), ] cbind(x,y)[ ((-1 %<% x) %<% y) %<% 1, ] cbind(x,y)[ -1 %<% (x %<% (y %<% 1)), ] cbind(x,y)[ -1 %<% (x %<% y) %<% 1, ] # oops 3 3
x <- -3:3 -2 %<% x %<% 2 c( -2 %<% x %<% 2 ) x[ -2 %<% x %<% 2 ] x[ -2 %<=% x %<=% 2 ] x <- rnorm(100) y <- rnorm(100) x[ -1 %<% x %<% 1 ] range( x[ -1 %<% x %<% 1 ] ) cbind(x,y)[ -1 %<% x %<% y %<% 1, ] cbind(x,y)[ (-1 %<% x) %<% (y %<% 1), ] cbind(x,y)[ ((-1 %<% x) %<% y) %<% 1, ] cbind(x,y)[ -1 %<% (x %<% (y %<% 1)), ] cbind(x,y)[ -1 %<% (x %<% y) %<% 1, ] # oops 3 3
Computes the Box-Cox transform of the data for a given value of lambda. Includes the scaling factor.
bct(y, lambda)
bct(y, lambda)
y |
Vector of data to be transformed. |
lambda |
Scalar exponent for transform (1 is linear, 0 is log). |
bct
computes the Box-Cox family of transforms: y = (y^lambda -
1)/(lambda*gm^(lambda-1)), where gm is the geometric mean of the y's.
returns log(y)*gm when lambda equals 0.
A vector of the same length as y with the corresponding transformed values.
Greg Snow [email protected]
vis.boxcox
, vis.boxcoxu
,
boxcox
in package MASS, other implementations in various
packages
y <- rlnorm(500, 3, 2) par(mfrow=c(2,2)) qqnorm(y) qqnorm(bct(y,1/2)) qqnorm(bct(y,0)) hist(bct(y,0))
y <- rlnorm(500, 3, 2) par(mfrow=c(2,2)) qqnorm(y) qqnorm(bct(y,1/2)) qqnorm(bct(y,0)) hist(bct(y,0))
Plot a calendar of the specified year or month. Monthly calendars can have additional information (text/plots) added to the individual cells.
cal(month, year)
cal(month, year)
month |
The month for the calendar, if ommitted will do a yearly calendar,
can either be a number from 1 to 12 or a character string that will
be matched (using |
year |
The year for the calendar. If ommitted and |
This function plots on the current (or default) graphics device a
yearly or monthly calendar. It tries to guess what you want, if both
year
and month
are ommitted then it will plot the
current month. If month
is an integer greater than 12 and no
year
is specified then
that value will be used as the year for a yearly calendar. The
month
can be either an integer from 1 to 12 or a character
string that will be matched against month.name
using
pmatch
.
Each day of the monthly calendar is a plotting frame that can be
added to using stardard low level functions, the coordinates of the
plotting region (the entire box) are from 0 to 1 in both
dimensions. The updateusr
function can be used to
change the coordinates. The return from the function (when creating
a monthly calendar) can be used to select the day.
Nothing is returned when a whole year calendar is created. When the month calendar is created a function is returned invisibly that if passed an integer corresponding to a day of the month will set the graphics parameters so the corresponding day in the calendar becomes the current plotting figure. See the examples below.
Greg Snow, [email protected]
Sys.time
, as.POSIXlt
, par
, updateusr
cal(2011) cal('May') setday <- cal(11, 2011) setday(3) text(0.5,0.5, 'Some\nCentered\nText') setday(8) text(1,1,'Top Right',adj=c(1,1)) setday(18) text(0,0,'Bottom Left', adj=c(0,0) ) setday(21) tmp.x <- runif(25) tmp.y <- rnorm(25, tmp.x, .1) mrgn.x <- 0.04*diff(range(tmp.x)) mrgn.y <- 0.04*diff(range(tmp.y)) updateusr( 0:1, 0:1, range(tmp.x)+c(-1,1)*mrgn.x, range(tmp.y)+c(-1,1)*mrgn.y) points(tmp.x, tmp.y) setday(30) tmp <- hist(rnorm(100), plot=FALSE) updateusr( 0:1, 0:1, range(tmp$breaks), range(tmp$counts*1.1,0) ) lines(tmp)
cal(2011) cal('May') setday <- cal(11, 2011) setday(3) text(0.5,0.5, 'Some\nCentered\nText') setday(8) text(1,1,'Top Right',adj=c(1,1)) setday(18) text(0,0,'Bottom Left', adj=c(0,0) ) setday(21) tmp.x <- runif(25) tmp.y <- rnorm(25, tmp.x, .1) mrgn.x <- 0.04*diff(range(tmp.x)) mrgn.y <- 0.04*diff(range(tmp.y)) updateusr( 0:1, 0:1, range(tmp.x)+c(-1,1)*mrgn.x, range(tmp.y)+c(-1,1)*mrgn.y) points(tmp.x, tmp.y) setday(30) tmp <- hist(rnorm(100), plot=FALSE) updateusr( 0:1, 0:1, range(tmp$breaks), range(tmp$counts*1.1,0) ) lines(tmp)
This function creates a seed for the random number generator from a character string. Character strings can be based on student names so that every student has a different random sample, but the teacher can generate the same datasets.
char2seed(x, set = TRUE, ...)
char2seed(x, set = TRUE, ...)
x |
A character string |
set |
Logical, should the seed be set or just returned |
... |
Additional parameters passed on to |
Simulations or other situations call for the need to have repeatable random numbers, it is easier to remember a word or string than a number, so this function converts words or character strings to an integer and optionally sets the seed based on this.
Teachers can assign students to generate a random dataset using their name to seed the rng, this way each student will have a different dataset, but the teacher can generate the same set of data to check values.
Any characters other than letters (a-zA-Z) or digits (0-9) will be silently removed. This function is not case sensitive, so "ABC" and "abc" will generate the same seed.
This is a many to one function, so it is possible to find different words that generate the same seed, but this is unlikely by chance alone.
This returns an integer (but mode numeric) to use as a seed for the
RNG. If set
is true then it is returned invisibly.
Greg Snow [email protected]
char2seed('Snow') x <- rnorm(100) rnorm(10) tmp <- char2seed('Snow',set=FALSE) set.seed(tmp) y <- rnorm(100) all.equal(x,y) # should be true
char2seed('Snow') x <- rnorm(100) rnorm(10) tmp <- char2seed('Snow',set=FALSE) set.seed(tmp) y <- rnorm(100) all.equal(x,y) # should be true
Prints out the details of the computations involved in a chi-squared test on a table. Includes the expected values and the chi-squared contribution of each cell.
chisq.detail(tab)
chisq.detail(tab)
tab |
Matrix or table to be analyzed |
This function prints out the input table along with the expected value for each cell under the null hypothesis. It also prints out the chi-squared contribution of each cell in the same pattern as the table. This shows the computations involved and one rule of thumb is to look for these values that are greater than 4 as a post-hoc analysis.
This function is used primarily for its side effect of printing the results, but does return invisibly a list with the following components:
obs |
A matrix of the observed values, same as tab. |
expected |
A matrix of the expected values under the null hypothesis. |
chi.table |
A matrix of the chi-squared contributions of each cell. |
chi2 |
The chi-squared test statistic. |
Greg Snow, [email protected]
~put references to the literature/web site here ~ Moore, bps
chisq.test
,loglin
,
xtabs
, table
, prop.table
,
CrossTable
from the gmodels package.
chisq.detail(HairEyeColor[,,1]) chisq.detail(HairEyeColor[,,2])
chisq.detail(HairEyeColor[,,1]) chisq.detail(HairEyeColor[,,2])
Generate reps
samples from a normal distribution then compute
and plot confidence intervals for each sample along with information
about the population to demonstrate confidence intervals. Optionally
change the confidence level using a Tk slider.
ci.examp(mean.sim = 100, sd = 10, n = 25, reps = 50, conf.level = 0.95, method = "z", lower.conf = (1 - conf.level)/2, upper.conf = 1 - (1 - conf.level)/2) run.ci.examp(reps = 100, seed, method="z", n=25)
ci.examp(mean.sim = 100, sd = 10, n = 25, reps = 50, conf.level = 0.95, method = "z", lower.conf = (1 - conf.level)/2, upper.conf = 1 - (1 - conf.level)/2) run.ci.examp(reps = 100, seed, method="z", n=25)
mean.sim |
The mean of the population. |
sd |
The standard deviation of the population. |
n |
The sample size for each sample. |
reps |
The number of samples/intervals to create. |
conf.level |
The confidence level of the intervals. |
method |
'z', 't', or 'both', should the intervals be based on the normal, the t, or both distributions. |
lower.conf |
Quantile for lower confidence bound. |
upper.conf |
Quantile for upper confidence bound. |
seed |
The seed to use for the random number generation. |
These functions demonstrate the concept of confidence intervals by taking multiple samples from a known normal distribution and calculating a confidence interval for each sample and plotting the interval relative to the true mean. Intervals that contain the true mean will be plotted in black and those that do not include the true mean will be plotted in different colors.
The method
argument determines the type of interval: 'z' will
use the normal distribution and the known population standard
deviation, 't' will use the t distribution and the sample standard
deviations, 'both' will compute both for each sample for easy
comparison (it is best to reduce reps
to about 25 when using
'both').
The optional arguments lower.conf
and upper.conf
can be
used to plot non-symmetric or 1 sided confidence intervals.
The function run.ci.examp
also creates a Tk slider that will
allow you to interactively change the confidence level and replot the
intervals to show how the interval widths change with the confidence
level.
These functions are run solely for the side effect of plotting the intervals, there is no meaningfull return value.
Greg Snow [email protected]
ci.examp() if(interactive()) { run.ci.examp() } # 1 sided confidence intervals ci.examp(lower.conf=0, upper.conf=0.95) # non-symmetric intervals ci.examp(lower.conf=0.02, upper.conf=0.97)
ci.examp() if(interactive()) { run.ci.examp() } # 1 sided confidence intervals ci.examp(lower.conf=0, upper.conf=0.95) # non-symmetric intervals ci.examp(lower.conf=0.02, upper.conf=0.97)
Clip plotting to a rectangular region that is a subset of the plotting area
clipplot(fun, xlim = par("usr")[1:2], ylim = par("usr")[3:4])
clipplot(fun, xlim = par("usr")[1:2], ylim = par("usr")[3:4])
fun |
The function or expression to do the plotting. |
xlim |
A vector of length 2 representing the x-limits to clip plotting to, defaults to the entire width of the plotting region. |
ylim |
A vector of length 2 representing the y-limits to clip the plot to, defaults to the entire height of the plotting region. |
This function resets the active region for plotting to a rectangle within the plotting area and turns on clipping so that any points, lines, etc. that are outside the rectange are not plotted.
A side effect of this function is a call to the box()
command,
it is called with a fully transparent color so if your graphics device
honors transparency then you will probably see no effect.
Nothing meaningful is returned
This function abuses some of the intent of what par(plt=...) is
supposed to mean. In R2.7.0 and beyond there is a new funcntion
clip
with the intended purpose of doing this in a more proper
manner (however as of my last test it is not working perfectly either,
so clipplot
will remain undepricated for now).
It uses some hacks to make sure that the clipping region is set, but it does this by plotting some tranparent boxes, therefore you should not use this on devices where tranparency is not supported (or you may see extra boxes).
Greg Snow [email protected]
par
, lines
, clip
in R2.7.0
and later
x <- seq(1,100) y <- rnorm(100) plot(x,y, type='b', col='blue') clipplot( lines(x,y, type='b', col='red'), ylim=c(par('usr')[3],0)) attach(iris) tmp <- c('red','green','blue') names(tmp) <- levels(Species) plot(Petal.Width,Petal.Length, col=tmp[Species]) for(s in levels(Species)){ clipplot( abline( lm(Petal.Length~Petal.Width, data=iris, subset=Species==s), col=tmp[s]), xlim=range(Petal.Width[Species==s])) } detach(iris)
x <- seq(1,100) y <- rnorm(100) plot(x,y, type='b', col='blue') clipplot( lines(x,y, type='b', col='red'), ylim=c(par('usr')[3],0)) attach(iris) tmp <- c('red','green','blue') names(tmp) <- levels(Species) plot(Petal.Width,Petal.Length, col=tmp[Species]) for(s in levels(Species)){ clipplot( abline( lm(Petal.Length~Petal.Width, data=iris, subset=Species==s), col=tmp[s]), xlim=range(Petal.Width[Species==s])) } detach(iris)
Takes samples of size n
from 4 different distributions and
plots histograms of the means along with a normal curve with matching
mean and standard deviation. Creating the plots for different values
of n
demonstrates the Central Limit Theorem.
clt.examp(n = 1, reps = 10000, nclass = 16, norm.param=list(mean=0,sd=1), gamma.param=list(shape=1, rate=1/3), unif.param=list(min=0,max=1), beta.param=list(shape1=0.35, shape2=0.25))
clt.examp(n = 1, reps = 10000, nclass = 16, norm.param=list(mean=0,sd=1), gamma.param=list(shape=1, rate=1/3), unif.param=list(min=0,max=1), beta.param=list(shape1=0.35, shape2=0.25))
n |
size of the individual samples |
reps |
number of samples to take from each distribution |
nclass |
number of bars in the histograms |
norm.param |
List with parameters passed to |
gamma.param |
List with parameters passed to |
unif.param |
List with parameters passed to |
beta.param |
List with parameters passed to |
The 4 distributions sampled from are a Normal with defaults mean 0 and standard deviation 1, a gamma with defaults shape 1 (exponential) and lambda 1/3 (mean = 3), a uniform distribution from 0 to 1 (default), and a beta distribution with default alpha 0.35 and beta 0.25 (U shaped left skewed).
The norm.param
, gamma.param
, unif.param
, and
beta.param
arguments can be used to change the parameters of
the generating distributions.
Running the function with n
=1 will show the populations. Run
the function again with n
at higher values to show that the
sampling distribution of the uniform quickly becomes normal and the
exponential and beta distributions eventually become normal (but much
slower than the uniform).
This function is run for its side effect of creating plots. It returns NULL invisibly.
Greg Snow [email protected]
clt.examp() clt.examp(5) clt.examp(30) clt.examp(50)
clt.examp() clt.examp(5) clt.examp(30) clt.examp(50)
Takes a set of coordinates in any of the 5 coordinate systems (usr, plt, fig, dev, or tdev) and returns the same points in all 5 coordinate systems.
cnvrt.coords(x, y = NULL, input = c("usr", "plt", "fig", "dev","tdev"))
cnvrt.coords(x, y = NULL, input = c("usr", "plt", "fig", "dev","tdev"))
x |
Vector, Matrix, or list of x coordinates (or x and y coordinates), NA's allowed. |
y |
y coordinates (if |
input |
Character scalar indicating the coordinate system of the input points. |
Every plot has 5 coordinate systems:
usr (User): the coordinate system of the data, this is shown by the tick marks and axis labels.
plt (Plot): Plot area, coordinates range from 0 to 1 with 0 corresponding to the x and y axes and 1 corresponding to the top and right of the plot area. Margins of the plot correspond to plot coordinates less than 0 or greater than 1.
fig (Figure): Figure area, coordinates range from 0 to 1 with 0 corresponding to the bottom and left edges of the figure (including margins, label areas) and 1 corresponds to the top and right edges. fig and dev coordinates will be identical if there is only 1 figure area on the device (layout, mfrow, or mfcol has not been used).
dev (Device): Device area, coordinates range from 0 to 1 with 0 corresponding to the bottom and left of the device region within the outer margins and 1 is the top and right of the region withing the outer margins. If the outer margins are all set to 0 then tdev and dev should be identical.
tdev (Total Device): Total Device area, coordinates range from 0 to 1 with 0 corresponding to the bottom and left edges of the device (piece of paper, window on screen) and 1 corresponds to the top and right edges.
A list with 5 components, each component is a list with vectors named x and y. The 5 sublists are:
usr |
The coordinates of the input points in usr (User) coordinates. |
plt |
The coordinates of the input points in plt (Plot) coordinates. |
fig |
The coordinates of the input points in fig (Figure) coordinates. |
dev |
The coordinates of the input points in dev (Device) coordinates. |
tdev |
The coordinates of the input points in tdev (Total Device) coordinates. |
You must provide both x and y, but one of them may be NA
.
This function is now depricated with the new functions
grconvertX
and grconvertY
in R version 2.7.0 and beyond.
These new functions use the correct coordinate system names and have
more coordinate systems available, you should start using them instead.
Greg Snow [email protected]
par
specifically 'usr','plt', and 'fig'. Also
'xpd' for plotting outside of the plotting region and 'mfrow' and
'mfcol' for multi figure plotting. subplot
,
grconvertX
and grconvertY
in R2.7.0 and later
old.par <- par(no.readonly=TRUE) par(mfrow=c(2,2),xpd=NA) # generate some sample data tmp.x <- rnorm(25, 10, 2) tmp.y <- rnorm(25, 50, 10) tmp.z <- rnorm(25, 0, 1) plot( tmp.x, tmp.y) # draw a diagonal line across the plot area tmp1 <- cnvrt.coords( c(0,1), c(0,1), input='plt' ) lines(tmp1$usr, col='blue') # draw a diagonal line accross figure region tmp2 <- cnvrt.coords( c(0,1), c(1,0), input='fig') lines(tmp2$usr, col='red') # save coordinate of point 1 and y value near top of plot for future plots tmp.point1 <- cnvrt.coords(tmp.x[1], tmp.y[1]) tmp.range1 <- cnvrt.coords(NA, 0.98, input='plt') # make a second plot and draw a line linking point 1 in each plot plot(tmp.y, tmp.z) tmp.point2 <- cnvrt.coords( tmp.point1$dev, input='dev' ) arrows( tmp.y[1], tmp.z[1], tmp.point2$usr$x, tmp.point2$usr$y, col='green') # draw another plot and add rectangle showing same range in 2 plots plot(tmp.x, tmp.z) tmp.range2 <- cnvrt.coords(NA, 0.02, input='plt') tmp.range3 <- cnvrt.coords(NA, tmp.range1$dev$y, input='dev') rect( 9, tmp.range2$usr$y, 11, tmp.range3$usr$y, border='yellow') # put a label just to the right of the plot and # near the top of the figure region. text( cnvrt.coords(1.05, NA, input='plt')$usr$x, cnvrt.coords(NA, 0.75, input='fig')$usr$y, "Label", adj=0) par(mfrow=c(1,1)) ## create a subplot within another plot (see also subplot) plot(1:10, 1:10) tmp <- cnvrt.coords( c( 1, 4, 6, 9), c(6, 9, 1, 4) ) par(plt = c(tmp$dev$x[1:2], tmp$dev$y[1:2]), new=TRUE) hist(rnorm(100)) par(fig = c(tmp$dev$x[3:4], tmp$dev$y[3:4]), new=TRUE) hist(rnorm(100)) par(old.par)
old.par <- par(no.readonly=TRUE) par(mfrow=c(2,2),xpd=NA) # generate some sample data tmp.x <- rnorm(25, 10, 2) tmp.y <- rnorm(25, 50, 10) tmp.z <- rnorm(25, 0, 1) plot( tmp.x, tmp.y) # draw a diagonal line across the plot area tmp1 <- cnvrt.coords( c(0,1), c(0,1), input='plt' ) lines(tmp1$usr, col='blue') # draw a diagonal line accross figure region tmp2 <- cnvrt.coords( c(0,1), c(1,0), input='fig') lines(tmp2$usr, col='red') # save coordinate of point 1 and y value near top of plot for future plots tmp.point1 <- cnvrt.coords(tmp.x[1], tmp.y[1]) tmp.range1 <- cnvrt.coords(NA, 0.98, input='plt') # make a second plot and draw a line linking point 1 in each plot plot(tmp.y, tmp.z) tmp.point2 <- cnvrt.coords( tmp.point1$dev, input='dev' ) arrows( tmp.y[1], tmp.z[1], tmp.point2$usr$x, tmp.point2$usr$y, col='green') # draw another plot and add rectangle showing same range in 2 plots plot(tmp.x, tmp.z) tmp.range2 <- cnvrt.coords(NA, 0.02, input='plt') tmp.range3 <- cnvrt.coords(NA, tmp.range1$dev$y, input='dev') rect( 9, tmp.range2$usr$y, 11, tmp.range3$usr$y, border='yellow') # put a label just to the right of the plot and # near the top of the figure region. text( cnvrt.coords(1.05, NA, input='plt')$usr$x, cnvrt.coords(NA, 0.75, input='fig')$usr$y, "Label", adj=0) par(mfrow=c(1,1)) ## create a subplot within another plot (see also subplot) plot(1:10, 1:10) tmp <- cnvrt.coords( c( 1, 4, 6, 9), c(6, 9, 1, 4) ) par(plt = c(tmp$dev$x[1:2], tmp$dev$y[1:2]), new=TRUE) hist(rnorm(100)) par(fig = c(tmp$dev$x[3:4], tmp$dev$y[3:4]), new=TRUE) hist(rnorm(100)) par(old.par)
This is a list of matricies where each matrix represents a design for drawing lines on the face of a coin.
data(coin.faces)
data(coin.faces)
The format is: List of 4 $ qh: num [1:57, 1:2] 0.387 0.443 0.515 0.606 0.666 ... $ qt: num [1:62, 1:2] 0.862 0.873 0.875 0.857 0.797 ... $ H : num [1:28, 1:2] 0.503 0.506 0.548 0.548 0.500 ... $ T : num [1:18, 1:2] 0.506 0.520 0.569 0.626 0.626 ...
The current options are a capitol "H", a capitol "T", a design representing George Washingtons head traced from the heads of a US quarter, and a design representing an eagle traced from the tails of a US quarter.
The tracings here have pretty much exhausted my artistic ability, if you can do better, please do, I will be happy to include it in future versions. It would also be nice to include some designs representing faces of non-US coins, please submit your contributions (the design should fit within a circle inscribed within the unit square).
## Not run: plot.rgl.coin(heads=coin.faces$H, tails=coin.faces$T) ## End(Not run)
## Not run: plot.rgl.coin(heads=coin.faces$H, tails=coin.faces$T) ## End(Not run)
Convert colors to grey/grayscale so that you can see how your plot will look after photocopying or printing to a non-color printer.
col2grey(cols) col2gray(cols)
col2grey(cols) col2gray(cols)
cols |
Colors to convert. |
converts colors to greyscale using the formula grey=0.3*red + 0.59*green + 0.11*blue. This allows you to see how your color plot will approximately look when printed on a non-color printer or photocopied.
A vector of colors (greys) corresponding to the input colors.
Greg Snow [email protected]
grey
, col2rgb
, dichromat package
par(mfcol=c(2,2)) tmp <- 1:3 names(tmp) <- c('red','green','blue') barplot( tmp, col=c('red','green','blue') ) barplot( tmp, col=col2gray( c('red','green','blue') ) ) barplot( tmp, col=c('red','#008100','#3636ff') ) barplot( tmp, col=col2grey( c('red','#008100','#3636ff') ) )
par(mfcol=c(2,2)) tmp <- 1:3 names(tmp) <- c('red','green','blue') barplot( tmp, col=c('red','green','blue') ) barplot( tmp, col=col2gray( c('red','green','blue') ) ) barplot( tmp, col=c('red','#008100','#3636ff') ) barplot( tmp, col=col2grey( c('red','#008100','#3636ff') ) )
This function creates a scatterplot of the data, then adds colored rectangles between the points and the mean of x and y to represent the idea of the correlation coefficient.
cor.rect.plot(x, y, corr = TRUE, xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), col = c("#ff000055", "#0000ff55"), ...)
cor.rect.plot(x, y, corr = TRUE, xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), col = c("#ff000055", "#0000ff55"), ...)
x |
The |
y |
The |
corr |
Should the standardized axes (right and top) show the values divided by the standard deviation (TRUE, which shows correlation ideas) or not (FALSE, shows covariance idea) |
xlab |
The label for the |
ylab |
The label for the |
col |
A vector of length 2 with the colors to use for the fill of the rectangles, the 1st value will be used for "positive" rectangles and the 2nd value will be used for the "negative" rectangles. |
... |
Possible further arguments, currently ignored |
This will create a scatterplot of the data, draw refrence lines at
the mean of x
and the mean of y
, then draw rectangles
from the mean point to the data points. The right and top axes will
show the centered (and possibly scaled if corr=TRUE
) values.
The idea is that the correlation/covariance is based on summing the area of the "positive" rectangles and subtracting the sum of the areas of the "negative" rectangles (then dividing by n-1). If the positive and negative areas are about the same then the correlation/covariance is near 0, if there is more area in the positive rectangles then the correlation/covariance will be positive.
This function returns an invisible NULL, it is run for its side effects.
Greg Snow, [email protected]
## low correlation x <- rnorm(25) y <- rnorm(25) cor(x,y) cor.rect.plot(x,y) ## Positive correlation x <- rnorm(25) y <- x + rnorm(25,3, .5) cor(x,y) cor.rect.plot(x,y) ## negative correlation x <- rnorm(25) y <- rnorm(25,10,1.5) - x cor(x,y) cor.rect.plot(x,y) ## zero correlation but a definite relationship x <- -5:5 y <- x^2 cor(x,y) cor.rect.plot(x,y)
## low correlation x <- rnorm(25) y <- rnorm(25) cor(x,y) cor.rect.plot(x,y) ## Positive correlation x <- rnorm(25) y <- x + rnorm(25,3, .5) cor(x,y) cor.rect.plot(x,y) ## negative correlation x <- rnorm(25) y <- rnorm(25,10,1.5) - x cor(x,y) cor.rect.plot(x,y) ## zero correlation but a definite relationship x <- -5:5 y <- x^2 cor(x,y) cor.rect.plot(x,y)
Simulate and optionally plot rolls of dice.
dice(rolls = 1, ndice = 2, sides = 6, plot.it = FALSE, load = rep(1, sides)) ## S3 method for class 'dice' plot(x, ...)
dice(rolls = 1, ndice = 2, sides = 6, plot.it = FALSE, load = rep(1, sides)) ## S3 method for class 'dice' plot(x, ...)
rolls |
Scalar, the number of times to roll the dice. |
ndice |
Scalar, the number of dice to roll each time. |
sides |
Scalar, the number of sides per die. |
plot.it |
Logical, Should the results be plotted. |
load |
Vector of length |
x |
Data frame, return value from |
... |
Additional arguments passed to lattice plotting function. |
Simulates the rolling of dice. By default it will roll 2 dice 1 time
and the dice will be fair. Internally the sample
function is
used and the load option is passed to sample. load
is not
required to sum to 1, but the elements will be divided by the sum of
all the values.
A data frame with rolls
rows and ndice
columns
representing the results from rolling the dice.
If only 1 die is rolled, then the return value will be a vector.
If plot.it
is TRUE, then the return value will be invisible.
If the plot function is used or if plot.it
is TRUE, then a
plot will be created on the current graphics device.
Greg Snow [email protected]
# 10 rolls of 4 fair dice dice(10,4, plot.it=TRUE) # or plot(dice(10,4)) # or tmp <- dice(10,4) plot(tmp) # a loaded die table(tmp <- dice(100,1,plot.it=TRUE, load=6:1 ) ) colMeans(tmp) # Efron's dice ed <- list( rep( c(4,0), c(4,2) ), rep(3,6), rep( c(6,2), c(2,4) ), rep( c(5,1), c(3,3) ) ) tmp <- dice( 10000, ndice=4 ) ed.out <- sapply(1:4, function(i) ed[[i]][ tmp[[i]] ] ) mean(ed.out[,1] > ed.out[,2]) mean(ed.out[,2] > ed.out[,3]) mean(ed.out[,3] > ed.out[,4]) mean(ed.out[,4] > ed.out[,1]) ## redo De Mere's question demere1 <- dice(10000,4) demere2 <- dice(10000,24,sides=36) mean(apply( demere1, 1, function(x) 6 %in% x )) mean(apply( demere2, 1, function(x) 36 %in% x)) plot(demere1[1:10,]) ## plot all possible combinations of 2 dice plot.dice( expand.grid(1:6,1:6), layout=c(6,6) )
# 10 rolls of 4 fair dice dice(10,4, plot.it=TRUE) # or plot(dice(10,4)) # or tmp <- dice(10,4) plot(tmp) # a loaded die table(tmp <- dice(100,1,plot.it=TRUE, load=6:1 ) ) colMeans(tmp) # Efron's dice ed <- list( rep( c(4,0), c(4,2) ), rep(3,6), rep( c(6,2), c(2,4) ), rep( c(5,1), c(3,3) ) ) tmp <- dice( 10000, ndice=4 ) ed.out <- sapply(1:4, function(i) ed[[i]][ tmp[[i]] ] ) mean(ed.out[,1] > ed.out[,2]) mean(ed.out[,2] > ed.out[,3]) mean(ed.out[,3] > ed.out[,4]) mean(ed.out[,4] > ed.out[,1]) ## redo De Mere's question demere1 <- dice(10000,4) demere2 <- dice(10000,24,sides=36) mean(apply( demere1, 1, function(x) 6 %in% x )) mean(apply( demere2, 1, function(x) 36 %in% x)) plot(demere1[1:10,]) ## plot all possible combinations of 2 dice plot.dice( expand.grid(1:6,1:6), layout=c(6,6) )
Takes an integer or vector of integers and returns a vector, list, or matrix of the individual digits (decimal) that make up that number.
digits(x, n = NULL, simplify = FALSE)
digits(x, n = NULL, simplify = FALSE)
x |
An integer or vector of integers (if not integer, the fractional part will be ignored) |
n |
The number of digits to be returned, this can be used to place 0's
in front of smaller numbers. If this is less than the number of
digits then the last |
simplify |
Should |
This function transforms an integer (or real ignoring the fractional part) into the decimal digits that make of the decimal representation of the number using modular mathematics rather than converting to character, splitting the string, and converting back to numeric.
If x
is of length 1 then a vector of the digits is returned.
If x
is a vector and simplify
is FALSE
then a list
of vectors is returned, one element for each element of x
.
If x
is a vector and simplify
is TRUE
then a
matrix with 1 column for each element of x
.
Greg Snow [email protected]
digits( 12345 ) digits( 567, n=5 ) x <- c(1, 23, 456, 7890) digits(x) digits(x, simplify=TRUE)
digits( 12345 ) digits( 567, n=5 ) x <- c(1, 23, 456, 7890) digits(x) digits(x, simplify=TRUE)
Create a quick dotchart of 1 or 2 datasets. These dotcharts are a poor man's histogram, not the trellis dotplot.
dots(x,...) dots2(x, y, colx = "green", coly = "blue", lab1 = deparse(substitute(x)), lab2 = deparse(substitute(y)),...)
dots(x,...) dots2(x, y, colx = "green", coly = "blue", lab1 = deparse(substitute(x)), lab2 = deparse(substitute(y)),...)
x |
Vector, data to be plotted (should be rounded). |
y |
Vector, second dataset to be plotted. |
colx |
Color of points for |
coly |
Color of points for |
lab1 |
Label for |
lab2 |
Label for |
... |
Additional arguments passed to plotting functions. |
These functions create basic dotcharts that are quick "back of the envelope" approximations to histograms. Mainly intended for demonstration.
No meaninful value. These functions are run for the side effect of creating a plot.
Greg Snow [email protected]
dotplot
in the lattice package, hist
dots( round( rnorm(50, 10,3) ) ) dots2( round( rnorm(20, 10,3) ), round(rnorm(20,12,2)) )
dots( round( rnorm(50, 10,3) ) ) dots2( round( rnorm(20, 10,3) ), round(rnorm(20,12,2)) )
These functions create a scatterplot of your points and place labels for the points on them. You can then use the mouse to click and drag the labels to new positions with a line stretching between the point and label.
dynIdentify(x, y, labels = seq_along(x), corners = cbind(c(-1, 0, 1, -1, 1, -1, 0, 1), c(1, 1, 1, 0, 0, -1, -1, -1)), ...) TkIdentify(x, y, labels=seq_along(x), hscale=1.75, vscale=1.75, corners = cbind( c(-1,0,1,-1,1,-1,0,1), c(1,1,1,0,0,-1,-1,-1) ),...)
dynIdentify(x, y, labels = seq_along(x), corners = cbind(c(-1, 0, 1, -1, 1, -1, 0, 1), c(1, 1, 1, 0, 0, -1, -1, -1)), ...) TkIdentify(x, y, labels=seq_along(x), hscale=1.75, vscale=1.75, corners = cbind( c(-1,0,1,-1,1,-1,0,1), c(1,1,1,0,0,-1,-1,-1) ),...)
x |
x-values to plot |
y |
y-values to plot |
labels |
Labels for the points, defaults to a sequence of integers |
corners |
2 column matrix of locations where the line can attach to the label, see below |
hscale , vscale
|
Scaling passed to tkrplot |
... |
Additional parameters passed to |
These functions create a scatterplot of the x and y points with the
labels (from the argument above) plotted on top. You can then use the
mouse to click and drag the labels to new locations. The Tk version
shows the labels being dragged, dynIdentify
does not show the
labels being dragged, but the label will jump to the new location as
soon as you release the mouse button.
The corners
argument is a 2 column matrix that gives the
allowable points at which the line from the point can attach to the
label (so the line does not cover thelabel). The first column
represents the x-coordinates and the 2nd column the y-coordinates. A
1 represents the right/top of the label, A -1 is the left/bottom and a
0 is the center. The default values allow attachments at the 4
corners and the centers of the 4 sides of the rectangle bounding the label.
A list of lists with the coordinates of the final positions of the labels and the line ends.
The dynIdentify
function only works on windows,
TkIdentify
should work on any platform with tcltk.
Greg Snow, [email protected]
if(interactive()) { tmp <- TkIdentify(state.x77[,'Frost'], state.x77[,'Murder'], state.abb) ### now move the labels ### recreate the graph on the current device plot( state.x77[,'Frost'], state.x77[,'Murder'], xlab='Frost', ylab='Frost') text( tmp$labels$x, tmp$labels$y, state.abb ) segments( state.x77[,'Frost'], state.x77[,'Murder'], tmp$lineends$x, tmp$lineends$y ) }
if(interactive()) { tmp <- TkIdentify(state.x77[,'Frost'], state.x77[,'Murder'], state.abb) ### now move the labels ### recreate the graph on the current device plot( state.x77[,'Frost'], state.x77[,'Murder'], xlab='Frost', ylab='Frost') text( tmp$labels$x, tmp$labels$y, state.abb ) segments( state.x77[,'Frost'], state.x77[,'Murder'], tmp$lineends$x, tmp$lineends$y ) }
Data from 46 consecutive days on weather variables used to estimate amount of evaporation from the soil.
data(evap)
data(evap)
A data frame with 46 observations on the following 14 variables.
Obs
Observation number
Month
Month (6-June, 7-July)
day
Day of the month
MaxST
Maximum Soil Temperature
MinST
Minimum Soil Temperature
AvST
Average (integrated) Soil Temperature
MaxAT
Maximum Air Temperature
MinAT
Minimum Air Temperature
AvAT
Average (integrated) Air Temperature
MaxH
Maximum Relative Humidity
MinH
Minimum Relative Humidity
AvH
Average (integrated) Relative Humidity
Wind
Total Wind
Evap
Total evoporation from the soil
The idea of the data is to predict the amount of evaporation given the other variables. Note that the "average" values are scaled differently from the others, this is more an area under the curve measure representing the total/average value.
This dataset was entered by hand from a low quality copy of the paper. If you find any typos, please e-mail them to the package maintainer.
Freund, R.J. (1979) Multicollinearity etc., Some "New" Examples. Proceedings of the Statistical Computing Section, *4*, 111-112.
data(evap) pairs(evap[,-c(1,2,3)], panel=panel.smooth) ## maybe str(evap) ; plot(evap) ...
data(evap) pairs(evap[,-c(1,2,3)], panel=panel.smooth) ## maybe str(evap) ; plot(evap) ...
faces represent the rows of a data matrix by faces
faces(xy, which.row, fill = FALSE, nrow, ncol, scale = TRUE, byrow = FALSE, main, labels)
faces(xy, which.row, fill = FALSE, nrow, ncol, scale = TRUE, byrow = FALSE, main, labels)
xy |
|
which.row |
defines a permutation of the rows of the input matrix |
fill |
|
nrow |
number of columns of faces on graphics device |
ncol |
number of rows of faces |
scale |
|
byrow |
|
main |
title |
labels |
character strings to use as names for the faces |
The features paramters of this implementation are:
1-height of face,
2-width of face,
3-shape of face,
4-height of mouth,
5-width of mouth,
6-curve of smile,
7-height of eyes,
8-width of eyes,
9-height of hair,
10-width of hair,
11-styling of hair,
12-height of nose,
13-width of nose,
14-width of ears,
15-height of ears. For details look at the literate program of faces
a plot of faces is created on the graphics device, no numerical results
version 12/2003
H. P. Wolf
Chernoff, H. (1973): The use of faces to represent statistiscal assoziation, JASA, 68, pp 361–368. The smooth curves are computed by an algorithm found in Ralston, A. and Rabinowitz, P. (1985): A first course in numerical analysis, McGraw-Hill, pp 76ff. http://www.uni-bielefeld.de/fakultaeten/wirtschaftswissenschaften : S/R - functions : faces
—
faces(rbind(1:3,5:3,3:5,5:7)) data(longley) faces(longley[1:9,]) set.seed(17) faces(matrix(sample(1:1000,128,),16,8),main="random faces") if(interactive()){ tke1 <- rep( list(list('slider',from=0,to=1,init=0.5,resolution=0.1)), 15) names(tke1) <- c('FaceHeight','FaceWidth','FaceShape','MouthHeight', 'MouthWidth','SmileCurve','EyesHeight','EyesWidth','HairHeight', 'HairWidth','HairStyle','NoseHeight','NoseWidth','EarWidth','EarHeight') tkfun1 <- function(...){ tmpmat <- rbind(Min=0,Adjust=unlist(list(...)),Max=1) faces(tmpmat, scale=FALSE) } tkexamp( tkfun1, list(tke1), plotloc='left', hscale=2, vscale=2 ) }
faces(rbind(1:3,5:3,3:5,5:7)) data(longley) faces(longley[1:9,]) set.seed(17) faces(matrix(sample(1:1000,128,),16,8),main="random faces") if(interactive()){ tke1 <- rep( list(list('slider',from=0,to=1,init=0.5,resolution=0.1)), 15) names(tke1) <- c('FaceHeight','FaceWidth','FaceShape','MouthHeight', 'MouthWidth','SmileCurve','EyesHeight','EyesWidth','HairHeight', 'HairWidth','HairStyle','NoseHeight','NoseWidth','EarWidth','EarHeight') tkfun1 <- function(...){ tmpmat <- rbind(Min=0,Adjust=unlist(list(...)),Max=1) faces(tmpmat, scale=FALSE) } tkexamp( tkfun1, list(tke1), plotloc='left', hscale=2, vscale=2 ) }
Plot Chernoff Faces of the dataset, rows represent subjects/observations, columns represent variables.
faces2(mat, which = 1:ncol(mat), labels = rownames(mat), nrows = ceiling(nrow(mat)/ncols), ncols = ceiling(sqrt(nrow(mat))), byrow = TRUE, scale = c("columns", "all", "center", "none"), fill = c(0.5, 0.5, 1, 0.5, 0.5, 0.3, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1, 0.5), ...)
faces2(mat, which = 1:ncol(mat), labels = rownames(mat), nrows = ceiling(nrow(mat)/ncols), ncols = ceiling(sqrt(nrow(mat))), byrow = TRUE, scale = c("columns", "all", "center", "none"), fill = c(0.5, 0.5, 1, 0.5, 0.5, 0.3, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1, 0.5), ...)
mat |
Matrix containing the data to plot. |
which |
Which columns correspond to which features (see details). |
labels |
Labels for the individual faces |
nrows |
Number of rows in the graphical layout |
ncols |
Number of columns in the graphical layout |
byrow |
Logical, should the faces be drawn rowwise or columnwise. |
scale |
Character, how should the data be scaled. |
fill |
What value to use for features not assocaiated with a column of data. |
... |
Additional arguments passed on to plotting functions. |
The features are: 1 Width of center 2 Top vs. Bottom width (height of split) 3 Height of Face 4 Width of top half of face 5 Width of bottom half of face 6 Length of Nose 7 Height of Mouth 8 Curvature of Mouth (abs < 9) 9 Width of Mouth 10 Height of Eyes 11 Distance between Eyes (.5-.9) 12 Angle of Eyes/Eyebrows 13 Circle/Ellipse of Eyes 14 Size of Eyes 15 Position Left/Right of Eyeballs/Eyebrows 16 Height of Eyebrows 17 Angle of Eyebrows 18 Width of Eyebrows
The face plotting routine needs the data values to be between 0 and 1
(inclusive). The scale
option controls how scaling will be
done on mat
: "columns" scales each column to range from 0 to
1, "all" scales the entire dataset to vary from 0 to 1, "center"
scales each column so that the mean of the column becomes 0.5 and all
other values are between 0 and 1, and "none" does no scaling assuming
that the data has already been scaled.
This function is run for its side effect of plotting and does not return anything.
If you choose to not scale the data and any data values are outside of the 0 to 1 range, then strange things may happen.
This function is based on code found on the internet, the good things come from there, any problems are likely due to my (Greg's) tweaking.
Original code by ; current implementation by Greg Snow [email protected]
Chernoff, H. (1973): The use of faces to represent statistiscal assoziation, JASA, 68, pp 361–368.
faces2(matrix( runif(18*10), nrow=10), main='Random Faces') if(interactive()){ tke2 <- rep( list(list('slider',from=0,to=1,init=0.5,resolution=0.1)), 18) names(tke2) <- c('CenterWidth','TopBottomWidth','FaceHeight','TopWidth', 'BottomWidth','NoseLength','MouthHeight','MouthCurve','MouthWidth', 'EyesHeight','EyesBetween','EyeAngle','EyeShape','EyeSize','EyeballPos', 'EyebrowHeight','EyebrowAngle','EyebrowWidth') tkfun2 <- function(...){ tmpmat <- rbind(Min=0,Adjust=unlist(list(...)),Max=1) faces2(tmpmat, scale='none') } tkexamp( tkfun2, list(tke2), plotloc='left', hscale=2, vscale=2 ) }
faces2(matrix( runif(18*10), nrow=10), main='Random Faces') if(interactive()){ tke2 <- rep( list(list('slider',from=0,to=1,init=0.5,resolution=0.1)), 18) names(tke2) <- c('CenterWidth','TopBottomWidth','FaceHeight','TopWidth', 'BottomWidth','NoseLength','MouthHeight','MouthCurve','MouthWidth', 'EyesHeight','EyesBetween','EyeAngle','EyeShape','EyeSize','EyeballPos', 'EyebrowHeight','EyebrowAngle','EyebrowWidth') tkfun2 <- function(...){ tmpmat <- rbind(Min=0,Adjust=unlist(list(...)),Max=1) faces2(tmpmat, scale='none') } tkexamp( tkfun2, list(tke2), plotloc='left', hscale=2, vscale=2 ) }
These functions create a plot showing the relationship between the prior probability, the LR (combination of sensitivity and specificity), and the posterior probability.
fagan.plot(probs.pre.test, LR, test.result="+") plotFagan(hscale=1.5, vscale=1.5, wait=FALSE) plotFagan2(hscale=1.5, vscale=1.5, wait=FALSE) plotFagan.old() plotFagan2.old()
fagan.plot(probs.pre.test, LR, test.result="+") plotFagan(hscale=1.5, vscale=1.5, wait=FALSE) plotFagan2(hscale=1.5, vscale=1.5, wait=FALSE) plotFagan.old() plotFagan2.old()
probs.pre.test |
The prior probability |
LR |
the likelihood ratio (sensitivity/(1-specificity)) |
test.result |
either '+' or '-' indicating whether you want the probability of the event or of not seeing the event |
hscale |
Horizontal scale, passed to |
vscale |
Vertical scale, passed to |
wait |
Should the R session wait for the window to close |
When Bayes theorem is expressed in terms of log-odds it turns out that
the posterior log-odds are a linear function of the prior log-odds
and the log likelihood ratio. These functions plot an axis on the left
with the prior log-odds, an axis in the middle representing the log
likelihood ratio and an axis on the right representing the posterior
log-odds. A line is then drawn from the prior probability on the left
through the LR in the center and extended to the posterior probability
on the right. The fagan.plot
creates the plot based on input
to the function. The plotFagan
and plotFagan2
functions set up Tk windows with sliders representing the possible
inputs and show how the plot and the posterior probability changes
when you adjust the inputs. The plotFagan
function creates
sliders for the prior probability and the LR, while the
plotFagan2
function replaces the LR slider with 2 sliders for
the sensitivity and specificity.
More detail on the plots and the math behind them can be found at the websites below.
The old functions are run for their side effects and do not return
a meaningful value. If wait
is FALSE then NULL is returned, if
wait
is TRUE, then a list with the current values is returned.
Guazzetti Stefano and Greg Snow [email protected]
Fagan TJ. Nomogram for Bayes theorem. N Engl J Med 1975;293(5):257-61. https://ebm.bmj.com/content/6/6/164.full
slider
fagan.plot(0.8, 2) fagan.plot(0.8, 0.95/(1-0.90) ) if(interactive()) { plotFagan() plotFagan2() }
fagan.plot(0.8, 2) fagan.plot(0.8, 0.95/(1-0.90) ) if(interactive()) { plotFagan() plotFagan2() }
These functions allow you to open a connection to a gnuplot process, send data and possibly other information to gnuplot for it to plot, then close gnuplot and clean up temporary files and variables. These functions are alpha level at best, use at your own risk.
gp.open(where='c:/progra~1/GnuPlot/bin/pgnuplot.exe') gp.close(pipe=gpenv$gp) gp.send(cmd='replot',pipe=gpenv$gp) gp.plot(x,y,type='p',add=FALSE, title=deparse(substitute(y)),pipe=gpenv$gp) gp.splot(x,y,z, add=FALSE, title=deparse(substitute(z)), pipe=gpenv$gp, datafile=tempfile())
gp.open(where='c:/progra~1/GnuPlot/bin/pgnuplot.exe') gp.close(pipe=gpenv$gp) gp.send(cmd='replot',pipe=gpenv$gp) gp.plot(x,y,type='p',add=FALSE, title=deparse(substitute(y)),pipe=gpenv$gp) gp.splot(x,y,z, add=FALSE, title=deparse(substitute(z)), pipe=gpenv$gp, datafile=tempfile())
where |
Path to GnuPlot Executable |
pipe |
The pipe object connected to GnuPlot (returned from
|
cmd |
Text string, the command to be sent verbatim to the GnuPlot process |
x |
The |
y |
the |
z |
the |
type |
Either 'p' or 'l' for plotting points or lines |
add |
Logical, should the data be added to the existing plot or start a new plot |
title |
The title or legend entry |
datafile |
The file to store the data in for transfer to gnuplot |
These functions provide a basic interface to the GnuPlot program (you
must have GnuPlot installed (separate install)), gp.open
runs
GnuPlot and establishes a pipe connection, gp.close
sends a
quite command to gnuplot and cleans up temporary variables and files,
gp.send
sends a command to the GnuPlot process verbatim, and
gp.plot
sends data and commands to the process to create a
standard scatterplot or line plot.
gp.open
returns and invisible copy of the pipe connection
object (to pass to other functions, but don't do this because it
doesn't work right yet).
The other 3 functions don't return anything meaningful. All functions are run for their side effects.
These functions create some temporary files and 2 temporary global
variables (.gp and .gp.tempfiles), running gp.close
will clean
these up (so use it).
These functions are still alpha level.
Greg Snow [email protected]
## Not run: x <- 1:10 y <- 3-2*x+x*x+rnorm(10) gp.open() gp.plot(x,y) gp.send('replot 3-2*x+x**2') tmp <- expand.grid(x=1:10, y=1:10) tmp <- transform(tmp, z=(x-5)*(y-3)) gp.splot(tmp$x, tmp$y, tmp$z) gp.close() ## End(Not run)
## Not run: x <- 1:10 y <- 3-2*x+x*x+rnorm(10) gp.open() gp.plot(x,y) gp.send('replot 3-2*x+x**2') tmp <- expand.grid(x=1:10, y=1:10) tmp <- transform(tmp, z=(x-5)*(y-3)) gp.splot(tmp$x, tmp$y, tmp$z) gp.close() ## End(Not run)
Compute the Highest Posterior Density Interval (HPD) from an inverse density function (hpd) or a vector of realizations of the distribution (emp.hpd).
hpd(posterior.icdf, conf=0.95, tol=0.00000001,...) emp.hpd(x, conf=0.95, lower, upper)
hpd(posterior.icdf, conf=0.95, tol=0.00000001,...) emp.hpd(x, conf=0.95, lower, upper)
posterior.icdf |
Function, the inverse cdf of the posterior distribution (usually a function whose name starts with 'q'). |
x |
A vector of realizations from the posterior distribution. |
conf |
Scalar, the credible level desired. |
tol |
Scalar, the tolerance for |
... |
Additional arguments to |
lower |
Optional lower bound on support of x. |
upper |
Optional upper bound on support of x. |
These functions compute the highest posterior density intervals
(sometimes called minimum length confidence intervals) for a Bayesian
posterior distribution. The hpd
function is used when you have
a function representing the inverse cdf (the common case with
conjugate families). The emp.hpd
function is used when you
have realizations of the posterior (when you have results from an MCMC
run).
A vector of length 2 with the lower and upper limits of the interval.
These functions assume that the posterior distribution is unimodal, they compute only 1 interval, not the set of intervals that are appropriate for multimodal distributions.
Greg Snow [email protected]
hdr
in the hdrcde package.
hpd(qbeta, shape1=50, shape2=250) tmp <- rbeta(10000, 50, 250) emp.hpd(tmp)
hpd(qbeta, shape1=50, shape2=250) tmp <- rbeta(10000, 50, 250) emp.hpd(tmp)
These functions create a scatterplot then you Hover the mouse pointer over a point in the plot and it will show an id label for that point.
HWidentify(x, y, label = seq_along(x), lab.col="darkgreen", pt.col="red", adj=c(0,0), clean=TRUE, xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), ...) HTKidentify(x, y, label = seq_along(x), lab.col="darkgreen", pt.col="red", adj=c(0,0), xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), ...)
HWidentify(x, y, label = seq_along(x), lab.col="darkgreen", pt.col="red", adj=c(0,0), clean=TRUE, xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), ...) HTKidentify(x, y, label = seq_along(x), lab.col="darkgreen", pt.col="red", adj=c(0,0), xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), ...)
x |
x-coordinates to plot |
y |
y-coordinates to plot |
label |
Labels to show for each point |
lab.col |
The color to plot the labels |
pt.col |
The color of the highlighting point |
adj |
The adjustment of the labels relative to the cursor point. The default places the label so that its bottom left corner is at the curser, values below 0 or greater than 1 will move the label to not touch the cursor. |
clean |
Logical value, should any labels on the plot be removed at the end of the plotting. |
xlab |
Label for x-axis |
ylab |
Label for y-axis |
... |
additional arguments passed through to plot |
This is an alternative to the identify
function. The label
only shows up for the point currently closest to the mouse pointer.
When the mouse pointer moves closer to a different point, the label
changes to the one for the new point. The currently labeled point is
also highlighted. HWidentify only works on windows, HTKidentify
requires the tkrplot package.
These functions are run for their side effects, nothing meaningful is returned.
Greg Snow, [email protected]
if( interactive() ){ tmpx <- runif(25) tmpy <- rnorm(25) HTKidentify(tmpx,tmpy, LETTERS[1:25], pch=letters) }
if( interactive() ){ tmpx <- runif(25) tmpy <- rnorm(25) HTKidentify(tmpx,tmpy, LETTERS[1:25], pch=letters) }
Plot 1 panel from an xyplot, and optionally a 3d graph highligting the shown points, then allow you to interactively set the conditioning set of data to see the effects and help you better understand how xyplot works.
lattice.demo(x, y, z, show3d = TRUE)
lattice.demo(x, y, z, show3d = TRUE)
x |
The x variable to plot (numeric). |
y |
The y variable to plot (numeric). |
z |
The variable to condition on (numeric). |
show3d |
Logical, should a 3D cloud be shown as well. |
This function is intended to for demonstration purposes to help
understand what is happening in an xyplot
(lattice). When you
run the demo it will create a single panel from a conditioned
xyplot
and optionally a 3D cloud with the points included in
the panel highlighted. The function then opens a tcl/tk dialog box
that allows you to choose which points are included in the panel
(based on the conditioning variable). You can choose the center and
width of the shingle displayed and the graph will update to show the
new selection.
The intent for this function is for a teacher to show a class how
lattice graphics take slices of a 3d plot and show each slice
seperately. Students could then work through some examples on their
own to better understand what functions like xyplot
are doing
automatically.
No meaningful return value, this function is run for the side effects.
Greg Snow [email protected]
xyplot
in lattice package
if(interactive()){ require(stats) lattice.demo(quakes$long, quakes$lat, quakes$depth) }
if(interactive()){ require(stats) lattice.demo(quakes$long, quakes$lat, quakes$depth) }
Data on the Growth of The Church of Jesus Christ of Latter-day Saints (commonly known as the Mormon church (https://www.churchofjesuschrist.org/comeuntochrist)).
data(ldsgrowth)
data(ldsgrowth)
A data frame with 179 observations on the following 6 variables.
Year
Year from 1830 to 2008
Members
Total number of Members
Wards
Number of Wards and Branches (individual congregations)
Stakes
Number of Stakes (a group of wards/branches)
Missions
Number of Missions
Missionaries
Number of Missionaries called
The data comes from the church records and are as of December 31st of each year.
The church was officially organized on 6 April 1830 (hence the starting year of 1830).
The Missionaries
column represents the number of missionaries
called each year. Missionaries generally serve for about 2 years.
Deseret News 2010 Church News Almanac
data(ldsgrowth) with(ldsgrowth, plot(Year, log(Members)))
data(ldsgrowth) with(ldsgrowth, plot(Year, log(Members)))
Creates a scatterplot with a loess fit, then interactively shows the
window and case weights used to create the curve at the selected value
of x
.
loess.demo(x, y, span = 2/3, degree = 1, nearest = FALSE, xlim = numeric(0), ylim = numeric(0), verbose = FALSE)
loess.demo(x, y, span = 2/3, degree = 1, nearest = FALSE, xlim = numeric(0), ylim = numeric(0), verbose = FALSE)
x |
The |
y |
The |
span |
The relative width of the window, passed on to |
degree |
Degree of polynomial to use (0, 1, or 2), passed on to |
nearest |
Logical, should predictions be made at the point where you clicked (FALSE), or at the nearest x value of the data to where you clicked (TRUE). |
xlim |
Limits of the Horizonal axis. |
ylim |
Limits of the Vertical axis. |
verbose |
If true then print the x coordinate being predicted. |
This function demonstrates the underlying calculations of loess curves.
Given x
and y
vectors it will create a scatterplot and
add 2 loess fit lines (one using straight loess smooth with linear
interpolation and one that does a spline interpolation of the loess
fit).
The function then waits for the user to click on the plot. The
function then shows the window of points (centered at the x
value clicked on) used in the weighting for predicting that point and
shows a circle around each point in the window where the area of the
circle is proportional to the weight of that point in the linear fit.
The function also shows the linear (or quadratic) fit used to predict
at the selected point.
The basic steps of the loess algorithm (as demonstrated by the function) is that to predict the y-value for a given x-value the computer:
1. Find all the points within a window around the x-value (the width
of the window is based on the parameter span
).
2. Weight the points in the window with points nearest the x-value
having the highest weight.
3. Fit a weighted linear (quadratic) line to the points in the
window.
4. Use the y-value of the fitted line (curve) at the x-value to give
loess prediction at that x-value.
Clicking on another point in the graph will replot with the new situation.
Right click and select 'stop' to end the demonstration.
This function does not return anything, it is run purely for its side effects.
Greg Snow [email protected]
if(interactive()){ data(ethanol, package='lattice') attach(ethanol) loess.demo(E, NOx) # now click a few places, right click to end loess.demo(E, NOx, span=1.5) loess.demo(E, NOx, span=0.25) loess.demo(E, NOx, degree=0) loess.demo(E, NOx, degree=2) detach() }
if(interactive()){ data(ethanol, package='lattice') attach(ethanol) loess.demo(E, NOx) # now click a few places, right click to end loess.demo(E, NOx, span=1.5) loess.demo(E, NOx, span=0.25) loess.demo(E, NOx, degree=0) loess.demo(E, NOx, degree=2) detach() }
This function graphically shows log likelihoods for a set of data and the normal distribution and allows you to interactively change the parameter estimates to see the effect on the log likelihood.
mle.demo(x = rnorm(10, 10, 2), start.mean = mean(x) - start.sd, start.sd = 1.2 * sqrt(var(x)))
mle.demo(x = rnorm(10, 10, 2), start.mean = mean(x) - start.sd, start.sd = 1.2 * sqrt(var(x)))
x |
A vector of data (presumably from a normal distribution). |
start.mean |
The initial value for estimating the mean. |
start.sd |
The initial value for estimating the standard deviation. |
The function creates a plot with 3 panels: the top panel shows a
normal curve based on the current values of the mean and standard
deviation along with a vertical line for each point in x
(the
product of the heights of these lines is the likelihood, the sum of
the logs of their heights is the log likelihood).
The lower 2 plots show the profiles of the mean and standard deviation. The y-axis is the likelihoods of the parameters tried so far, and the x-axes are the mean and standard deviation tried. The point corresponding to the current parameter estimates will be solid red.
A Tk slider box is also created that allows you to change the current estimates of the mean and standard deviation to show the effect on the log likelihood and find the maximum likelihood estimate.
This function is run for its side effects and returns NULL.
Greg Snow [email protected]
fitdistr
in package MASS, mle
in
package stats4, slider
if(interactive()){ mle.demo() m <- runif(1, 50,100) s <- runif(1, 1, 10) x <- rnorm(15, m, s) mm <- mean(x) ss <- sqrt(var(x)) ss2 <- sqrt(var(x)*11/12) mle.demo(x) # now find the mle from the graph and compare it to mm, ss, ss2, m, and s }
if(interactive()){ mle.demo() m <- runif(1, 50,100) s <- runif(1, 1, 10) x <- rnorm(15, m, s) mm <- mean(x) ss <- sqrt(var(x)) ss2 <- sqrt(var(x)*11/12) mle.demo(x) # now find the mle from the graph and compare it to mm, ss, ss2, m, and s }
These functions/data matricies are examples of what can be passed as
the symb
argument in the my.symbols
function. They are
provided both to be used for some common symbols and as examples of
what can be passed as the symb
argument.
ms.polygram(n, r=1, adj=pi/2, ...) ms.polygon(n, r=1, adj=pi/2, ...) ms.filled.polygon(n, r=1, adj=pi/2, fg=par('fg'), bg=par('fg'), ... ) ms.male ms.female ms.arrows(angle, r=1, adj=0.5, length=0.1, ...) ms.sunflowers(n,r=0.3,adj=pi/2, ...) ms.image(img, transpose=TRUE, ...) ms.face(features, ...)
ms.polygram(n, r=1, adj=pi/2, ...) ms.polygon(n, r=1, adj=pi/2, ...) ms.filled.polygon(n, r=1, adj=pi/2, fg=par('fg'), bg=par('fg'), ... ) ms.male ms.female ms.arrows(angle, r=1, adj=0.5, length=0.1, ...) ms.sunflowers(n,r=0.3,adj=pi/2, ...) ms.image(img, transpose=TRUE, ...) ms.face(features, ...)
n |
The number of sides for polygons and polygrams, the number of petals(lines) for sunflowers. |
r |
The radius of the enclosing circle for polygons and polygrams (1 means that it will pretty much fill the bounding square). For sunflowers this is the radius (relative to the inches square) of the inner circle. For arrows this controls the length of the arrow, a value of 2 means the length of the arrow will be the same as inches (but it may then stick out of the box if adj != 1). |
adj |
For polygons, polygrams, and sunflowers this is the angle in radians that the first corner/point will be. The default puts a corner/point straight up, this can be used to rotate the symbols. For arrows, this determines the positioning of the arrow, a value of 0 means the arrow will start at the x,y point and point away from it, 0.5 means the arrow will be centered at x,y and 1 means that the arrow will end (point at) x,y. |
fg , bg
|
Colors for the filled polygons. |
angle |
The angle in radians that the arrow will point. |
length |
The length of the arrow head (see
|
img |
A 3 dimensional array representing an image such as produced by the png or EBImage packages. |
transpose |
Should the image be tranposed, use TRUE for images imported using package png and FALSE for images imported using EBImage. |
features |
A list of data representing the features of
the faces, each element represents 1 face and the values need to be
scaled between 0 and 1, see |
... |
additional parameters that will be passed to plotting functions or be ignored. |
These functions/matricies can be passed as the symb
argument to
the my.symbols
function. The represent examples that can be
used to create your own symbols or may be used directly.
These functions either return a 2 column matrix of points to be passed
to lines
or NULL
.
Greg Snow [email protected]
my.symbols
, polygon
,
arrows
, lines
, faces
, also see
rasterImage
for an alternative to ms.image
plot(1:10,1:10) my.symbols(1:10,1:10, ms.polygram, n=1:10, r=seq(0.5,1,length.out=10), inches=0.3) my.symbols(1:10,1:10, ms.polygon, n=1:10, add=FALSE, inches=0.3) my.symbols(1:5, 5:1, ms.filled.polygon, add=FALSE, n=3:7, fg='green', bg=c('red','blue','yellow','black','white'), inches=0.3 ) my.symbols( 1:10, 1:10, ms.female, inches=0.3, add=FALSE) my.symbols( 1:10, 10:1, ms.male, inches=0.3, add=TRUE) plot(1:10, 1:10) my.symbols(1:10, 1:10, ms.arrows, angle=runif(10)*2*pi, inches=0.5, adj=seq(0,1,length.out=10), symb.plots=TRUE) my.symbols(1:10, 1:10, ms.sunflowers, n=1:10, inches=0.3, add=FALSE) if( require(png) ) { img <- readPNG(system.file("img", "Rlogo.png", package="png")) my.symbols( runif(10), runif(10), ms.image, MoreArgs=list(img=img), inches=0.5, symb.plots=TRUE, add=FALSE) } tmp.mtcars <- scale(mtcars, center=sapply(mtcars,min), scale=sapply(mtcars,function(x) diff(range(x))) ) tmp2.mtcars <- lapply( seq_len(nrow(tmp.mtcars)), function(i) tmp.mtcars[i,] ) my.symbols(mtcars$wt, mtcars$mpg, ms.face, inches=0.3, features=tmp2.mtcars, add=FALSE)
plot(1:10,1:10) my.symbols(1:10,1:10, ms.polygram, n=1:10, r=seq(0.5,1,length.out=10), inches=0.3) my.symbols(1:10,1:10, ms.polygon, n=1:10, add=FALSE, inches=0.3) my.symbols(1:5, 5:1, ms.filled.polygon, add=FALSE, n=3:7, fg='green', bg=c('red','blue','yellow','black','white'), inches=0.3 ) my.symbols( 1:10, 1:10, ms.female, inches=0.3, add=FALSE) my.symbols( 1:10, 10:1, ms.male, inches=0.3, add=TRUE) plot(1:10, 1:10) my.symbols(1:10, 1:10, ms.arrows, angle=runif(10)*2*pi, inches=0.5, adj=seq(0,1,length.out=10), symb.plots=TRUE) my.symbols(1:10, 1:10, ms.sunflowers, n=1:10, inches=0.3, add=FALSE) if( require(png) ) { img <- readPNG(system.file("img", "Rlogo.png", package="png")) my.symbols( runif(10), runif(10), ms.image, MoreArgs=list(img=img), inches=0.5, symb.plots=TRUE, add=FALSE) } tmp.mtcars <- scale(mtcars, center=sapply(mtcars,min), scale=sapply(mtcars,function(x) diff(range(x))) ) tmp2.mtcars <- lapply( seq_len(nrow(tmp.mtcars)), function(i) tmp.mtcars[i,] ) my.symbols(mtcars$wt, mtcars$mpg, ms.face, inches=0.3, features=tmp2.mtcars, add=FALSE)
This function draws symbols on a plot. It is similar to
the builtin symbols
function with the difference that it plots
symbols defined by the user rather than a prespecified set of
symbols.
my.symbols(x, y=NULL, symb, inches=1, xsize, ysize, add=TRUE, vadj=0.5, hadj=0.5, symb.plots=FALSE, xlab=deparse(substitute(x)), ylab=deparse(substitute(y)), main=NULL, xlim=NULL, ylim=NULL, linesfun=lines, ..., MoreArgs)
my.symbols(x, y=NULL, symb, inches=1, xsize, ysize, add=TRUE, vadj=0.5, hadj=0.5, symb.plots=FALSE, xlab=deparse(substitute(x)), ylab=deparse(substitute(y)), main=NULL, xlim=NULL, ylim=NULL, linesfun=lines, ..., MoreArgs)
x , y
|
The |
symb |
Either a matrix, list, or function defining the symbol to
be plotted. If it is a matrix or list it needs to be formatted that
it can be passed directly to the |
inches |
The size of the square containing the symbol in inches
(note: unlike |
xsize |
The width of the bounding box(s) of the symbols in the same
units as the |
ysize |
The height of the bounding box(s) of the symbols in the
same units as the |
add |
if 'add' is 'TRUE' then the symbols are added to the existing plot, otherwise a new plot is created. |
vadj , hadj
|
Numbers between 0 and 1 indicating how 'x' and 'y' specify the location of the symbol. The defaults center the symbol at x,y; 0 means put the bottom/left at x,y; and 1 means put the top/right of the symbol at x,y. |
symb.plots |
If |
xlab , ylab , main , xlim , ylim
|
If 'add' is 'FALSE' these are
passed to the |
linesfun |
The function to draw the lines if the function does not do its own drawing. The default is |
... |
Additional arguments will be replicated to the same length
as |
MoreArgs |
A list with any additional arguments to be passed to
the |
The symb
argument can be a 2 column matrix or a list with
components 'x' and 'y' that defines points on the interval [-1,1] that
will be connected with lines to draw the symbol. If you want a closed
polygon then be sure to replicate the 1st point as the last point.
If any point contains an NA then the line will not be drawn to or from that point. This can be used to create a symbol with disjoint parts that should not be connected.
If symb
is a function then it should include a '...' argument
along with any arguments to define the symbol. Any unmatched
arguments that end up in the '...' argument will be replicated to the
same length as 'x' (using the rep
function) then the values
will be passed one at a time to the symb
function. If
MoreArgs
is specified, the elements of it will also be passed
to symb
without modification. The symb
function can
either return a matrix or list with the points that will then be
passed to the lines
function (see above). Or the function can
call the plotting functions itself (set symb.plots
to TRUE).
High level plotting can be done (plot
, hist
, and other
functions), or low level plotting functions (lines
,
points
, etc) can be used; in this case they should add things
to a plot with 'x' and 'y' limits of -1 to 1.
The size of the symbols can be specified by using inches
in
which case the symbol will be set inside of squares whose sizes are
inches
size based on the plotting device. The size can also be
set using xsize
and/or ysize
which use the same units as
the x
and/or y
variables. If only one is specified then
the box will be square. If both are specified and they do not match
the aspect ratio of the plot then the bounding box will not be square
and the symbol will be distorted.
This function is run for its side effect of plotting, it returns an invisible NULL.
Since the '...' argument is passed to both lines
and
symb
, the symb
function should have a '...' argument so
that it will ignore any additional arguments.
Arguments such as 'type' can be passed through the '...' argument if you want the symbol made of something other than lines.
Plotting coordinates and sizes are based on the size of the device at the time the function is called. If you resize the device after plotting, all bets are off.
Currently missing values in x
or y
are not handled
well. It is best if remove all missing values first.
Greg Snow [email protected]
symbols
, subplot
,
mapply
, ms.polygram
, lines
# symb is matrix my.symbols( 1:10, runif(10), ms.male, add=FALSE, xlab='x', ylab='y', inches=0.3, col=c('blue','green'), xlim=c(0,11), ylim=c(-0.1,1.1)) my.symbols( (1:10)+0.5, runif(10), ms.female, add=TRUE, inches=0.3, col=c('red','green') ) # symb is function returning matrix plot(1:10, 1:10) my.symbols( 1:10, 1:10, ms.polygram, n=1:10, inches=0.3 ) # symb is plotting function # create a variation on monthplot fit <- lm( log(co2) ~ time(co2) ) fit.r <- resid(fit) x <- 1:12 y <- tapply(fit.r, cycle(co2), mean) tmp.r <- split( fit.r, cycle(co2) ) tmp.r <- lapply( tmp.r, function(x) x-mean(x) ) yl <- do.call('range',tmp.r) tmpfun <- function(w,data,ylim,...){ tmp <- data[[w]] plot(seq(along=tmp),tmp, type='l', xlab='',ylab='', axes=FALSE, ylim=ylim) lines(par('usr')[1:2], c(0,0), col='grey') } my.symbols(x,y, symb=tmpfun, inches=0.4, add=FALSE, symb.plots=TRUE, xlab='Month',ylab='Adjusted CO2', xlim=c(0.5,12.5), ylim=c(-0.012,0.012), w=1:12, MoreArgs=list(data=tmp.r,ylim=yl) ) # using xsize and ysize plot( 1:10, (1:10)*100, type='n', xlab='', ylab='' ) my.symbols( 5, 500, ms.polygon, n=250, inches=1.5 ) my.symbols( 5, 500, ms.polygon, n=250, xsize=2, col='blue' ) my.symbols( 5, 500, ms.polygon, n=250, ysize=200, col='green' ) my.symbols( 5, 500, ms.polygon, n=250, xsize=2, ysize=200, col='red' ) abline( v=c(4,6), col='grey' ) abline( h=c(400, 600), col='grey' ) # hand crafted hexagonal grid x1 <- seq(0, by=2*sqrt(3), length.out=10) y1 <- seq(0, by=3, length.out=10) mypoints <- expand.grid(x=x1, y=y1) mypoints[,1] <- mypoints[,1] + rep( c(0,sqrt(3)), each=10, length.out=100 ) plot(mypoints, asp=1, xlim=c(-2,35)) my.symbols(mypoints, symb=ms.filled.polygon, n=6, inches=par('pin')[1]/(diff(par('usr')[1:2]))*4, bg=paste('gray',1:100,sep=''), fg='green' )
# symb is matrix my.symbols( 1:10, runif(10), ms.male, add=FALSE, xlab='x', ylab='y', inches=0.3, col=c('blue','green'), xlim=c(0,11), ylim=c(-0.1,1.1)) my.symbols( (1:10)+0.5, runif(10), ms.female, add=TRUE, inches=0.3, col=c('red','green') ) # symb is function returning matrix plot(1:10, 1:10) my.symbols( 1:10, 1:10, ms.polygram, n=1:10, inches=0.3 ) # symb is plotting function # create a variation on monthplot fit <- lm( log(co2) ~ time(co2) ) fit.r <- resid(fit) x <- 1:12 y <- tapply(fit.r, cycle(co2), mean) tmp.r <- split( fit.r, cycle(co2) ) tmp.r <- lapply( tmp.r, function(x) x-mean(x) ) yl <- do.call('range',tmp.r) tmpfun <- function(w,data,ylim,...){ tmp <- data[[w]] plot(seq(along=tmp),tmp, type='l', xlab='',ylab='', axes=FALSE, ylim=ylim) lines(par('usr')[1:2], c(0,0), col='grey') } my.symbols(x,y, symb=tmpfun, inches=0.4, add=FALSE, symb.plots=TRUE, xlab='Month',ylab='Adjusted CO2', xlim=c(0.5,12.5), ylim=c(-0.012,0.012), w=1:12, MoreArgs=list(data=tmp.r,ylim=yl) ) # using xsize and ysize plot( 1:10, (1:10)*100, type='n', xlab='', ylab='' ) my.symbols( 5, 500, ms.polygon, n=250, inches=1.5 ) my.symbols( 5, 500, ms.polygon, n=250, xsize=2, col='blue' ) my.symbols( 5, 500, ms.polygon, n=250, ysize=200, col='green' ) my.symbols( 5, 500, ms.polygon, n=250, xsize=2, ysize=200, col='red' ) abline( v=c(4,6), col='grey' ) abline( h=c(400, 600), col='grey' ) # hand crafted hexagonal grid x1 <- seq(0, by=2*sqrt(3), length.out=10) y1 <- seq(0, by=3, length.out=10) mypoints <- expand.grid(x=x1, y=y1) mypoints[,1] <- mypoints[,1] + rep( c(0,sqrt(3)), each=10, length.out=100 ) plot(mypoints, asp=1, xlim=c(-2,35)) my.symbols(mypoints, symb=ms.filled.polygon, n=6, inches=par('pin')[1]/(diff(par('usr')[1:2]))*4, bg=paste('gray',1:100,sep=''), fg='green' )
This dataset is approximately bell shaped, but with some outliers. It is meant to be used for demonstration purposes. If students are tempted to throw out all outliers, then have them work with this data (or use a scaled/centered/shuffled version as errors in a regression problem) and see how many throw away 3/4 of the data before rethinking their strategy.
data(outliers)
data(outliers)
The format is: num [1:100] -1.548 0.172 -0.638 0.233 -0.228 ...
This is simulated data meant to demonstrate "outliers".
Simulated, see the examples section.
data(outliers) qqnorm(outliers) qqline(outliers) hist(outliers) o.chuck <- function(x) { # function to throw away outliers qq <- quantile(x, c(1,3)/4, names=FALSE) r <- diff(qq) * 1.5 tst <- x < qq[1] - r | x > qq[2] + r if(any(tst)) { cat('Removing ', paste(x[tst], collapse=', '), '\n') x <- x[!tst] out <- Recall(x) } else { out <- x } out } x <- o.chuck( outliers ) length(x) if(require(MASS)) { char2seed('robust') x <- 1:100 y <- 3 + 2*x + sample(scale(outliers))*10 plot(x,y) fit <- lm(y~x) abline(fit, col='red') fit.r <- rlm(y~x) abline(fit.r, col='blue', lty='dashed') rbind(coef(fit), coef(fit.r)) length(o.chuck(resid(fit))) } ### The data was generated using code similar to: char2seed('outlier') outliers <- rnorm(25) dir <- 1 while( length(outliers) < 100 ){ qq <- quantile(c(outliers, dir*Inf), c(1,3)/4) outliers <- c(outliers, qq[ 1.5 + dir/2 ] + dir*1.55*diff(qq) + dir*abs(rnorm(1)) ) dir <- -dir }
data(outliers) qqnorm(outliers) qqline(outliers) hist(outliers) o.chuck <- function(x) { # function to throw away outliers qq <- quantile(x, c(1,3)/4, names=FALSE) r <- diff(qq) * 1.5 tst <- x < qq[1] - r | x > qq[2] + r if(any(tst)) { cat('Removing ', paste(x[tst], collapse=', '), '\n') x <- x[!tst] out <- Recall(x) } else { out <- x } out } x <- o.chuck( outliers ) length(x) if(require(MASS)) { char2seed('robust') x <- 1:100 y <- 3 + 2*x + sample(scale(outliers))*10 plot(x,y) fit <- lm(y~x) abline(fit, col='red') fit.r <- rlm(y~x) abline(fit.r, col='blue', lty='dashed') rbind(coef(fit), coef(fit.r)) length(o.chuck(resid(fit))) } ### The data was generated using code similar to: char2seed('outlier') outliers <- rnorm(25) dir <- 1 while( length(outliers) < 100 ){ qq <- quantile(c(outliers, dir*Inf), c(1,3)/4) outliers <- c(outliers, qq[ 1.5 + dir/2 ] + dir*1.55*diff(qq) + dir*abs(rnorm(1)) ) dir <- -dir }
This function is similar to the pairs
function, but instead of
doing all pairwise plots, it takes 2 matricies or data frames and does
all combinations of the first on the x-axis with the 2nd on the
y-axis. Used with pairs and subsets can spread a scatterplot matrix
accross several pages.
pairs2(x, y, xlabels, ylabels, panel = points, ..., row1attop = TRUE, gap = 1)
pairs2(x, y, xlabels, ylabels, panel = points, ..., row1attop = TRUE, gap = 1)
x |
Matrix or data frame of variables to be used as the x-axes. |
y |
Matrix or data frame of variables to be used as the y-axes. |
xlabels |
Labels for x variables (defaults to colnames of |
ylabels |
Labels for y variables (defaults to colnames of |
panel |
Function to do the plotting (see |
... |
additional arguments passed to graphics functions |
row1attop |
Logical, should the 1st row be the top. |
gap |
Distance between plots. |
This functios is similar to the pairs
function, but by giving
it 2 sets of data it only does the combinations between them. Think
of it as giving the upper right or lower left set of plots from
pairs
. If a regular scatterplot matrix is too small on the
page/device then use pairs
on subsets of the data to get the
diagonal blocks of a scatterplot matrix and this function to get the
off diagonal blocks.
This function is run for the side effect of the plot. It does not return anything useful.
Large amounts of the code for this function were blatently
borrowed/stolen from the pairs
function, the credit for the
useful parts should go to the original authors, blame for any problems
should go to me. This function is also released under GPL since much
of it comes from GPL code.
Greg Snow, [email protected]
pairs
, splom
in the lattice package
pairs2(iris[,1:2], iris[,3:4], col=c('red','green','blue')[iris$Species]) # compare the following plot: pairs(state.x77, panel=panel.smooth) # to the following 4 plots pairs(state.x77[,1:4], panel=panel.smooth) pairs(state.x77[,5:8], panel=panel.smooth) pairs2( state.x77[,1:4], state.x77[,5:8], panel=panel.smooth) pairs2( state.x77[,5:8], state.x77[,1:4], panel=panel.smooth)
pairs2(iris[,1:2], iris[,3:4], col=c('red','green','blue')[iris$Species]) # compare the following plot: pairs(state.x77, panel=panel.smooth) # to the following 4 plots pairs(state.x77[,1:4], panel=panel.smooth) pairs(state.x77[,5:8], panel=panel.smooth) pairs2( state.x77[,1:4], state.x77[,5:8], panel=panel.smooth) pairs2( state.x77[,5:8], state.x77[,1:4], panel=panel.smooth)
This function draws symbols on a lattice plot. It is similar to
the builtin symbols
function with the difference that it plots
symbols defined by the user rather than a prespecified set of
symbols.
panel.my.symbols(x, y, symb, inches=1, polygon=FALSE, ..., symb.plots=FALSE, subscripts, MoreArgs)
panel.my.symbols(x, y, symb, inches=1, polygon=FALSE, ..., symb.plots=FALSE, subscripts, MoreArgs)
x , y
|
The |
symb |
Either a matrix, list, or function defining the symbol to
be plotted. If it is a matrix or list it needs to be formatted that
it can be passed directly to the |
inches |
The size of the square containing the symbol in inches
(note: unlike |
polygon |
If TRUE, use |
symb.plots |
Currently not implemented. |
... |
Additional arguments will be replicated to the same length
as |
subscripts |
subscripts for the current panel |
MoreArgs |
A list with any additional arguments to be passed to
the |
The symb
argument can be a 2 column matrix or a list with
components 'x' and 'y' that defines points on the interval [-1,1] that
will be connected with lines to draw the symbol. If you want a closed
polygon then be sure to replicate the 1st point as the last point or
use the polygon
option.
If any point contains an NA then the line will not be drawn to or from that point. This can be used to create a symbol with disjoint parts that should not be connected.
If symb
is a function then any unmatched
arguments that end up in the '...' argument will be replicated to the
same length as 'x' (using the rep
function) then the values
will be passed one at a time to the symb
function. If
MoreArgs
is specified, the elements of it will also be passed
to symb
without modification. The symb
function can
either return a matrix or list with the points that will then be
passed to the llines
function (see above).
This function is run for its side effect of plotting, it returns an invisible NULL.
Plotting coordinates and sizes are based on the size of the device at the time the function is called. If you resize the device after plotting, all bets are off.
Greg Snow [email protected]
symbols
, my.symbols
, subplot
,
mapply
, ms.polygram
, lines
if(require(lattice)) { tmpdf <- data.frame( x=1:10, y=1:10, g=factor(rep( c("A","B"), each=5 )), z=c(1:5,5:1) ) xyplot( y ~ x, tmpdf, panel=panel.my.symbols, symb=ms.female, inches=0.3 ) xyplot( y ~ x | g, tmpdf, panel=panel.my.symbols, symb=ms.male, inches=0.3) xyplot( y ~ x, tmpdf, panel=panel.superpose, groups=g, panel.groups= function(group.number, ...) { if(group.number==1) { panel.my.symbols(..., symb=ms.male) } else { panel.my.symbols(..., symb=ms.female) } }, inches=0.3 ) xyplot( y ~ x, tmpdf, panel=panel.my.symbols, symb=ms.polygram, n=tmpdf$z, inches=0.3) xyplot( y ~ x | g, tmpdf, panel=panel.my.symbols, symb=ms.polygram, n=tmpdf$z, inches=0.3) xyplot( y ~ x, tmpdf, panel=panel.superpose, groups=g, panel.groups = panel.my.symbols, inches=0.3, symb=ms.polygon, n=tmpdf$z, polygon=TRUE, adj=rep(c(0,pi/4),5) ) }
if(require(lattice)) { tmpdf <- data.frame( x=1:10, y=1:10, g=factor(rep( c("A","B"), each=5 )), z=c(1:5,5:1) ) xyplot( y ~ x, tmpdf, panel=panel.my.symbols, symb=ms.female, inches=0.3 ) xyplot( y ~ x | g, tmpdf, panel=panel.my.symbols, symb=ms.male, inches=0.3) xyplot( y ~ x, tmpdf, panel=panel.superpose, groups=g, panel.groups= function(group.number, ...) { if(group.number==1) { panel.my.symbols(..., symb=ms.male) } else { panel.my.symbols(..., symb=ms.female) } }, inches=0.3 ) xyplot( y ~ x, tmpdf, panel=panel.my.symbols, symb=ms.polygram, n=tmpdf$z, inches=0.3) xyplot( y ~ x | g, tmpdf, panel=panel.my.symbols, symb=ms.polygram, n=tmpdf$z, inches=0.3) xyplot( y ~ x, tmpdf, panel=panel.superpose, groups=g, panel.groups = panel.my.symbols, inches=0.3, symb=ms.polygon, n=tmpdf$z, polygon=TRUE, adj=rep(c(0,pi/4),5) ) }
This plays the lateral thinking game Petals Around the Rose. This is a game where 5 regular dice are rolled and the players then try to figure out how many petals are around the rose.
petals(plot = TRUE, txt = TRUE)
petals(plot = TRUE, txt = TRUE)
plot |
Should the dice be plotted to the current/default graphics device. |
txt |
Should the dice be shown in the console window using text. |
At least one of the arguments plot
and txt
needs to be
true, otherwise you will be guessing blind (or testing your psychic
abilities).
The game is usually played with 5 physical dice, one person who knows the rules (the potentate of the rose, here the computer), and one or more players trying to learn the puzzle. The potentate can only give the players the following 3 rules:
The name of the game is "Petals Around the Rose" and the name is significant.
The answer is always 0 or an even number.
The potentate can tell the answer for any roll after any guesses are made.
The potentate (or other player) then rolls the 5 dice and any players are then allowed to guess. The potentate either confirms a correct guess or tells the correct answer, then the game continues with another roll. Players are not to discuss their reasoning so that each can solve it themselves. When a player thinks they have worked out the reasoning they demonstrate it by getting correct guesses, but not by discussing it with anyone. Generally 6 correct guesses in a row is considered evidence that they have figured out the rules and they are then considered a potentate of the rose.
For this implementation the computer will simulate the role of 5 dice and display the results and ask for a guess of how many petals are around the rose. The player then enters their guess and the computer then either confirms that it is correct or gives the correct answer.
Pressing enter without making a guess ends the game.
This function only returns NULL, it is run for its side effects.
Casual viewing of the function source code is unlikely to reveal the secret (and therefore this could be used as an example of one way to disguise portions of code from casual examination). More on disguising source code is at https://stat.ethz.ch/pipermail/r-devel/2011-October/062236.html.
Some basic debugging can reveal the secret, but that would be cheating and an admission that such a simple game has defeated you, so don't do it, just keep playing until you figure it out.
Greg Snow, [email protected]
http://www.borrett.id.au/computing/petals-bg.htm
dice
if(interactive()){ petals() }
if(interactive()){ petals() }
This function attempts to create a script that will recreate the current plot (in the graphics window). You can then edit any parts of the script that you want changed and rerun to get the modified plot.
plot2script(file='clipboard')
plot2script(file='clipboard')
file |
The filename (the clipboard by default) for the script to create or append to. |
This function works with the graphics window and mainly traditional graphics (it may work with lattice or other graphics, but has not really been tested with those).
This function creates a script file (or puts it on the clipboard so
that you can past into a script window or text editor) that will
recreate the current graph in the current graph window. The script
consists of very low level functions (calls to plot.window
and
axis
rather than letting plot
handle all this).
If you want the higher level functions that were actually used, then
use the history
or savehistory
commands (this will
probably be the better method for most cases).
Some of the low level plotting functions use different arguments to
the internal version than the user callable version (box
for
example), the arguments to these functions may need to be editted
before the full script will run correctly.
The lengths of command lines between the creation of the script and what can be run in R do not always match, you may need to manually wrap long lines in the script before it will run properly.
This function is run for its side effects and does not return anything meaningful.
For any serious projects it is best to put your code into a script to begin with and edit the original script rather than using this function.
This function depends on the recordPlot
function which can
change in any version. Therefore this function should not be
considered stable.
Greg Snow [email protected]
history
, savehistory
,
recordPlot
, source
if(interactive()){ # create a plot plot(runif(10),rnorm(10)) lines( seq(0,1,length=10), rnorm(10,1,3) ) # create the script plot2script() # now paste the script into a script window or text processor. # edit the ranges in plot.window() and change some colors or # other options. Then run the script. }
if(interactive()){ # create a plot plot(runif(10),rnorm(10)) lines( seq(0,1,length=10), rnorm(10,1,3) ) # create the script plot2script() # now paste the script into a script window or text processor. # edit the ranges in plot.window() and change some colors or # other options. Then run the script. }
Create graphs of a normal test statistic under the null and alternative hypotheses to graphically show the idea of power.
power.examp(n = 1, stdev = 1, diff = 1, alpha = 0.05, xmin = -2, xmax = 4) run.power.examp(hscale=1.5, vscale=1.5, wait=FALSE) run.power.examp.old()
power.examp(n = 1, stdev = 1, diff = 1, alpha = 0.05, xmin = -2, xmax = 4) run.power.examp(hscale=1.5, vscale=1.5, wait=FALSE) run.power.examp.old()
n |
The sample size for the test statistic. |
stdev |
The standard deviation of the population. |
diff |
The true difference in means (alternate hypothesis). |
alpha |
The type I error rate to use for the test. |
xmin |
The minimum x value to show on the graph. |
xmax |
The maximum x value to show on the graph. |
hscale |
Controls width of plot, passed to |
vscale |
Controls height of plot, passed to |
wait |
Should R wait for the window to close. |
This function will draw 2 graphs representing an upper-tailed test of hypothesis.
The upper panel represents the test statistic under the
null hypothesis that the true mean (or mean difference) is 0. It then
also shows the upper tail area equal to alpha
and the rejection
region for the test statistic.
The lower panel shows the normal distribution for the test statistic
under the alternative hypothesis where the true mean (or mean
difference) is diff
. Using the rejection region from the upper
panel it shades the upper tail area that corresponds to the power of
the test.
Both curves are affected by the specified stdev
and sample size
n
.
The function run.power.examp
will in addition create a Tk
slider box that will allow you to interactively change the values of
stdev
, diff
, alpha
, and n
to dynamically
see the effects of the change on the graphs and on the power of the
test.
This can be used to demonstrate the concept of power, show the effect of sample size on power, show the inverse relationship between the type I and type II error rates, and show how power is dependent on the true mean (or difference) and the population standard deviation.
power.examp
invisibly returns the power computed.
run.power.examp
returns a list with the parameter settings and
the power if wait
is TRUE.
run.power.examp.old
does not return anything meaningful.
Greg Snow [email protected]
power.examp() power.examp(n=25) power.examp(alpha=0.1)
power.examp() power.examp(n=25) power.examp(alpha=0.1)
Place data points on a graph to demonstrate concepts related to correlation and regression.
put.points.demo(x = NULL, y = NULL, lsline = TRUE)
put.points.demo(x = NULL, y = NULL, lsline = TRUE)
x |
x-coordinates for initial points. |
y |
y-coordinates for initial points. |
lsline |
Logical, should the ls regresion line be included. |
The plot area is divided into 2 sections, the left section shows a scatterplot of your points, the right panel controls what happens when you click in the left panel.
The top of the right panel has an "end" button that you click on to end the demonstration.
The middle right panel toggles the least squares line and information.
The bottom right panel has radio buttons that determine what clicking in the left panel will do, the options are to add a point, delete a point, or move a point.
To move a point click on the point you want to move, it will become solid, then click in the place you want it to move to.
When deleting or moving points, the closest point to where you click will be deleted or moved, even if you click in an empty area.
Whenever you add, delete, or move a point the correlation, r^2, and regression line will be updated. You can start with a set of points then demonstrate what happens to the correlation and regression line when outliers are added or important points are moved or deleted.
This function does not return anything.
Greg Snow [email protected]
if(interactive()){ put.points.demo() x <- rnorm(25, 5, 1) y <- x + rnorm(25) put.points.demo(x,y) }
if(interactive()){ put.points.demo() x <- rnorm(25, 5, 1) y <- x + rnorm(25) put.points.demo(x,y) }
Simulate and plot p-values from a normal or binomial based test under various conditions. When all the assumptions are true, the p-values should follow an approximate uniform distribution. These functions show that along with how violating the assumptions changes the distribution of the p-values.
Pvalue.norm.sim(n = 50, mu = 0, mu0 = 0, sigma = 1, sigma0 = sigma, test= c("z", "t"), alternative = c("two.sided", "less", "greater", "<>", "!=", "<", ">"), alpha = 0.05, B = 10000) Pvalue.binom.sim(n=100, p=0.5, p0=0.5, test=c('exact','approx'), alternative=c('two.sided', 'less', 'greater', '<>','!=','<','>'), alpha=0.05, B=1000) run.Pvalue.norm.sim() run.Pvalue.binom.sim()
Pvalue.norm.sim(n = 50, mu = 0, mu0 = 0, sigma = 1, sigma0 = sigma, test= c("z", "t"), alternative = c("two.sided", "less", "greater", "<>", "!=", "<", ">"), alpha = 0.05, B = 10000) Pvalue.binom.sim(n=100, p=0.5, p0=0.5, test=c('exact','approx'), alternative=c('two.sided', 'less', 'greater', '<>','!=','<','>'), alpha=0.05, B=1000) run.Pvalue.norm.sim() run.Pvalue.binom.sim()
n |
Sample Size for each simulated dataset |
mu |
Simulation mean for samples |
mu0 |
Hypothesized mean for tests |
sigma |
Simulation SD for samples |
sigma0 |
Hypothesized SD for tests, if blank or missing, use the sample SD in the tests |
p |
Simulation proportion for samples |
p0 |
Hypothesized proportion for tests |
test |
Which test to use, "z" or "t" tests for normal, "exact" (binomial) or "approx" (normal approximation) for binomial |
alternative |
Direction for alternative hypothesis |
alpha |
alpha level for test (optional) |
B |
Number of simulated datasets |
These functions generate B
samples from either a normal or
binomial distribution, then compute the P-values for the test of
significance on each sample and plot the P-values.
The run.Pvalue.norm.sim
and run.Pvalue.binom.sim
functions are GUI wrappers for the other 2 functions allowing you to
change the parameters and click on "refresh" to run a new set of
simulations.
Using NA
for sigma0
will result in the sample standard
deviations being used (leave blank in the GUI).
When the simulation conditions and the hypothesized values match, the distributions of the p-values will be approximately uniform. Changing the parameter of interest will show the idea of power. Changing the other parameters can show the effects of assumptions not being met.
The P-values are invisibly returned.
Note: the 2-sided p-values for the binomial may not match the results from binom.test and prop.test. The method used here is an approximation for speed.
Greg Snow, [email protected]
Murdock, D, Tsai, Y, and Adcock, J (2008) _P-Values are Random Variables_. The American Statistician. (62) 242-245.
t.test
, z.test
,
binom.test
, prop.test
, tkexamp
if(interactive()) { run.Pvalue.norm.sim() run.Pvalue.binom.sim() }
if(interactive()) { run.Pvalue.norm.sim() run.Pvalue.binom.sim() }
Open an rgl window, plot either a representation of a coin or a die then animate the flipping/rolling.
rgl.coin(x, col = "black", heads = x[[1]], tails = x[[2]], ...) rgl.die(x=1:6, col.cube = "white", col.pip = "black", sides = x, ...) flip.rgl.coin(side = sample(2, 1), steps = 150) roll.rgl.die(side = sample(6, 1), steps = 250)
rgl.coin(x, col = "black", heads = x[[1]], tails = x[[2]], ...) rgl.die(x=1:6, col.cube = "white", col.pip = "black", sides = x, ...) flip.rgl.coin(side = sample(2, 1), steps = 150) roll.rgl.die(side = sample(6, 1), steps = 250)
x |
for |
col |
Color of lines on the coin faces. |
heads |
Design to use as "heads" side of coin. |
tails |
Design to use as "tails" side of coin. |
col.cube |
Color of the cube for the die. |
col.pip |
Color of the pips (spots) on the die |
sides |
Vector of length 6 indicating which numbers to show on the die. |
side |
Which side of the coin (1 or 2) or die (1 through 6) should end up face up. |
steps |
The number of steps in each part of the animation, higher values will be smoother and slower, lower values will be faster but more jumpy. |
... |
Currently any additional options are silently ignored. |
You must use the plot function first to create the coin or die, then use the flip or roll function to start the animation. You can animate multiple times for a single use of the plotting function.
You can manually rotate the image as well, see the rgl
package
for details.
The defaults plot a regular coin and die, but arguments are available to create special casses (2 headed coin, die with 2 6's and no 1, ...).
The data list coin.faces
contains information on designs for
the faces of the coins in case you want to choose a different design.
The default rolling and flipping options ranomly choose which side
will be face up following a uniform distribution. You can specify the
side yourself, or use the sample
function to do a biased random
flip/roll.
Which side ended up face up (1 or 2 for coin, 1 through 6 for die).
This is the internal numbering and does not match a change in the
sides
argument.
The current algorithm for animating the die roll shows all the sides, but I am not satisfied with it. Please suggest improvements.
Greg Snow [email protected]
dice
, plot.dice
,
coin.faces
, sample
if(interactive()){ rgl.coin() flip.rgl.coin() flip.rgl.coin(1) flip.rgl.coin(2) rgl.clear() # two-headed coin rgl.coin(tails=coin.faces$qh) rgl.clear() # letters instead of pictures rgl.coin(heads=coin.faces$H, tails=coin.faces$T) # biased flip flip.rgl.coin( sample(2,1, prob=c(0.65, 0.35) ) ) rgl.clear() rgl.die() roll.rgl.die() roll.rgl.die(6) # biased roll roll.rgl.die( sample(6,1, prob=c(1,2,3,3,2,1) ) ) }
if(interactive()){ rgl.coin() flip.rgl.coin() flip.rgl.coin(1) flip.rgl.coin(2) rgl.clear() # two-headed coin rgl.coin(tails=coin.faces$qh) rgl.clear() # letters instead of pictures rgl.coin(heads=coin.faces$H, tails=coin.faces$T) # biased flip flip.rgl.coin( sample(2,1, prob=c(0.65, 0.35) ) ) rgl.clear() rgl.die() roll.rgl.die() roll.rgl.die(6) # biased roll roll.rgl.die( sample(6,1, prob=c(1,2,3,3,2,1) ) ) }
Plots a map (from a Map object from package spData) on a unit sphere in an rgl window that can then be interactively rotated.
rgl.Map(Map, which, ...)
rgl.Map(Map, which, ...)
Map |
An |
which |
Vector indicating the subset of polygons to plot. |
... |
Additional arguments passed on to |
This assumes that the map is cordinates in degrees and plots the map on a unit sphere in an rgl window making a globe. You can then rotate the globe by clicking and dragging in the window.
There is no return value, this function is run for its side effect.
This function is still beta level software (some extra lines show up).
Greg Snow [email protected]
rgl
in package rgl, plot
method in
package sp
if(interactive()){ if(require("spData")) { data(world) rgl.Map(world$geom) spheres3d(0,0,0,.999, col='lightblue') } }
if(interactive()){ if(require("spData")) { data(world) rgl.Map(world$geom) spheres3d(0,0,0,.999, col='lightblue') } }
This demonstration allows you to interactively build a Receiver Operator Curve to better understand what goes into creating them.
roc.demo(x = rnorm(25, 10, 1), y = rnorm(25, 11, 1.5))
roc.demo(x = rnorm(25, 10, 1), y = rnorm(25, 11, 1.5))
x |
Data values for group 1 (controls). |
y |
Data values for group 2 (cases). |
Density plots for the 2 groups will be created in the lower panel of the plot (colored red (group 1) and blue (group 2)) along with rug plots of the actual datapoints. There is also a green vertical line that represents a decision rule cutoff, any points higher than the cutoff are predicted to be in group 2 and points less than the cuttoff are predicted to be in group 1. The sensitivity and specificity for the current cuttoff value are printed below the plot.
A Tk slider box is also created that allows you to move the cuttoff value and update the plots. As the cutoff value changes, the different combinations of sensitivity and specificity are added to the ROC curve in the top panel (the point corresponding to the current cuttoff value is highlighted in red). A line is also drawn from the point representing sensitivity and specificity both equal to 1 to the point closest to it.
No meaninful value is returned, this function is run solely for the side effects.
Greg Snow [email protected]
slider
, ROC
function in package
Epi, auROC
in package limma, package ROC
if(interactive()){ roc.demo() with(CO2, roc.demo(uptake[Type=='Mississippi'], uptake[Type=='Quebec'] ) ) }
if(interactive()){ roc.demo() with(CO2, roc.demo(uptake[Type=='Mississippi'], uptake[Type=='Quebec'] ) ) }
Interactively rotate common 3d plots: cloud, persp, and wireframe.
rotate.cloud(x, ...) rotate.persp(x, y, z) rotate.wireframe(x, ...)
rotate.cloud(x, ...) rotate.persp(x, y, z) rotate.wireframe(x, ...)
x |
|
y |
|
z |
|
... |
additional arguments passed on to |
Use these functions just like cloud
, persp
, and
wireframe
. In addition to the default plot a Tk slider window
will be created that will allow you to rotate the plot.
The rotations parameters are passed the screen
argument of
cloud
and wireframe
and the theta
, phi
,
r
, d
, ltheta
, lphi
, and shade
arguments of persp
.
For cloud
and wireframe
plots the order of the x
,
y
, and z
argumets can be rearanged, just type the
appropriate letters in the boxes on the left, then press the "refresh"
button (changing the order changes the plot for these 2 plots).
These functions are run for the side effects of the plots and Tk windows, nothing meaninful is returned.
Greg Snow [email protected]
cloud
it the lattice package,
persp
, wireframe
in the lattice package
if(interactive()){ rotate.cloud(Sepal.Length ~ Petal.Length*Petal.Width, data=iris) rotate.wireframe(volcano) z <- 2 * volcano # Exaggerate the relief x <- 10 * (1:nrow(z)) # 10 meter spacing (S to N) y <- 10 * (1:ncol(z)) # 10 meter spacing (E to W) rotate.persp(x,y,z) }
if(interactive()){ rotate.cloud(Sepal.Length ~ Petal.Length*Petal.Width, data=iris) rotate.wireframe(volcano) z <- 2 * volcano # Exaggerate the relief x <- 10 * (1:nrow(z)) # 10 meter spacing (S to N) y <- 10 * (1:ncol(z)) # 10 meter spacing (E to W) rotate.persp(x,y,z) }
Make a scatterplot and a Tk slider window that allows you to interactively set the correlation and/or R^2.
run.cor.examp(n=100, seed, vscale=1.5, hscale=1.5, wait=FALSE) run.cor2.examp(n=100, seed, vscale=1.5, hscale=1.5, wait=FALSE) run.old.cor.examp(n = 100, seed) run.old.cor2.examp(n = 100, seed)
run.cor.examp(n=100, seed, vscale=1.5, hscale=1.5, wait=FALSE) run.cor2.examp(n=100, seed, vscale=1.5, hscale=1.5, wait=FALSE) run.old.cor.examp(n = 100, seed) run.old.cor2.examp(n = 100, seed)
n |
Number of points to plot. |
seed |
What seed to use. |
vscale |
Vertical scale passed to tkrplot. |
hscale |
Horizontal scale passed to tkrplot. |
wait |
Should R wait for the tk window to close. |
The function run.cor.examp
draws a scatterplot and allows you
to set the correlation using a Tk slider window.
The function run.cor2.examp
does the same, but has a slider for
R^2 as well as the correlation, when either slider is moved the other
one will update to match.
The 2 "old" versions use the default graphics device with a seperate window with the sliders, the versions without "old" in the name include the plot and sliders together in a single tk window.
The size of the plot can be changed by changing the values in the hscale and vscale boxes and clicking on the "Refresh" button.
If wait
is TRUE, then the return value is a list with the x and
y values of the final plot.
If wait
is FALSE (and in the "old" versions) an invisible NULL
is returned.
If wait
is TRUE then R will wait until you click on the
"Exit" button before you can use your R session again. If wait
is FALSE then the tk window will appear, but R will regain control so
that you can continue to use R as well as interact with the
demonstration window.
Greg Snow [email protected]
if(interactive()) { run.cor2.examp() }
if(interactive()) { run.cor2.examp() }
Create a histogram then use a Tk slider window to change the number of bars, the minimum, and the maximum.
run.hist.demo(x)
run.hist.demo(x)
x |
Data to plot. |
Draws a histogram and creates a Tk slider window that allows you to explore how changing the parameters affects the appearance of the plot.
No meaninful value is returned.
Greg Snow [email protected]
if(interactive()){ run.hist.demo( rnorm(250, 100, 5) ) }
if(interactive()){ run.hist.demo( rnorm(250, 100, 5) ) }
This function demonstrates how to get PPV and NPV from Sensitivity, Specificity, and Prevalence by using a virtual population rather than a direct application of Bayes Rule. This approach is more intuitive to mathphobes.
SensSpec.demo(sens, spec, prev, n = 100000, step = 11)
SensSpec.demo(sens, spec, prev, n = 100000, step = 11)
sens |
Sensitivity (between 0 and 1) |
spec |
Specificity (between 0 and 1) |
prev |
Prevalence (between 0 and 1) |
n |
Size of the virtual population (large round number) |
step |
which step of the process to display |
The common way to compute Positive Predictive Value (probability of disease given a positive test (PPV)) and Negative Predictive Value (probability of no disease given negative test (NPV)) is to use Bayes' rule with the Sensitivity, Specificity, and Prevalence.
This approach can be overwhelming to non-math types, so this demonstration goes through the steps of assuming a virtual population, then filling in a 2x2 table based on the population and given values of Sensitivity, Specificity, and Prevalence. PPV and NPV are then computed from this table. This approach is more intuitive to many people.
The function can be run multiple times with different values of
step
to show the steps in building the table, then rerun with
different values to show how changes in the inputs affect the results.
An invisible matrix with the 2x2 table
Greg Snow, [email protected]
roc.demo
, fagan.plot
,
the various Epi packages, tkexamp
for(i in seq(1,11,2)) { SensSpec.demo(sens=0.95, spec=0.99, prev=0.01, step=i) if( interactive() ) { readline("Press Enter to continue") } }
for(i in seq(1,11,2)) { SensSpec.demo(sens=0.95, spec=0.99, prev=0.01, step=i) if( interactive() ) { readline("Press Enter to continue") } }
This is similar to the text function, but it also puts a background shadow (outline) behind the text to make it stand out from the background better.
shadowtext(x, y = NULL, labels, col = "white", bg = "black", theta = seq(pi/32, 2 * pi, length.out = 64), r = 0.1, cex = 1, ...)
shadowtext(x, y = NULL, labels, col = "white", bg = "black", theta = seq(pi/32, 2 * pi, length.out = 64), r = 0.1, cex = 1, ...)
x |
x-coordinates for the text |
y |
y-coordinates for the text |
labels |
The text labels to plot |
col |
Color of the text |
bg |
Color of the background shadow |
theta |
Angles for plotting the background |
r |
Thickness of the shadow relative to plotting size |
cex |
Character expansion passed through to |
... |
Additional arguments passed on to |
When adding text to a plot it is possible that the color of the text may make it difficult to see relative to its background. If the text spans different backgrounds then it may not be possible to find a single color to give proper contrast.
This function creates a contrasting shadow for the text by first
plotting several copies of the text at angles theta
and
distance r
in the background color, then plotting the text on
top.
This gives a shadowing or outlining effect to the text making it easier to read on any background.
This function is run for its side effects, returns NULL.
Greg Snow, [email protected], with improvements by Thomas Danhorn
plot(1:10, 1:10, bg='aliceblue') rect(3,3,5,8, col='navy') text(5,6, 'Test 1', col='lightsteelblue') shadowtext(5,4, 'Test 2', col='lightsteelblue')
plot(1:10, 1:10, bg='aliceblue') rect(3,3,5,8, col='navy') text(5,6, 'Test 1', col='lightsteelblue') shadowtext(5,4, 'Test 2', col='lightsteelblue')
Compute the test of hypothesis and compute a confidence interval on the variance of a population.
sigma.test(x, sigma = 1, sigmasq = sigma^2, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)
sigma.test(x, sigma = 1, sigmasq = sigma^2, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)
x |
Vector of data values. |
sigma |
Hypothesized standard deviation of the population. |
sigmasq |
Hypothesized variance of the population. |
alternative |
Direction of the alternative hypothesis. |
conf.level |
Confidence level for the interval computation. |
... |
Additional arguments are silently ignored. |
Many introductory statistical texts discuss inference on a single
population variance and introduce the chi-square test for a population
variance as another example of a hypothesis test that can be easily
derived. Most statistical packages do not include the chi-square
test, perhaps because it is not used in practice very often, or
because the test is known to be highly sensitive to nonnormal
data. For the two-sample problem, see var.test
.
An object of class htest
containing the results
This test is highly sensitive to nonnormality.
G. Jay Kerns [email protected]
x <- rnorm(20, mean = 15, sd = 7) sigma.test(x, sigma = 6)
x <- rnorm(20, mean = 15, sd = 7) sigma.test(x, sigma = 6)
This function is used to create a new function that will simulate data. This could be used by a teacher to create homework or test conditions that the students would then simulate data from (each student could have their own unique data set) or this function could be used in simulations for power or other values of interest.
simfun(expr, drop, ...)
simfun(expr, drop, ...)
expr |
This is an expression, usually just one or more statements, that will generate the simulated data. |
drop |
A character vector of names of objects/columns that will be dropped from the return value. These are usually intermediate objects or parameter values that you don't want carried into the final returned object. |
... |
Additional named items that will be in the environment when |
This function creates another function to simulate data. You supply the general ideas of the simulation to this function and the resulting function can then be used to create simulated datasets. The resulting function can then be given to students for them to simulate datasets, or used localy as part of larger simulations.
The environment where the expression is evaluated will have all the
columns or elements of the data
argument available as well as
the data
argument itself. Any variables/parameters passed
through ...
in the original function will also be available.
You then supply the code based on those variables to create the
simulated data. The names of any columns or parameters submitted as
part of data
will need to match the code exactly (provide
specific directions to the users on what columns need to be named).
Rember that indexing using factors indexes based on the underlying
integers not the character representation. See the examples for
details.
The resulting function can be saved and loaded/attached in different R
sessions (it is important to use save
rather than something
like dput
so that the environment of the function is
preserved).
The function includes an optional seed that will be used with the
char2seed
function (if the seed is a character) so that
each student could use a unique but identifiable seed (such as their
name or something based on their name) so that each student will use a
different dataset, but the instructor will be able to generate the
exact same dataset to check answers.
The "True" parameters are hidden in the environment of the function so the student will not see the "true" values by simply printing the function. However an intermediate level R programmer/user would be able to extract the simulation parameters (but the correct homework or test answer will not be the simulation parameters).
The return value is a function that will generate simulated datasets.
The function will have 2 arguments, data
and seed
. The
data
argument can be either a data frame of the predictor
variables (study design) or a list of simulation parameters. The
seed
argument will be passed on to set.seed
if it
is numeric and char2seed
if it is a character.
The return value of this function is a dataframe with the simulated data and any explanitory variables passed to the function.
See the examples for how to use the result function.
This function was not designed for speed, if you are doing long simulations then hand crafting the simulation function will probably run quicker than one created using this function.
Like the prediction functions the data frame passed in as the data argument will need to have exact names of the columns to match with the code (including capitolization).
This function is different from the simulate
functions
in that it allows for different sample sizes, user specified
parameters, and different predictor variables.
Greg Snow, [email protected]
set.seed
, char2seed
, within
, simulate
, save
, load
, attach
# Create a function to simulate heights for a given dataset simheight <- simfun( {h <- c(64,69); height<-h[sex]+ rnorm(10,0,3)}, drop='h' ) my.df <- data.frame(sex=factor(rep(c('Male','Female'),each=5))) simdat <- simheight(my.df) t.test(height~sex, data=simdat) # a more general version, and have the expression predefined # (note that this assumes that the levels are Female, Male in that order) myexpr <- quote({ n <- length(sex) h <- c(64,69) height <- h[sex] + rnorm(n,0,3) }) simheight <- simfun(eval(myexpr), drop=c('n','h')) my.df <- data.frame(sex=factor(sample(rep(c('Male','Female'),c(5,10))))) (simdat <- simheight(my.df)) # similar to above, but use named parameter vector and index by names myexpr <- quote({ n <- length(sex) height <- h[ as.character(sex)] + rnorm(n,0,sig) }) simheight <- simfun(eval(myexpr), drop=c('n','h','sig'), h=c(Male=69,Female=64), sig=3) my.df <- data.frame(sex=factor(sample(c('Male','Female'),100, replace=TRUE))) (simdat <- simheight(my.df, seed='example')) # Create a function to simulate Sex and Height for a given sample size # (actually it will generate n males and n females for a total of 2*n samples) # then use it in a set of simulations simheight <- simfun( {sex <- factor(rep(c('Male','Female'),each=n)) height <- h[sex] + rnorm(2*n,0,s) }, drop=c('h','n'), h=c(64,69), s=3) (simdat <- simheight(list(n=10))) out5 <- replicate(1000, t.test(height~sex, data=simheight(list(n= 5)))$p.value) out15 <- replicate(1000, t.test(height~sex, data=simheight(list(n=15)))$p.value) mean(out5 <= 0.05) mean(out15 <= 0.05) # use a fixed population simstate <- simfun({ tmp <- state.df[as.character(State),] Population <- tmp[['Population']] Income <- tmp[['Income']] Illiteracy <- tmp[['Illiteracy']] }, state.df=as.data.frame(state.x77), drop=c('tmp','state.df')) simstate(data.frame(State=sample(state.name,10))) # Use simulation, but override setting the seed simheight <- simfun({ set.seed(1234) h <- c(64,69) sex <- factor(rep(c('Female','Male'),each=50)) height <- round(rnorm(100, rep(h,each=50),3),1) sex <- sex[ID] height <- height[ID] }, drop='h') (newdat <- simheight(list(ID=c(1:5,51:55)))) (newdat2<- simheight(list(ID=1:10))) # Using a fitted object fit <- lm(Fertility ~ . , data=swiss) simfert <- simfun({ Fertility <- predict(fit, newdata=data) Fertility <- Fertility + rnorm(length(Fertility),0,summary(fit)$sigma) }, drop=c('fit'), fit=fit) tmpdat <- as.data.frame(lapply(swiss[,-1], function(x) round(runif(100, min(x), max(x))))) names(tmpdat) <- names(swiss)[-1] fertdat <- simfert(tmpdat) head(fertdat) rbind(coef(fit), coef(lm(Fertility~., data=fertdat))) # simulate a nested mixed effects model simheight <- simfun({ n.city <- length(unique(city)) n.state <- length(unique(state)) n <- length(city) height <- h[sex] + rnorm(n.state,0,sig.state)[state] + rnorm(n.city,0,sig.city)[city] + rnorm(n,0,sig.e) }, sig.state=1, sig.city=0.5, sig.e=3, h=c(64,69), drop=c('sig.state','sig.city','sig.e','h','n.city','n.state','n')) tmpdat <- data.frame(state=gl(5,20), city=gl(10,10), sex=gl(2,5,length=100, labels=c('F','M'))) heightdat <- simheight(tmpdat) # similar to above, but include cost information, this assumes that # each new state costs $100, each new city is $10, and each subject is $1 # this shows 2 possible methods simheight <- simfun({ n.city <- length(unique(city)) n.state <- length(unique(state)) n <- length(city) height <- h[sex] + rnorm(n.state,0,sig.state)[state] + rnorm(n.city,0,sig.city)[city] + rnorm(n,0,sig.e) cost <- 100 * (!duplicated(state)) + 10*(!duplicated(city)) + 1 cat('The total cost for this design is $', 100*n.state+10*n.city+1*n, '\n', sep='') }, sig.state=1, sig.city=0.5, sig.e=3, h=c(64,69), drop=c('sig.state','sig.city','sig.e','h','n.city','n.state','n')) tmpdat <- data.frame(state=gl(5,20), city=gl(10,10), sex=gl(2,5,length=100, labels=c('F','M'))) heightdat <- simheight(tmpdat) sum(heightdat$cost) # another mixed model method simheight <- simfun({ state <- gl(n.state, n/n.state) city <- gl(n.city*n.state, n/n.city/n.state) sex <- gl(2, n.city, length=n, labels=c('F','M') ) height <- h[sex] + rnorm(n.state,0,sig.state)[state] + rnorm(n.city*n.state,0,sig.city)[city] + rnorm(n,0,sig.e) }, drop=c('n.state','n.city','n','sig.city','sig.state','sig.e','h')) heightdat <- simheight( list( n.state=5, n.city=2, n=100, sig.state=10, sig.city=3, sig.e=1, h=c(64,69) ))
# Create a function to simulate heights for a given dataset simheight <- simfun( {h <- c(64,69); height<-h[sex]+ rnorm(10,0,3)}, drop='h' ) my.df <- data.frame(sex=factor(rep(c('Male','Female'),each=5))) simdat <- simheight(my.df) t.test(height~sex, data=simdat) # a more general version, and have the expression predefined # (note that this assumes that the levels are Female, Male in that order) myexpr <- quote({ n <- length(sex) h <- c(64,69) height <- h[sex] + rnorm(n,0,3) }) simheight <- simfun(eval(myexpr), drop=c('n','h')) my.df <- data.frame(sex=factor(sample(rep(c('Male','Female'),c(5,10))))) (simdat <- simheight(my.df)) # similar to above, but use named parameter vector and index by names myexpr <- quote({ n <- length(sex) height <- h[ as.character(sex)] + rnorm(n,0,sig) }) simheight <- simfun(eval(myexpr), drop=c('n','h','sig'), h=c(Male=69,Female=64), sig=3) my.df <- data.frame(sex=factor(sample(c('Male','Female'),100, replace=TRUE))) (simdat <- simheight(my.df, seed='example')) # Create a function to simulate Sex and Height for a given sample size # (actually it will generate n males and n females for a total of 2*n samples) # then use it in a set of simulations simheight <- simfun( {sex <- factor(rep(c('Male','Female'),each=n)) height <- h[sex] + rnorm(2*n,0,s) }, drop=c('h','n'), h=c(64,69), s=3) (simdat <- simheight(list(n=10))) out5 <- replicate(1000, t.test(height~sex, data=simheight(list(n= 5)))$p.value) out15 <- replicate(1000, t.test(height~sex, data=simheight(list(n=15)))$p.value) mean(out5 <= 0.05) mean(out15 <= 0.05) # use a fixed population simstate <- simfun({ tmp <- state.df[as.character(State),] Population <- tmp[['Population']] Income <- tmp[['Income']] Illiteracy <- tmp[['Illiteracy']] }, state.df=as.data.frame(state.x77), drop=c('tmp','state.df')) simstate(data.frame(State=sample(state.name,10))) # Use simulation, but override setting the seed simheight <- simfun({ set.seed(1234) h <- c(64,69) sex <- factor(rep(c('Female','Male'),each=50)) height <- round(rnorm(100, rep(h,each=50),3),1) sex <- sex[ID] height <- height[ID] }, drop='h') (newdat <- simheight(list(ID=c(1:5,51:55)))) (newdat2<- simheight(list(ID=1:10))) # Using a fitted object fit <- lm(Fertility ~ . , data=swiss) simfert <- simfun({ Fertility <- predict(fit, newdata=data) Fertility <- Fertility + rnorm(length(Fertility),0,summary(fit)$sigma) }, drop=c('fit'), fit=fit) tmpdat <- as.data.frame(lapply(swiss[,-1], function(x) round(runif(100, min(x), max(x))))) names(tmpdat) <- names(swiss)[-1] fertdat <- simfert(tmpdat) head(fertdat) rbind(coef(fit), coef(lm(Fertility~., data=fertdat))) # simulate a nested mixed effects model simheight <- simfun({ n.city <- length(unique(city)) n.state <- length(unique(state)) n <- length(city) height <- h[sex] + rnorm(n.state,0,sig.state)[state] + rnorm(n.city,0,sig.city)[city] + rnorm(n,0,sig.e) }, sig.state=1, sig.city=0.5, sig.e=3, h=c(64,69), drop=c('sig.state','sig.city','sig.e','h','n.city','n.state','n')) tmpdat <- data.frame(state=gl(5,20), city=gl(10,10), sex=gl(2,5,length=100, labels=c('F','M'))) heightdat <- simheight(tmpdat) # similar to above, but include cost information, this assumes that # each new state costs $100, each new city is $10, and each subject is $1 # this shows 2 possible methods simheight <- simfun({ n.city <- length(unique(city)) n.state <- length(unique(state)) n <- length(city) height <- h[sex] + rnorm(n.state,0,sig.state)[state] + rnorm(n.city,0,sig.city)[city] + rnorm(n,0,sig.e) cost <- 100 * (!duplicated(state)) + 10*(!duplicated(city)) + 1 cat('The total cost for this design is $', 100*n.state+10*n.city+1*n, '\n', sep='') }, sig.state=1, sig.city=0.5, sig.e=3, h=c(64,69), drop=c('sig.state','sig.city','sig.e','h','n.city','n.state','n')) tmpdat <- data.frame(state=gl(5,20), city=gl(10,10), sex=gl(2,5,length=100, labels=c('F','M'))) heightdat <- simheight(tmpdat) sum(heightdat$cost) # another mixed model method simheight <- simfun({ state <- gl(n.state, n/n.state) city <- gl(n.city*n.state, n/n.city/n.state) sex <- gl(2, n.city, length=n, labels=c('F','M') ) height <- h[sex] + rnorm(n.state,0,sig.state)[state] + rnorm(n.city*n.state,0,sig.city)[city] + rnorm(n,0,sig.e) }, drop=c('n.state','n.city','n','sig.city','sig.state','sig.e','h')) heightdat <- simheight( list( n.state=5, n.city=2, n=100, sig.state=10, sig.city=3, sig.e=1, h=c(64,69) ))
slider
constructs a Tcl/Tk-widget with sliders and buttons
automated calculation and plotting. For example slider allows complete
all axes rotation of objects in a plot.
slider(sl.functions, sl.names, sl.mins, sl.maxs, sl.deltas, sl.defaults, but.functions, but.names, no, set.no.value, obj.name, obj.value, reset.function, title)
slider(sl.functions, sl.names, sl.mins, sl.maxs, sl.deltas, sl.defaults, but.functions, but.names, no, set.no.value, obj.name, obj.value, reset.function, title)
sl.functions |
set of functions or function connected to the slider(s) |
sl.names |
labels of the sliders |
sl.mins |
minimum values of the sliders' ranges |
sl.maxs |
maximum values of the sliders' ranges |
sl.deltas |
change of step per click |
sl.defaults |
default values for the sliders |
but.functions |
function or list of functions that are assigned to the button(s) |
but.names |
labels of the buttons |
no |
|
set.no.value |
|
obj.name |
|
obj.value |
|
reset.function |
function that comprises the commands of the
|
title |
title of the control window |
With slider you can:
a. define (multiple) sliders and buttons,
b. request or set slider values, and
c. request or set variables in the environment slider.env
.
Slider function management takes place in the
environment slider.env
. If slider.env
is not found it is generated.
Definition: ... of sliders: First of all you have to define sliders, buttons and the attributes of them.
Sliders are established by six arguments:
sl.functions, sl.names, sl.minima, sl.maxima,sl.deltas
, and sl.defaults
.
The first argument,
sl.functions
, is either a list of functions or a single function that
entails the commands for the sliders.
If there are three sliders and slider 2 is moved with the mouse the function stored in
sl.functions[[2]]
(or in case of one function for all sliders the function sl.functions
)
is called.
Definition: ... of buttons: Buttons are defined by a vector of labels but.names
and a list of functions:
but.functions
. If button i
is pressed the function stored in but.functions[[i]]
is called.
Requesting: ... a slider:
slider(no=1)
returns the actual value of slider 1,
slider(no=2)
returns the value of slider 2, etc.
You are allowed to include expressions of the type
slider(no=i)
in functions describing the effect of
sliders or buttons.
Setting: ... a slider: slider(set.no.value=c(2,333))
sets slider 2
to value 333.
slider(set.no.value=c(i,value))
can be included in the functions
defining the effects of moving sliders or pushing buttons.
Variables: ... of the environment slider.env
: Sometimes information
has to be trransferred back and forth between functions defining the effects of sliders and buttons.
Imagine for example two sliders: one to control p
and another one to control q
,
but they should satisfy: p+q=1
. Consequently, you have to correct the value of the first slider
after the second one was moved. To prevent the creation of global
variables store them in the environment slider.env
.
Use slider(obj.name="p.save",obj.value=1-slider(no=2))
to assign value 1-slider(no=2)
to the variable p.save
.
slider(obj.name=p.save)
returns the value of variable p.save
.
Using slider
in definition mode slider
returns the value of new created the top level widget.
slider(no=i)
returns the actual value of slider i
.
slider(obj.name=name)
returns the value of variable name
in environment slider.env
.
You can move the slider in 3 different ways: You can left click and drag the slider itself, you can left click in the trough to either side of the slider and the slider will move 1 unit in the direction you clicked, or you can right click in the trough and the slider will jump to the location you clicked at.
This function may not stay in this package (consider it
semi-depricated), the original of the slider function is in the relax
package and can be used from there. In TeachingDemos the
tkexamp
function is taking the place of slider
and gives a possibly
more general approach.
Hans Peter Wolf
# example 1, sliders only ## Not run: ## This example cannot be run by examples() but should work in an interactive R session plot.sample.norm<-function(){ refresh.code<-function(...){ mu<-slider(no=1); sd<-slider(no=1); n<-slider(no=3) x<-rnorm(n,mu,sd) plot(x) } slider(refresh.code,sl.names=c("value of mu","value of sd","n number of observations"), sl.mins=c(-10,.01,5),sl.maxs=c(+10,50,100),sl.deltas=c(.01,.01,1),sl.defaults=c(0,1,20)) } plot.sample.norm() ## End(Not run) # example 2, sliders and buttons ## Not run: ## This example cannot be run by examples() but should work in an interactive R session plot.sample.norm.2<-function(){ refresh.code<-function(...){ mu<-slider(no=1); sd<-slider(no=2); n<-slider(no=3) type= slider(obj.name="type") x<-rnorm(n,mu,sd) plot(seq(x),x,ylim=c(-20,20),type=type) } slider(refresh.code,sl.names=c("value of mu","value of sd","n number of observations"), sl.mins=c(-10,.01,5),sl.maxs=c(10,10,100),sl.deltas=c(.01,.01,1),sl.defaults=c(0,1,20), but.functions=list( function(...){slider(obj.name="type",obj.value="l");refresh.code()}, function(...){slider(obj.name="type",obj.value="p");refresh.code()}, function(...){slider(obj.name="type",obj.value="b");refresh.code()} ), but.names=c("lines","points","both")) slider(obj.name="type",obj.value="l") } plot.sample.norm.2() ## End(Not run) # example 3, dependent sliders ## Not run: ## This example cannot be run by examples() but should work in an interactive R session print.of.p.and.q<-function(){ refresh.code<-function(...){ p.old<-slider(obj.name="p.old") p<-slider(no=1); if(abs(p-p.old)>0.001) {slider(set.no.value=c(2,1-p))} q<-slider(no=2); if(abs(q-(1-p))>0.001) {slider(set.no.value=c(1,1-q))} slider(obj.name="p.old",obj.value=p) cat("p=",p,"q=",1-p,"\n") } slider(refresh.code,sl.names=c("value of p","value of q"), sl.mins=c(0,0),sl.maxs=c(1,1),sl.deltas=c(.01,.01),sl.defaults=c(.2,.8)) slider(obj.name="p.old",obj.value=slider(no=1)) } print.of.p.and.q() ## End(Not run) # example 4, rotating a surface ## Not run: ## This example cannot be run by examples() but should work in an interactive R session R.veil.in.the.wind<-function(){ # Mark Hempelmann / Peter Wolf par(bg="blue4", col="white", col.main="white", col.sub="white", font.sub=2, fg="white") # set colors and fonts samp <- function(N,D) N*(1/4+D)/(1/4+D*N) z<-outer(seq(1, 800, by=10), seq(.0025, 0.2, .0025)^2/1.96^2, samp) # create 3d matrix h<-100 z[10:70,20:25]<-z[10:70,20:25]+h; z[65:70,26:45]<-z[65:70,26:45]+h z[64:45,43:48]<-z[64:45,43:48]+h; z[44:39,26:45]<-z[44:39,26:45]+h x<-26:59; y<-11:38; zz<-outer(x,y,"+"); zz<-zz*(65<zz)*(zz<73) cz<-10+col(zz)[zz>0];rz<-25+row(zz)[zz>0]; z[cbind(cz,rz)]<-z[cbind(cz,rz)]+h refresh.code<-function(...){ theta<-slider(no=1); phi<-slider(no=2) persp(x=seq(1,800,by=10),y=seq(.0025,0.2,.0025),z=z,theta=theta,phi=phi, scale=T, shade=.9, box=F, ltheta = 45, lphi = 45, col="aquamarine", border="NA",ticktype="detailed") } slider(refresh.code, c("theta", "phi"), c(0, 0),c(360, 360),c(.2, .2),c(85, 270) ) } R.veil.in.the.wind() ## End(Not run)
# example 1, sliders only ## Not run: ## This example cannot be run by examples() but should work in an interactive R session plot.sample.norm<-function(){ refresh.code<-function(...){ mu<-slider(no=1); sd<-slider(no=1); n<-slider(no=3) x<-rnorm(n,mu,sd) plot(x) } slider(refresh.code,sl.names=c("value of mu","value of sd","n number of observations"), sl.mins=c(-10,.01,5),sl.maxs=c(+10,50,100),sl.deltas=c(.01,.01,1),sl.defaults=c(0,1,20)) } plot.sample.norm() ## End(Not run) # example 2, sliders and buttons ## Not run: ## This example cannot be run by examples() but should work in an interactive R session plot.sample.norm.2<-function(){ refresh.code<-function(...){ mu<-slider(no=1); sd<-slider(no=2); n<-slider(no=3) type= slider(obj.name="type") x<-rnorm(n,mu,sd) plot(seq(x),x,ylim=c(-20,20),type=type) } slider(refresh.code,sl.names=c("value of mu","value of sd","n number of observations"), sl.mins=c(-10,.01,5),sl.maxs=c(10,10,100),sl.deltas=c(.01,.01,1),sl.defaults=c(0,1,20), but.functions=list( function(...){slider(obj.name="type",obj.value="l");refresh.code()}, function(...){slider(obj.name="type",obj.value="p");refresh.code()}, function(...){slider(obj.name="type",obj.value="b");refresh.code()} ), but.names=c("lines","points","both")) slider(obj.name="type",obj.value="l") } plot.sample.norm.2() ## End(Not run) # example 3, dependent sliders ## Not run: ## This example cannot be run by examples() but should work in an interactive R session print.of.p.and.q<-function(){ refresh.code<-function(...){ p.old<-slider(obj.name="p.old") p<-slider(no=1); if(abs(p-p.old)>0.001) {slider(set.no.value=c(2,1-p))} q<-slider(no=2); if(abs(q-(1-p))>0.001) {slider(set.no.value=c(1,1-q))} slider(obj.name="p.old",obj.value=p) cat("p=",p,"q=",1-p,"\n") } slider(refresh.code,sl.names=c("value of p","value of q"), sl.mins=c(0,0),sl.maxs=c(1,1),sl.deltas=c(.01,.01),sl.defaults=c(.2,.8)) slider(obj.name="p.old",obj.value=slider(no=1)) } print.of.p.and.q() ## End(Not run) # example 4, rotating a surface ## Not run: ## This example cannot be run by examples() but should work in an interactive R session R.veil.in.the.wind<-function(){ # Mark Hempelmann / Peter Wolf par(bg="blue4", col="white", col.main="white", col.sub="white", font.sub=2, fg="white") # set colors and fonts samp <- function(N,D) N*(1/4+D)/(1/4+D*N) z<-outer(seq(1, 800, by=10), seq(.0025, 0.2, .0025)^2/1.96^2, samp) # create 3d matrix h<-100 z[10:70,20:25]<-z[10:70,20:25]+h; z[65:70,26:45]<-z[65:70,26:45]+h z[64:45,43:48]<-z[64:45,43:48]+h; z[44:39,26:45]<-z[44:39,26:45]+h x<-26:59; y<-11:38; zz<-outer(x,y,"+"); zz<-zz*(65<zz)*(zz<73) cz<-10+col(zz)[zz>0];rz<-25+row(zz)[zz>0]; z[cbind(cz,rz)]<-z[cbind(cz,rz)]+h refresh.code<-function(...){ theta<-slider(no=1); phi<-slider(no=2) persp(x=seq(1,800,by=10),y=seq(.0025,0.2,.0025),z=z,theta=theta,phi=phi, scale=T, shade=.9, box=F, ltheta = 45, lphi = 45, col="aquamarine", border="NA",ticktype="detailed") } slider(refresh.code, c("theta", "phi"), c(0, 0),c(360, 360),c(.2, .2),c(85, 270) ) } R.veil.in.the.wind() ## End(Not run)
Create a Tk slider window with the sliders positioned vertically instead of horizontally.
sliderv(refresh.code, names, minima, maxima, resolutions, starts, title = "control", no = 0, set.no.value = 0)
sliderv(refresh.code, names, minima, maxima, resolutions, starts, title = "control", no = 0, set.no.value = 0)
refresh.code |
Function to be called when sliders are moved. |
names |
Labels for the sliders. |
minima |
Vector of minimum values for the sliders. |
maxima |
Vector of maximum values for the sliders. |
resolutions |
Vector of resolutions for the sliders. |
starts |
Vector of starting values for the sliders. |
title |
Title to put at the top of the Tk box. |
no |
The number of the slider whose value you want. |
set.no.value |
Vector of length 2 with the number of slider to set and the new value. |
This is a variation on the slider
function with vertical sliders
arranged in a row rather than horizontal sliders arranged in a column.
This is based on an early version of slider
and therefore does
not have as many bells and whistles (but sometimes fits the screen better).
Returns the value of a given slider when used as: slider(no=i)
.
You can move the slider in 3 different ways: You can left click and drag the slider itself, you can left click in the trough to either side of the slider and the slider will move 1 unit in the direction you clicked, or you can right click in the trough and the slider will jump to the location you clicked at.
This function may not stay in this package (consider it
semi-depricated). See the tkexamp
function for another
approach to do the same thing.
Greg Snow [email protected]
if(interactive()){ face.refresh <- function(...){ vals <- sapply(1:15, function(x) slider(no=x)) faces( rbind(0, vals, 1), scale=F) } sliderv( face.refresh, as.character(1:15), rep(0,15), rep(1,15), rep(0.05, 15), rep(0.5,15), title='Face Demo') }
if(interactive()){ face.refresh <- function(...){ vals <- sapply(1:15, function(x) slider(no=x)) faces( rbind(0, vals, 1), scale=F) } sliderv( face.refresh, as.character(1:15), rep(0,15), rep(1,15), rep(0.05, 15), rep(0.5,15), title='Face Demo') }
This is a hypothesis test designed to be correctly sized in that the probability of rejecting the null when it is true will be equal to your alpha level. Other than that it is a pretty useless test mainly intended for when people say something like "I just need a p-value".
SnowsCorrectlySizedButOtherwiseUselessTestOfAnything(x, data.name = deparse(substitute(x)), alternative = "You Are Lucky", ..., seed)
SnowsCorrectlySizedButOtherwiseUselessTestOfAnything(x, data.name = deparse(substitute(x)), alternative = "You Are Lucky", ..., seed)
x |
The data, or nothing, or something equally irrelevant |
data.name |
The name of the data for the output |
alternative |
The phrase for the alternate hypothesis in the output |
... |
Additional arguments that will be silently ignored (like
|
seed |
A seed (numeric or character) used to seed the random number generator. Use this or manually set the seed if you want reproducible (but still meaningless) results |
Some of the advantages/disadvantages of this test include:
The probability of a Type I error is alpha
Power can be easily computed (it is alpha)
Power is independent of the sample size
Power is independent of the hypotheses
This test is not affected by missing data (present data either)
This test does not depend on any distributional or independence assumptions
An object of class htest with the following elements:
p.value |
The p-value |
statistic |
The test statistic (identical to the p-value) |
data.name |
The name of the data (if any) |
method |
The name of the test |
alternative |
a phrase representing the alternative hypothesis |
seed |
optionally the seed that was used |
If someone has suggested that you consider this test, they most likely do not intend for you to actually use the test, rather to reconsider your question or the assumptions that you are making or trying to avoid. This test should only be used to illustrate a point and decisions (other than maybe who should pay for lunch) should never be made based on the results of this test.
Greg Snow [email protected]
The author is unlikely to be willing to publish in any "journal" that would be willing to publish this test.
fortune(264)
SnowsCorrectlySizedButOtherwiseUselessTestOfAnything(log(rnorm(100)))
SnowsCorrectlySizedButOtherwiseUselessTestOfAnything(log(rnorm(100)))
This function tests the null hypothesis that the data comes from an exact normal population. This is a much less interesting/useful null than what people usually want, which is to know if the data come from a distribution that is similar enough to the normal to use normal theory inference.
SnowsPenultimateNormalityTest(x)
SnowsPenultimateNormalityTest(x)
x |
The data |
The theory for this test is based on the probability of getting a rational number from a truly continuous distribution defined on the reals.
The main goal of this test is to quickly give a p-value for those that feel it necessary to test the uninteresting and uninformative null hypothesis that the data represents an exact normal, and allows the user to then move on to much more important questions, like "is the data close enough to the normal to use normal theory inference?".
After running this test (or better instead of running this and any other test of normality) you should ask yourself what it means to test for normality and why you would want to do so. Then plot the data and explore the interesting/useful questions.
An object of class "htest" with components:
p.value |
The p-value |
alternative |
a string representing the alternative hypothesis |
method |
a string describing the method |
data.name |
a string describing the name of the data |
Note: if you just use this function and report the p-value then the function has failed in its purpose. If this function helps you to think about your analysis and what question(s) you are really interested in, create meaningful plots, and focus on the more meaningful parts of research, then it has succeeded. See also Cochrane's Aphorism.
Greg Snow [email protected]
fortune(234)
SnowsPenultimateNormalityTest(rt(100,25))
SnowsPenultimateNormalityTest(rt(100,25))
This function takes as set of coordinates and spreads out the close values so that they can be used in labeling plots without overlapping.
spread.labs(x, mindiff, maxiter = 1000, stepsize = 1/10, min = -Inf, max = Inf)
spread.labs(x, mindiff, maxiter = 1000, stepsize = 1/10, min = -Inf, max = Inf)
x |
The coordinate values (x or y, not both) to spread out. |
mindiff |
The minimum distance between return values |
maxiter |
The maximum number of iterations |
stepsize |
How far to move values in each iteration |
min |
Minimum bound for returned values |
max |
Maximum bound for returned values |
Sometimes the desired locations for labels in plots results in the
labels overlapping. This function takes the coordinate values (x or
y, not both) and finds those points that are less than mindiff
(usually a function of strheight
or strwidth
) apart and
increases the space between them (by stepsize
*
mindiff
). This may or may not be enough and moving some points
away from their nearest neighbor may move them too close to another
neighbor, so the process is iterated until either maxiter
steps
have been tried, or all the values are at least mindiff
apart.
The min
and max
arguments prevent the values from going
outside that range (they should be specified such that the original
values are all inside the range).
The values do not need to be presorted.
A vector of coordinates (order corresponding to the original x
)
that can be used as a replacement for x
in placing labels.
Greg Snow, [email protected]
text
, the spread.labels
function in the
plotrix
package.
# overlapping labels plot(as.integer(state.region), state.x77[,1], ylab='Population', xlab='Region',xlim=c(1,4.75), xaxt='n') axis(1, at=1:4, lab=levels(state.region) ) text( as.integer(state.region)+.5, state.x77[,1], state.abb ) segments( as.integer(state.region)+0.025, state.x77[,1], as.integer(state.region)+.375, state.x77[,1] ) # now lets redo the plot without overlap tmp.y <- state.x77[,1] for(i in levels(state.region) ) { tmp <- state.region == i tmp.y[ tmp ] <- spread.labs( tmp.y[ tmp ], 1.2*strheight('A'), maxiter=1000, min=0 ) } plot(as.integer(state.region), state.x77[,1], ylab='Population', xlab='Region', xlim=c(1,4.75), xaxt='n') axis(1, at=1:4, lab=levels(state.region) ) text( as.integer(state.region)+0.5, tmp.y, state.abb ) segments( as.integer(state.region)+0.025, state.x77[,1], as.integer(state.region)+0.375, tmp.y )
# overlapping labels plot(as.integer(state.region), state.x77[,1], ylab='Population', xlab='Region',xlim=c(1,4.75), xaxt='n') axis(1, at=1:4, lab=levels(state.region) ) text( as.integer(state.region)+.5, state.x77[,1], state.abb ) segments( as.integer(state.region)+0.025, state.x77[,1], as.integer(state.region)+.375, state.x77[,1] ) # now lets redo the plot without overlap tmp.y <- state.x77[,1] for(i in levels(state.region) ) { tmp <- state.region == i tmp.y[ tmp ] <- spread.labs( tmp.y[ tmp ], 1.2*strheight('A'), maxiter=1000, min=0 ) } plot(as.integer(state.region), state.x77[,1], ylab='Population', xlab='Region', xlim=c(1,4.75), xaxt='n') axis(1, at=1:4, lab=levels(state.region) ) text( as.integer(state.region)+0.5, tmp.y, state.abb ) segments( as.integer(state.region)+0.025, state.x77[,1], as.integer(state.region)+0.375, tmp.y )
Adjusts the plotting area to a specific aspect ratio. This is
different from using the asp
argument in that it puts the extra
space in the margins rather than inside the plotting region.
squishplot(xlim, ylim, asp = 1, newplot=TRUE)
squishplot(xlim, ylim, asp = 1, newplot=TRUE)
xlim |
The x limits of the plot, or the entire x vector. |
ylim |
The y limits of the plot, or the entire y vector. |
asp |
The y/x aspect ratio. |
newplot |
Should plot.new() be called before making the calculations. |
This function sets the plot area of the current graph device so that the following plot command will plot with the specified aspect ratio.
This is different from using the asp
argument to
plot.default
in where the created white space goes (see the
example). Using plot.default
will place the whitespace within
the plotting region and can result in the axes and annotations being
quite far from the actual data. This command sets up the plotting
region so that the extra whitespace is in the margin areas and moves
the axes and annotations close to the data.
Any other desired parameter settings or resizing of the graphics
device should be set before calling
squishplot
, especially settings dealing with multiple figures
or margin areas.
After plotting, the parameters need to be reset or later plots may come out wrong.
Invisible list containing the 'plt
' values from par
that
were in place before the call to squishplot
that can be used to
reset the graphical parameters after plotting is finished.
Remember to set other graphical parameters, then call
squishplot
, then call the plotting function(s), then reset the
parameters.
Greg Snow [email protected]
plot.default
, plot.window
,
par
x <- rnorm(25, 10, 2 ) y <- 5 + 1.5*x + rnorm(25,0,2) par(mfrow=c(1,3)) plot(x,y) op <- squishplot(x,y,1) plot(x,y) par(op) plot(x,y, asp=1)
x <- rnorm(25, 10, 2 ) y <- 5 + 1.5*x + rnorm(25,0,2) par(mfrow=c(1,3)) plot(x,y) op <- squishplot(x,y,1) plot(x,y) par(op) plot(x,y, asp=1)
This is the export from a pedometer worn for nearly about 11 months by the package author.
data(steps)
data(steps)
A data frame with 331 observations on the following 79 variables.
Date
The Date for the given data
Total.Steps
Total Steps recorded for the day
Aerobic.Steps
Total Aerobic Steps recorded for the day (see below)
Aerobic.Walking.Time
Time spent in aerobic walking for the day
Calories
Estimated calories burned for the day
Distance
Estimated distance walked for the day in miles
Fat.Burned
Estimated grams of fat burned by walking
Steps.12AM
Steps recorded between Midnight and 1 am
Steps.1AM
Steps recorded between 1 am and 2 am
Steps.2AM
Steps recorded between 2 am and 3 am
Steps.3AM
Steps recorded between 3 am and 4 am
Steps.4AM
Steps recorded between 4 am and 5 am
Steps.5AM
Steps recorded between 5 am and 6 am
Steps.6AM
Steps recorded between 6 am and 7 am
Steps.7AM
Steps recorded between 7 am and 8 am
Steps.8AM
Steps recorded between 8 am and 9 am
Steps.9AM
Steps recorded between 9 am and 10 am
Steps.10AM
Steps recorded between 10 am and 11 am
Steps.11AM
Steps recorded between 11 am and Noon
Steps.12PM
Steps recorded between Noon and 1 pm
Steps.1PM
Steps recorded between 1 pm and 2 pm
Steps.2PM
Steps recorded between 2 pm and 3 pm
Steps.3PM
Steps recorded between 3 pm and 4 pm
Steps.4PM
Steps recorded between 4 pm and 5 pm
Steps.5PM
Steps recorded between 5 pm and 6 pm
Steps.6PM
Steps recorded between 6 pm and 7 pm
Steps.7PM
Steps recorded between 7 pm and 8 pm
Steps.8PM
Steps recorded between 8 pm and 9 pm
Steps.9PM
Steps recorded between 9 pm and 10 pm
Steps.10PM
Steps recorded between 10 pm and 11 pm
Steps.11PM
Steps recorded between 11 pm and Midnight
Aerobic.Steps.12AM
Aerobic steps recorded between Midnight and 1 am
Aerobic.Steps.1AM
Aerobic steps recorded between 1 am and 2 am
Aerobic.Steps.2AM
Aerobic steps recorded between 2 am and 3 am
Aerobic.Steps.3AM
Aerobic steps recorded between 3 am and 4 am
Aerobic.Steps.4AM
Aerobic steps recorded between 4 am and 5 am
Aerobic.Steps.5AM
Aerobic steps recorded between 5 am and 6 am
Aerobic.Steps.6AM
Aerobic steps recorded between 6 am and 7 am
Aerobic.Steps.7AM
Aerobic steps recorded between 7 am and 8 am
Aerobic.Steps.8AM
Aerobic steps recorded between 8 am and 9 am
Aerobic.Steps.9AM
Aerobic steps recorded between 9 am and 10 am
Aerobic.Steps.10AM
Aerobic steps recorded between 10 am and 11 am
Aerobic.Steps.11AM
Aerobic steps recorded between 11 am and Noon
Aerobic.Steps.12PM
Aerobic steps recorded between Noon and 1 pm
Aerobic.Steps.1PM
Aerobic steps recorded between 1 pm and 2 pm
Aerobic.Steps.2PM
Aerobic steps recorded between 2 pm and 3 pm
Aerobic.Steps.3PM
Aerobic steps recorded between 3 pm and 4 pm
Aerobic.Steps.4PM
Aerobic steps recorded between 4 pm and 5 pm
Aerobic.Steps.5PM
Aerobic steps recorded between 5 pm and 6 pm
Aerobic.Steps.6PM
Aerobic steps recorded between 6 pm and 7 pm
Aerobic.Steps.7PM
Aerobic steps recorded between 7 pm and 8 pm
Aerobic.Steps.8PM
Aerobic steps recorded between 8 pm and 9 pm
Aerobic.Steps.9PM
Aerobic steps recorded between 9 pm and 10 pm
Aerobic.Steps.10PM
Aerobic steps recorded between 10 pm and 11 pm
Aerobic.Steps.11PM
Aerobic steps recorded between 11 pm and Midnight
Used.12AM
Any movement detected between Midnight and 1 am
Used.1AM
Any movement detected between 1 am and 2 am
Used.2AM
Any movement detected between 2 am and 3 am
Used.3AM
Any movement detected between 3 am and 4 am
Used.4AM
Any movement detected between 4 am and 5 am
Used.5AM
Any movement detected between 5 am and 6 am
Used.6AM
Any movement detected between 6 am and 7 am
Used.7AM
Any movement detected between 7 am and 8 am
Used.8AM
Any movement detected between 8 am and 9 am
Used.9AM
Any movement detected between 9 am and 10 am
Used.10AM
Any movement detected between 10 am and 11 am
Used.11AM
Any movement detected between 11 am and Noon
Used.12PM
Any movement detected between Noon and 1 pm
Used.1PM
Any movement detected between 1 pm and 2 pm
Used.2PM
Any movement detected between 2 pm and 3 pm
Used.3PM
Any movement detected between 3 pm and 4 pm
Used.4PM
Any movement detected between 4 pm and 5 pm
Used.5PM
Any movement detected between 5 pm and 6 pm
Used.6PM
Any movement detected between 6 pm and 7 pm
Used.7PM
Any movement detected between 7 pm and 8 pm
Used.8PM
Any movement detected between 8 pm and 9 pm
Used.9PM
Any movement detected between 9 pm and 10 pm
Used.10PM
Any movement detected between 10 pm and 11 pm
Used.11PM
Any movement detected between 11 pm and Midnight
data(steps) ## maybe str(steps) ; plot(steps) ...
data(steps) ## maybe str(steps) ; plot(steps) ...
Data invented by Neyman to look at spurious correlations and adjusting for lurking variables by looking at the relationship between storks and biths.
data(stork)
data(stork)
A data frame with 54 observations on the following 6 variables.
County
ID of county
Women
Number of Women (*10,000)
No.storks
Number of Storks sighted
No.babies
Number of Babies Born
Stork.rate
Storks per 10,000 women (=No.storks/Women)
Birth.rate
Babies per 10,000 women (=No.babies/Women)
This is an entertaining example to show a relationship that is due to a third possibly lurking variable. The source paper shows how completely different relationships can be found by mis-analyzing the data.
Kronmal, Richard A. (1993) Spurious Cerrolation and the Fallacy of the Ratio Standard Revisited. Journal of the Royal Statistical Society. Series A, Vol. 156, No. 3, 379-392.
Neyman, J. (1952) Lectures and Conferences on Mathematical Statistics and Probability, 2nd edn, pp. 143-154. Washington DC: US Department of Agriculture.
data(stork) pairs(stork[,-1], panel=panel.smooth) ## maybe str(stork) ; plot(stork) ...
data(stork) pairs(stork[,-1], panel=panel.smooth) ## maybe str(stork) ; plot(stork) ...
Subplot will embed a new plot within an existing plot at the coordinates specified (in user units of the existing plot).
subplot(fun, x, y, size=c(1,1), vadj=0.5, hadj=0.5, inset=c(0,0), type=c('plt','fig'), pars=NULL)
subplot(fun, x, y, size=c(1,1), vadj=0.5, hadj=0.5, inset=c(0,0), type=c('plt','fig'), pars=NULL)
fun |
an expression defining the new plot to be embedded. |
x |
|
y |
|
size |
The size of the embedded plot in inches if |
vadj |
vertical adjustment of the plot when |
hadj |
horizontal adjustment of the plot when |
inset |
1 or 2 numbers representing the proportion of the plot to inset the subplot from edges when x is a character string. The first element is the horizontal inset, the second is the vertical inset. |
type |
Character string, if 'plt' then the plotting region is
defined by |
pars |
a list of parameters to be passed to |
The coordinates x
and y
can be scalars or vectors of
length 2. If vectors of length 2 then they determine the opposite
corners of the rectangle for the embedded plot (and the parameters
size
, vadj
, and hadj
are all ignored).
If x
and y
are given as scalars then the plot position
relative to the point and the size of the plot will be determined by
the arguments size
, vadj
, and hadj
. The default
is to center a 1 inch by 1 inch plot at x,y
. Setting
vadj
and hadj
to (0,0)
will position the lower
left corner of the plot at (x,y)
.
If x
is a character string, then it will be parsed for the
strings "left", "right", "top", and "bottom" and x and y will be set
appropriately (anything not specified will be set at the center in
that dimension) using also the inset
argument. This allows the
position of the subplot to be specified as 'topleft' or 'bottom', etc.
The inset
argument is in proportion of the plot units, so 0.1
means inset 10% of the width/height of the plotting distance. If
hadj
/vadj
are not specified, they will be set appropriately.
The rectangle defined by x
, y
, size
, vadj
,
and hadj
will be used as the plotting area of the new plot.
Any tick marks, axis labels, main and sub titles will be outside of
this rectangle if type
is 'plt'. If type is 'fig' then the
annotations will be inside the box.
Any graphical parameter settings that you would like to be in place
before fun
is evaluated can be specified in the pars
argument (warning: specifying layout parameters here (plt
,
mfrow
, etc.) may cause unexpected results).
After the function completes the graphical parameters will have been reset to what they were before calling the function (so you can continue to augment the original plot).
An invisible list with the graphical parameters that were in effect
when the subplot was created. Passing this list to par
will
enable you to augment the embedded plot.
Greg Snow [email protected]
grconvertX
, par
,
symbols
, my.symbols
, ms.image
# make an original plot plot( 11:20, sample(51:60) ) # add some histograms subplot( hist(rnorm(100)), 15, 55) subplot( hist(runif(100),main='',xlab='',ylab=''), 11, 51, hadj=0, vadj=0) subplot( hist(rexp(100, 1/3)), 20, 60, hadj=1, vadj=1, size=c(0.5,2) ) subplot( hist(rt(100,3)), c(12,16), c(57,59), pars=list(lwd=3,ask=FALSE) ) ### some of the following examples work fine in an interactive session, ### but loading the packages required does not work well in testing. # augment a map if( interactive() && require(spData) ){ plot(state.vbm,fg=NULL) tmp <- cbind( state.vbm$center_x, state.vbm$center_y ) for( i in 1:50 ){ tmp2 <- as.matrix(USArrests[i,c(1,4)]) tmp3 <- max(USArrests[,c(1,4)]) subplot( barplot(tmp2, ylim=c(0,tmp3),names=c('',''),yaxt='n'), x=tmp[i,1], y=tmp[i,2], size=c(.1,.1)) } } tmp <- rnorm(25) qqnorm(tmp) qqline(tmp) tmp2 <- subplot( hist(tmp,xlab='',ylab='',main=''), grconvertX(0.1,from='npc'), grconvertY(0.9,from='npc'), vadj=1, hadj=0 ) abline(v=0, col='red') # wrong way to add a reference line to histogram # right way to add a reference line to histogram op <- par(no.readonly=TRUE) par(tmp2) abline(v=0, col='green') par(op) # scatter-plot using images if(interactive() && require(png)) { image.png <- function(x,...) { cols <- rgb( x[,,1], x[,,2], x[,,3], x[,,4] ) z <- 1:length(cols) dim(z) <- dim(x[,,1]) z <- t(z) z <- z[ ,rev(seq_len(ncol(z))) ] image(z, col=cols, axes=FALSE, ...) } logo <- readPNG(system.file("img", "Rlogo.png", package="png")) x <- runif(10) y <- runif(10) plot(x,y, type='n') for(i in 1:10) { subplot(image.png(logo), x[i], y[i], size=c(0.3,0.3)) } }
# make an original plot plot( 11:20, sample(51:60) ) # add some histograms subplot( hist(rnorm(100)), 15, 55) subplot( hist(runif(100),main='',xlab='',ylab=''), 11, 51, hadj=0, vadj=0) subplot( hist(rexp(100, 1/3)), 20, 60, hadj=1, vadj=1, size=c(0.5,2) ) subplot( hist(rt(100,3)), c(12,16), c(57,59), pars=list(lwd=3,ask=FALSE) ) ### some of the following examples work fine in an interactive session, ### but loading the packages required does not work well in testing. # augment a map if( interactive() && require(spData) ){ plot(state.vbm,fg=NULL) tmp <- cbind( state.vbm$center_x, state.vbm$center_y ) for( i in 1:50 ){ tmp2 <- as.matrix(USArrests[i,c(1,4)]) tmp3 <- max(USArrests[,c(1,4)]) subplot( barplot(tmp2, ylim=c(0,tmp3),names=c('',''),yaxt='n'), x=tmp[i,1], y=tmp[i,2], size=c(.1,.1)) } } tmp <- rnorm(25) qqnorm(tmp) qqline(tmp) tmp2 <- subplot( hist(tmp,xlab='',ylab='',main=''), grconvertX(0.1,from='npc'), grconvertY(0.9,from='npc'), vadj=1, hadj=0 ) abline(v=0, col='red') # wrong way to add a reference line to histogram # right way to add a reference line to histogram op <- par(no.readonly=TRUE) par(tmp2) abline(v=0, col='green') par(op) # scatter-plot using images if(interactive() && require(png)) { image.png <- function(x,...) { cols <- rgb( x[,,1], x[,,2], x[,,3], x[,,4] ) z <- 1:length(cols) dim(z) <- dim(x[,,1]) z <- t(z) z <- z[ ,rev(seq_len(ncol(z))) ] image(z, col=cols, axes=FALSE, ...) } logo <- readPNG(system.file("img", "Rlogo.png", package="png")) x <- runif(10) y <- runif(10) plot(x,y, type='n') for(i in 1:10) { subplot(image.png(logo), x[i], y[i], size=c(0.3,0.3)) } }
This function plots a dataset in a Tk window then places 3 lines on the plot which show a predicted y value for the given x value. The lines can be clicked on and dragged to new x-values with the predicted y-values automatically updating. A table at the bottom of the graph shows the differences between the pairs of x-values and y-values.
TkApprox(x, y, type = "b", snap.to.x = FALSE, digits = 4, cols = c("red", "#009900", "blue"), xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), hscale = 1.5, vscale = 1.5, wait = TRUE, ...)
TkApprox(x, y, type = "b", snap.to.x = FALSE, digits = 4, cols = c("red", "#009900", "blue"), xlab = deparse(substitute(x)), ylab = deparse(substitute(y)), hscale = 1.5, vscale = 1.5, wait = TRUE, ...)
x |
The x-values of the data, should be sorted |
y |
The corresponding y-values of the data |
type |
Type of plot (lines, points, both) passed to |
snap.to.x |
If True then the lines will snap to x-values (can be changed with a checkbox in the Tk window) |
digits |
Number of significant digits to display (passed to
|
cols |
Vector of 3 colors, used for the reference lines |
xlab |
Label for x-axis |
ylab |
Label for y-axis |
hscale |
Horizontal Scale of the plot, passed to |
vscale |
Vertical Scale of the plot, passed to |
wait |
Should R wait for the window to be closed |
... |
Additional parameters passed to |
This provides an interactive way to explore predictions from a set of
x and y values. Internally the function approxfun
is used to
make the predictions.
The x-value of the 3 reference lines can be changed by clicking and dragging the line to a new position. The x and y values are shown in the margins of the graph. Below the graph is a table with the differences (absolute value) between the pairs of points.
This can be used to find peaks/valleys in trends and to see how they differ from starting points, other peaks/valleys, etc..
If wait
is FALSE then an invisible NULL is returned, if
wait
is TRUE then an invisible list with the x and y values of
the 3 reference lines is returned.
Greg Snow [email protected]
if(interactive()) { with(ccc, TkApprox(Time2,Elevation)) }
if(interactive()) { with(ccc, TkApprox(Time2,Elevation)) }
Creates a Tk window with a scatterplot matrix, then allows you to "brush" the points to change their color and/or style.
tkBrush(mat,hscale=1.75,vscale=1.75,wait=TRUE,...)
tkBrush(mat,hscale=1.75,vscale=1.75,wait=TRUE,...)
mat |
A matrix of the data to plot, columns are variables, rows
are observations, same as |
hscale |
Passed to |
vscale |
Passed to |
wait |
Should the function wait for you to finish, see below |
... |
Additional arguments passed to the panel functions |
This function creates a Tk window with a pairs plot of mat
,
then allows you to interactively move a rectangle (the brush) over the
points to change their color and plotting character.
The arrow keys can be used to change the size and shape of the brush. The left arrow makes the rectangle wider, the right makes it narrower. The up arrow key makes it taller, the right makes it shorter.
When the mouse button is not pressed the points inside the brush will change while in the brush, but return to their previous state when the brush moves off them. If the mouse button is pressed then the points inside the brush will be changed and the change will remain until a different set of conditions is brushed on them.
The style of the brushed points is determined by the values of the 2
entry boxes on the right side of the plot. You can specify the
plotting character in the pch
box, this can be anything that
you would regularly pass to the pch
argument of points
,
e.g. an integer or single character. You can specify the color of the
brushed points using the color
entry box, specify the name of
any color recognized by R (see colors
), if this box does not
contain a legal color name then black will be used.
If wait
is FALSE then the Tk window will exist independently of
R and you can continue to do other things in the R window, in this
case the function returns NULL. If wait
is TRUE then R waits
for you to close the Tk window (using the quit button) then returns a
list with the colors and plotting characters resulting from your
brushing, this information can be used to recreate the plot using
pairs
on a new graphics device (for printing or saving).
Either NULL (if Wait=FALSE) or a list with components col
and
pch
corresponding to the state of the points.
Greg Snow [email protected]
pairs
,colors
,points
,
the iplots
package
if(interactive()){ # Iris dataset out1 <- tkBrush(iris) # Now brush the points pairs(iris, col=out1$col, pch=out1$pch) # or colhist <- function(x,...){ tmp <- hist(x,plot=F) br <- tmp$breaks w <- as.numeric(cut(x,br,include.lowest=TRUE)) sy <- unlist(lapply(tmp$counts,function(x)seq(length=x))) my <- max(sy) sy <- sy/my my <- 1/my sy <- sy[order(order(x))] tmp.usr <- par('usr'); on.exit(par(usr=tmp.usr)) par(usr=c(tmp.usr[1:2],0,1.5)) rect(br[w], sy-my, br[w+1], sy, col=out1$col, # note out1$col is hardcoded here. border=NA) rect(br[-length(br)], 0, br[-1], tmp$counts*my) } pairs(iris, col=out1$col, pch=out1$pch, diag.panel=colhist) # some spheres s1 <- matrix(nrow=0,ncol=3) while( nrow(s1) < 1000 ){ tmp <- rnorm(3) if( sum(tmp^2) <= 1 ){ s1 <- rbind(s1,tmp) } } s2 <- matrix(rnorm(3000), ncol=3) s2 <- s2/apply(s2,1,function(x) sqrt(sum(x^2))) tkBrush(s1, wait=FALSE) tkBrush(s2, wait=FALSE) # now paint values where var 2 is close to 0 in both plots # and compare the var 1 and var 3 relationship }
if(interactive()){ # Iris dataset out1 <- tkBrush(iris) # Now brush the points pairs(iris, col=out1$col, pch=out1$pch) # or colhist <- function(x,...){ tmp <- hist(x,plot=F) br <- tmp$breaks w <- as.numeric(cut(x,br,include.lowest=TRUE)) sy <- unlist(lapply(tmp$counts,function(x)seq(length=x))) my <- max(sy) sy <- sy/my my <- 1/my sy <- sy[order(order(x))] tmp.usr <- par('usr'); on.exit(par(usr=tmp.usr)) par(usr=c(tmp.usr[1:2],0,1.5)) rect(br[w], sy-my, br[w+1], sy, col=out1$col, # note out1$col is hardcoded here. border=NA) rect(br[-length(br)], 0, br[-1], tmp$counts*my) } pairs(iris, col=out1$col, pch=out1$pch, diag.panel=colhist) # some spheres s1 <- matrix(nrow=0,ncol=3) while( nrow(s1) < 1000 ){ tmp <- rnorm(3) if( sum(tmp^2) <= 1 ){ s1 <- rbind(s1,tmp) } } s2 <- matrix(rnorm(3000), ncol=3) s2 <- s2/apply(s2,1,function(x) sqrt(sum(x^2))) tkBrush(s1, wait=FALSE) tkBrush(s2, wait=FALSE) # now paint values where var 2 is close to 0 in both plots # and compare the var 1 and var 3 relationship }
Build a probability distribution (one option for creating a prior distribution) by clicking or dragging a plot.
TkBuildDist(x = seq(min + (max - min)/nbin/2, max - (max - min)/nbin/2, length.out = nbin), min = 0, max = 10, nbin = 10, logspline = TRUE, intervals = FALSE) TkBuildDist2( min=0, max=1, nbin=10, logspline=TRUE)
TkBuildDist(x = seq(min + (max - min)/nbin/2, max - (max - min)/nbin/2, length.out = nbin), min = 0, max = 10, nbin = 10, logspline = TRUE, intervals = FALSE) TkBuildDist2( min=0, max=1, nbin=10, logspline=TRUE)
x |
A starting set of data points, will default to a sequence of uniform values. |
min |
The minimum value for the histogram |
max |
The maximum value for the histogram |
nbin |
The number of bins for the histogram |
logspline |
Logical, whether to include a logspline curve on the plot and in the output. |
intervals |
Logical, should the logspline fit be based on the interval counts rather than the clicked data points, also should the interval summary be returned. |
Bothe of these functions will open a Tk window to interact with. The window will show a histogram (the defaults will show a uniform distribution), optionally a logspline fit line will be included as well. Including the logspline will slow things down a bit, so you may want to skip it on slow computers.
If you use the TkBuildDist
function then a left click on the
histogram will add an additional point to the histogram bar clicked on
(the actual x-value where clicked will be saved, returned, and used in
the optional logspline unless intervals
is TRUE). Right
clicking on the histogram will remove the point closest to where
clicked (based only on x), which will usually have the effect of
decreasing the clicked bar by 1, but could affect the neigboring bar
if you click near the edge or click on a bar that is 0.
If you use the TkBuildDist2
function then the individual bars
can be adjusted by clicking at the top of a bar and dragging up or
down, or clicking at what you want the new height of the bar to be.
As the current bar is adjusted the other bars will adjust in the
oposite direction proportional to their current heights.
The logspline fit assumes the basis for the distribution is the real
line, the min
and max
arguments only control the
histogram and where values can be changed.
Both functions return a list with the breaks that were used the
logspline fit (if logspline
is TRUE), the x-values clicked on
(for TkBuildDist
), and the proportion of the distribution
within each interval (for TkBuildDist2
or if intervals
is TRUE).
Greg Snow [email protected]
The logspline package
if(interactive()){ tmp1 <- TkBuildDist() tmp2 <- TkBuildDist2() }
if(interactive()){ tmp1 <- TkBuildDist() tmp2 <- TkBuildDist2() }
This utility will create a Tk window with a graph and controls to change the parameters of the plotting function interactively.
tkexamp(FUN, param.list, vscale=1.5, hscale=1.5, wait=FALSE, plotloc='top', an.play=TRUE, print=FALSE, ...)
tkexamp(FUN, param.list, vscale=1.5, hscale=1.5, wait=FALSE, plotloc='top', an.play=TRUE, print=FALSE, ...)
FUN |
A function call to create the example plot |
param.list |
A list of lists with information on the parameters to control and the controls to use. See Details Below |
vscale |
Vertical size of plot, passed to |
hscale |
Horizontal size of plot, passed to |
wait |
Should R wait for the demo to end |
plotloc |
Character with "top", "left", or "right" indicating where the plot should be placed relative to the controls |
an.play |
Should the scheduling in tcltk2 package be used for animations |
print |
Automatically print the result (useful for ggplot2/lattice) |
... |
Extra arguments, currently ignored |
This is a helper function to create interactive demonstrations of the effect of various function arguments on the resulting graph.
The FUN
argument should be a function call to create the basic
plot (if run stand alone this should create the starting plot). The
arguments to be changed should not be included.
The param.list
is a nested list of lists that defines which
controls to use for which function arguments. Additional levels of
nested lists creates groups of controls (see examples below) and if
the list is named in the enclosing list, that name will be used to
label the group.
The
lowest level of lists control a single function argument with the
control to be used. The name of the list in the enclosing list is the
name of the function argument to be used, e.g. "pch=list(...)"
will create a control for the pch
parameter.
The first element of the innermost list is a character string
specifying which control to use (from the list below), the rest of the
elements must be named and specify parameters of the controls. For
details on all possible parameters see the tcltk documentation. Any
parameter can be set using this list, for example most controls have a
width
parameter that can be set with code like width=5
.
Most controls also have an init
argument that specifies the
initial value that the control will be set to (most have a default in
case you don't specify the value).
The following are the possible controls you can specify as the first element of the list along with the most common parameters to specify:
"numentry", an entry box where a number can be typed in, this will be
passed to FUN
wrapped in as.numeric()
.
"entry", an entry box where a character string can be typed in (this
will be passed to FUN
as a character string, not converted).
"slider", a slider (or scale) that can be dragged left and right to choose the different values. The common parameters to specify are "from" (the lowest value), "to" (the largest value), and "resolution" (the increment size when sliding).
"vslider", just like slider except that the slider is dragged up and down rather than left and right.
"spinbox", an entry widget for a number with small arrows on the right
side that can be used to increment/decrement the value, or you can
type in a value. The common parameters to set are "from" (smallest
value), "to" (largest value), and "increment" (how much to change the
value by when clicking on the arrows). You can also set "values"
which is a vector of values that can be used. This will be passed to
FUN
as a number.
"checkbox", a box that can be checked, passed to FUN
as a
logical (TRUE if checked, FALSE if not checked). To set the intial
value as TRUE (the default is FALSE) use init='T'
.
"combobox", an entry widget with an arrow on the right side that will
bring up a list of values to choose from. This value is passed to
FUN
as a character string. The important parameter to set is
"values" which is a vector of character strings to choose between.
This option will only work with tcl version 8.5 or later and will
probably produce an error in earlier versions.
"radiobuttons", a set of choices with check boxes next to each, when one is selected the previous selection is cleared. The important parameter to set is "values" wich is a vector of character strings to choose between.
"animate", is a combination of a slider and a button. If the tcltk2
package is avaliable and an.play=TRUE
then the button will say
"Play" and pressing the button will automatically increment the slider
(and update the graph) until it reaches the maximum value. Otherwise
the button will say "Inc" and you must click and hold on the button to
run the animation (this might be prefered in that you can stop the
animation). Either way you can set the delay option (all other
options match with the slider option) and move the slider when the
interaction is not happening. The animation starts at the current
value on the slider and goes to the maximum value. You should only
have at most one animation control (multiple will confuse each other),
this includes not having multiple windows operating at the same time
with animation controls.
Each nesting of lists will also change how the controls are placed (top to bottom vs. left to right).
The Tk window will also have a default set of controls at the bottom.
These include entry widgets for vscale
and hscale
for
changing the size of the graph (initially set by arguments to
tkexamp
). A "Refresh" button that will refresh the graph with
the new parameter values (some controls like sliders will
automatically refresh, but others like entries will not refresh on
their own and you will need to click on this button to see the
updates). A "Print Call" button that when clicked will print a text
string to the R terminal that represents the function call with the
current argument settings (copying and pasting this to the command
line should recreate the current plot on the current plotting
device). And an "Exit" button that will end the demo and close the window.
If wait
is FALSE then it returns an invisible NULL, if
wait
is TRUE then it returns a list with the argument values
when the window was closed.
You can move the sliders in 3 different ways: You can left click and drag the slider itself, you can left click in the trough to either side of the slider and the slider will move 1 unit in the direction you clicked, or you can right click in the trough and the slider will jump to the location you clicked at.
Greg Snow, [email protected]
tkrplot
, the fgui package, the playwith package, and
the rpanel package
if(interactive()) { x <- sort( runif(25,1,10) ) y <- rnorm(25, x) # some common plotting parameters tke.test1 <- list(Parameters=list( pch=list('spinbox',init=1,from=0,to=255,width=5), cex=list('slider',init=1.5,from=0.1,to=5,resolution=0.1), type=list('combobox',init='b', values=c('p','l','b','o','c','h','s','S','n'), width=5), lwd=list('spinbox',init=1,from=0,to=5,increment=1,width=5), lty=list('spinbox',init=1,from=0,to=6,increment=1,width=5) )) tkexamp( plot(x,y), tke.test1, plotloc='top' ) # different controls for the parameters tke.test2 <- list(Parameters=list( pch=list('spinbox',init=1,values=c(0:25,32:255),width=5), cex=list('slider',init=1.5,from=0.1,to=5,resolution=0.1), type=list('radiobuttons',init='b', values=c('p','l','b','o','c','h','s','S','n'), width=5), lwd=list('spinbox',init=1,from=0,to=5,increment=1,width=5), lty=list('spinbox',init=1,from=0,to=6,increment=1,width=5), xpd=list('checkbox') )) tkexamp( plot(x,y), tke.test2, plotloc='left') tmp <- tkexamp( plot(x,y), list(tke.test2), plotloc='right', wait=TRUE ) # now recreate the plot tmp$x <- x tmp$xlab <- 'x' tmp$y <- y tmp$ylab <- 'y' do.call('plot', tmp) # a non plotting example tke.test3 <- list( sens=list('slider', init=0.95, from=0.9, to=1, resolution=0.005), spec=list('slider', init=0.9, from=0.8, to=1, resolution=0.005), prev=list('slider', init=0.01, from=0.0001, to=0.1, resolution=0.0001), step=list('spinbox', init=1, from=1, to=11, width=5), n=list('numentry',init=100000, width=7) ) options(scipen=1) tkexamp( SensSpec.demo(), tke.test3 ) # now increment step and watch the console # Above example but converting it to plot tempfun <- function(sens,spec,prev,step,n) { if(missing(sens) || missing(n)) return(invisible(NULL)) tmp <- capture.output( SensSpec.demo(sens=sens,spec=spec, prev=prev, n=n, step=step) ) par(cex=2.25) plot.new() tmp2 <- strheight(tmp) text(0, 1-cumsum(tmp2*1.5), tmp, family='mono', adj=0) title('Sensitivity and Specificity Example') } tkexamp( tempfun(), tke.test3, hscale=4, vscale=2 ) # an example using trellis graphics tke.test4 <- list( alpha=list('slider', from=0,to=1,init=1, resolution=0.05), cex=list('spinbox',init=.8,from=.1,to=3,increment=.1,width=5), col=list('entry',init='#0080ff'), pch=list('spinbox',init=1, from=0,to=255, increment=1,width=5), fill=list('entry',init='transparent') ) tempfun <- function(x,y,alpha,cex,col,pch,fill) { if(missing(alpha) || missing(cex)) {return()} trellis.par.set(plot.symbol=list(alpha=alpha, cex=cex, col=col, font=1,pch=pch,fill=fill)) print(xyplot( y~x )) } require(lattice) tkexamp( tempfun(x,y), list(tke.test4), plotloc='left') # Two example using ggplot2 if( require(ggplot2) ) { ## 1 tkexamp( qplot(cty,data=mpg, geom='histogram'), list(binwidth=list('slider',from=1,to=25)), print=TRUE) ## 2 tmpfun <- function(bw=2){ print(ggplot(mpg, aes(cty)) + geom_histogram(binwidth = bw)) } tkexamp( tmpfun, list(bw=list('slider',from=1, to=5))) } }
if(interactive()) { x <- sort( runif(25,1,10) ) y <- rnorm(25, x) # some common plotting parameters tke.test1 <- list(Parameters=list( pch=list('spinbox',init=1,from=0,to=255,width=5), cex=list('slider',init=1.5,from=0.1,to=5,resolution=0.1), type=list('combobox',init='b', values=c('p','l','b','o','c','h','s','S','n'), width=5), lwd=list('spinbox',init=1,from=0,to=5,increment=1,width=5), lty=list('spinbox',init=1,from=0,to=6,increment=1,width=5) )) tkexamp( plot(x,y), tke.test1, plotloc='top' ) # different controls for the parameters tke.test2 <- list(Parameters=list( pch=list('spinbox',init=1,values=c(0:25,32:255),width=5), cex=list('slider',init=1.5,from=0.1,to=5,resolution=0.1), type=list('radiobuttons',init='b', values=c('p','l','b','o','c','h','s','S','n'), width=5), lwd=list('spinbox',init=1,from=0,to=5,increment=1,width=5), lty=list('spinbox',init=1,from=0,to=6,increment=1,width=5), xpd=list('checkbox') )) tkexamp( plot(x,y), tke.test2, plotloc='left') tmp <- tkexamp( plot(x,y), list(tke.test2), plotloc='right', wait=TRUE ) # now recreate the plot tmp$x <- x tmp$xlab <- 'x' tmp$y <- y tmp$ylab <- 'y' do.call('plot', tmp) # a non plotting example tke.test3 <- list( sens=list('slider', init=0.95, from=0.9, to=1, resolution=0.005), spec=list('slider', init=0.9, from=0.8, to=1, resolution=0.005), prev=list('slider', init=0.01, from=0.0001, to=0.1, resolution=0.0001), step=list('spinbox', init=1, from=1, to=11, width=5), n=list('numentry',init=100000, width=7) ) options(scipen=1) tkexamp( SensSpec.demo(), tke.test3 ) # now increment step and watch the console # Above example but converting it to plot tempfun <- function(sens,spec,prev,step,n) { if(missing(sens) || missing(n)) return(invisible(NULL)) tmp <- capture.output( SensSpec.demo(sens=sens,spec=spec, prev=prev, n=n, step=step) ) par(cex=2.25) plot.new() tmp2 <- strheight(tmp) text(0, 1-cumsum(tmp2*1.5), tmp, family='mono', adj=0) title('Sensitivity and Specificity Example') } tkexamp( tempfun(), tke.test3, hscale=4, vscale=2 ) # an example using trellis graphics tke.test4 <- list( alpha=list('slider', from=0,to=1,init=1, resolution=0.05), cex=list('spinbox',init=.8,from=.1,to=3,increment=.1,width=5), col=list('entry',init='#0080ff'), pch=list('spinbox',init=1, from=0,to=255, increment=1,width=5), fill=list('entry',init='transparent') ) tempfun <- function(x,y,alpha,cex,col,pch,fill) { if(missing(alpha) || missing(cex)) {return()} trellis.par.set(plot.symbol=list(alpha=alpha, cex=cex, col=col, font=1,pch=pch,fill=fill)) print(xyplot( y~x )) } require(lattice) tkexamp( tempfun(x,y), list(tke.test4), plotloc='left') # Two example using ggplot2 if( require(ggplot2) ) { ## 1 tkexamp( qplot(cty,data=mpg, geom='histogram'), list(binwidth=list('slider',from=1,to=25)), print=TRUE) ## 2 tmpfun <- function(bw=2){ print(ggplot(mpg, aes(cty)) + geom_histogram(binwidth = bw)) } tkexamp( tmpfun, list(bw=list('slider',from=1, to=5))) } }
This is somewhat like the str
function, except that it creates
a new Tk window and a tree object representing the list or object.
You can then click on the '+' signs to expand branches of the list to
see what they contain.
TkListView(list)
TkListView(list)
list |
The list or object to be viewed. |
This function opens a Tk window with a tree view of the list in the
leftmost pane. Next to the tree is the result from the str
function for each element of the list. Clicking on the '+' symbol
next to list elements will expand the tree branch to show what that
list/sublist contains. On the right is an output pane with 3 buttons
below it. These can be used by first selecting (clicking on) a list
element in the left pane (this can be a whole list or single
element), then clicking on one of the buttons. The output from the
button appears in the right pane (replacing anything that may have
been there before). The 'print' button just prints the element/sublist
selected. The 'str' button calls the str
function on the
selected element/list/sublist. The 'Eval:' button will evaluate the
code in the entry box next to it with the selected element of the list
being the 'x' variable. For example you could click on an element in
the list that is a numeric vector, type 'hist(x)' in the entry box,
and click on the 'Eval:' button to produce a histogram
(current/default R graphics device) of the data in that element.
any lists/objects with attributes will show the attributes as an additional branch in the tree with a label of "<<attributes>>".
This function works on S3 objects that are stored as lists. Since currently S4 objects are saved as attributes, wrapping them in a list will work with this function to view their structure, see the example below.
This function is ran for its side effects, it does not return anything of use.
Greg Snow, [email protected]
if(interactive()) { tmp <- list( a=letters, b=list(1:10, 10:1), c=list( x=rnorm(100), z=data.frame(x=rnorm(10),y=rnorm(10)))) TkListView(tmp) fit <- lm(Petal.Width ~ ., data=iris) TkListView(fit) if(require(stats4)){ # this example is copied almost verbatim from ?mle x <- 0:10 y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8) ll <- function(ymax=15, xhalf=6) -sum(stats::dpois(y, lambda=ymax/(1+x/xhalf), log=TRUE)) (fit <- mle(ll)) TkListView(list(fit)) } }
if(interactive()) { tmp <- list( a=letters, b=list(1:10, 10:1), c=list( x=rnorm(100), z=data.frame(x=rnorm(10),y=rnorm(10)))) TkListView(tmp) fit <- lm(Petal.Width ~ ., data=iris) TkListView(fit) if(require(stats4)){ # this example is copied almost verbatim from ?mle x <- 0:10 y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8) ll <- function(ymax=15, xhalf=6) -sum(stats::dpois(y, lambda=ymax/(1+x/xhalf), log=TRUE)) (fit <- mle(ll)) TkListView(list(fit)) } }
These functions create a plot of predicted values vs. one of the predictors for given values of the other predictors. TkPredict further creates a Tk gui to allow you to change the values of the other predictors.
Predict.Plot(model, pred.var, ..., type='response', add=FALSE, plot.args=list(), n.points=100, ref.val, ref.col='green', ref.lty=1, data) TkPredict(model, data, pred.var, ...)
Predict.Plot(model, pred.var, ..., type='response', add=FALSE, plot.args=list(), n.points=100, ref.val, ref.col='green', ref.lty=1, data) TkPredict(model, data, pred.var, ...)
model |
A model of class 'lm' or 'glm' (or possibly others) from which to plot predictions. |
pred.var |
A character string indicating which predictor variable to put on the x-axis of the plot. |
... |
for |
type |
The type value passed on to the predict function. |
add |
Whether to add a line to the existing plot or start a new plot. |
plot.args |
A list of additional options passed on to the plotting function. |
n.points |
The number of points to use in the approximation of the curve. |
ref.val |
A reference value for the |
ref.col , ref.lty
|
The color and line type of the reference line if plotted. |
data |
The data frame or environment where the variables that the model was fit to are found. If missing, the model will be examined for an attempt find the needed data. |
These functions plot the predicted values from a regression model
(lm
or glm
) against one of the predictor variables for
given values of the other predictors. The values of the other
predictors are passed as the ...
argument to
Predict.Plot
or are set using gui controls in TkPredict
(initial values are the medians).
If the variable for the x axis (name put in pred.var
) is not
included with the ...
variables, then the range will be
computed from the data
argument or the data component of the
model
argument.
If the variable passed as pred.var
is also included in the
...
arguments and contains a single value, then this value will
be used as the ref.val
argument.
If it contains 2 or more values, then the range of these values will be used as the x-limits for the predictions.
When running TkPredict
you can click on the "Print Call" button
to print out the call of Predict.Plot
that will recreate the
same plot. Doing this for different combinations of predictor values
and editing the plot.args
and add
arguments will give
you a script that will create a static version of the predictions.
These functions are run for their side effects of creating plots and do not return anything.
The GUI currently allows you to select a factor as the x-variable. If you do this it will generate some errors and you will not see the plot, just choose a different variable as the x-variable and the plot will return.
Greg Snow, [email protected]
library(splines) fit.lm1 <- lm( Sepal.Width ~ ns(Petal.Width,3)*ns(Petal.Length,3)+Species, data=iris) Predict.Plot(fit.lm1, pred.var = "Petal.Width", Petal.Width = 1.22, Petal.Length = 4.3, Species = "versicolor", plot.args = list(ylim=range(iris$Sepal.Width), col='blue'), type = "response") Predict.Plot(fit.lm1, pred.var = "Petal.Width", Petal.Width = 1.22, Petal.Length = 4.3, Species = "virginica", plot.args = list(col='red'), type = "response", add=TRUE) Predict.Plot(fit.lm1, pred.var = "Petal.Width", Petal.Width = 1.22, Petal.Length = 4.4, Species = "virginica", plot.args = list(col='purple'), type = "response", add=TRUE) fit.glm1 <- glm( Species=='virginica' ~ Sepal.Width+Sepal.Length, data=iris, family=binomial) Predict.Plot(fit.glm1, pred.var = "Sepal.Length", Sepal.Width = 1.99, Sepal.Length = 6.34, plot.args = list(ylim=c(0,1), col='blue'), type = "response") Predict.Plot(fit.glm1, pred.var = "Sepal.Length", Sepal.Width = 4.39, Sepal.Length = 6.34, plot.args = list(col='red'), type = "response", add=TRUE) if(interactive()){ TkPredict(fit.lm1) TkPredict(fit.glm1) }
library(splines) fit.lm1 <- lm( Sepal.Width ~ ns(Petal.Width,3)*ns(Petal.Length,3)+Species, data=iris) Predict.Plot(fit.lm1, pred.var = "Petal.Width", Petal.Width = 1.22, Petal.Length = 4.3, Species = "versicolor", plot.args = list(ylim=range(iris$Sepal.Width), col='blue'), type = "response") Predict.Plot(fit.lm1, pred.var = "Petal.Width", Petal.Width = 1.22, Petal.Length = 4.3, Species = "virginica", plot.args = list(col='red'), type = "response", add=TRUE) Predict.Plot(fit.lm1, pred.var = "Petal.Width", Petal.Width = 1.22, Petal.Length = 4.4, Species = "virginica", plot.args = list(col='purple'), type = "response", add=TRUE) fit.glm1 <- glm( Species=='virginica' ~ Sepal.Width+Sepal.Length, data=iris, family=binomial) Predict.Plot(fit.glm1, pred.var = "Sepal.Length", Sepal.Width = 1.99, Sepal.Length = 6.34, plot.args = list(ylim=c(0,1), col='blue'), type = "response") Predict.Plot(fit.glm1, pred.var = "Sepal.Length", Sepal.Width = 4.39, Sepal.Length = 6.34, plot.args = list(col='red'), type = "response", add=TRUE) if(interactive()){ TkPredict(fit.lm1) TkPredict(fit.glm1) }
This function plots a dataset in a Tk window then draws the spline fit through the points. It places a line to show the predicted y from the given x value. The line can be clicked on and dragged to new x-values with the predicted y-values automatically updating. A table at the bottem of the graph shows the values and the 3 derivatives.
TkSpline(x, y, method='natural', snap.to.x=FALSE, digits=4, col=c('blue','#009900','red','black'), xlab=deparse(substitute(x)), ylab=deparse(substitute(y)), hscale=1.5, vscale=1.5, wait=TRUE, ...)
TkSpline(x, y, method='natural', snap.to.x=FALSE, digits=4, col=c('blue','#009900','red','black'), xlab=deparse(substitute(x)), ylab=deparse(substitute(y)), hscale=1.5, vscale=1.5, wait=TRUE, ...)
x |
The x-values of the data, should be sorted |
y |
The corresponding y-values of the data |
method |
Spline Method, passed to |
snap.to.x |
Logical, if TRUE then the line will only take on the
values of |
digits |
Number of digits to print, passed to |
col |
Colors of the prediction and other lines |
xlab |
Label for the x-axis, passed to |
ylab |
Label for the y-axis, passed to |
hscale |
Horizontal scaling, passed to |
vscale |
Vertical scaling, passed to |
wait |
Should R wait for the window to close |
... |
Additional parameters passed to |
This provides an interactive way to explore predictions from a set of
x and y values. Internally the function splinefun
is used to
make the predictions.
The x-value of the reference line can be changed by clicking and dragging the line to a new position. The x and y values are shown in the margins of the graph. Below the graph is a table with the y-value and derivatives.
If wait
is FALSE then an invisible NULL is returned, if
wait
is TRUE then an invisible list with the x and y values and
derivatives is returned.
Greg Snow [email protected]
if(interactive()) { x <- 1:10 y <- sin(x) TkSpline(x,y, xlim=c(0,11)) }
if(interactive()) { x <- 1:10 y <- sin(x) TkSpline(x,y, xlim=c(0,11)) }
These are GPS information from 3 trips.
Data frames with the following variables.
Index
Measurement number
Time
a POSIXt, Time of measurement
Elevation
a numeric vector, Elevation in Feet
Leg.Dist
a character/numeric vector, The distance
traveled in that leg (in feet for ccc
)
Leg.Time
a difftime, the time of that leg
Speed
a numeric vector, Speed in mph
Direction
a numeric vector, Direction in Degrees, 0 is North, 90 is East, 180 is South, 270 is West
LatLon
a character vector, Latitude and Longitude as characters
Leg.Dist.f
a numeric vector, Length of that leg in feet
Leg.Dist.m
a numeric vector, Length of that leg in miles
Lat
a numeric vector, Numeric latitude
Lon
a numeric vector, Numeric longitude (negative for west)
Distance
a numeric vector, Distance from start in feet
Distance.f
a numeric vector, Distance from start in feet
Distance.m
a numeric vector, Distance from start in miles
Time2
a difftime, Time from start
Time3
a difftime, cumsum of Leg.Time
The data frame ccc
came from when I was walking back to my
office from a meeting and decided to take the scenic route and started
the GPS.
The data frame h2h
is a trip from my office to another for a
meeting. The first part is traveling by car, the last part by foot
from the parking lot to the building. Speed is a mixture of
distributions.
The data frame towork
came from driving to work one morning
(the first point is where the GPS got it's first lock, not my house).
The overall trip was mostly NorthWest but with enough North and
NorthEast that a simple average of direction shows SouthEast, good
example for circular stats.
My GPS device
if( interactive() ){ with(ccc, TkApprox(Distance, Elevation)) }
if( interactive() ){ with(ccc, TkApprox(Distance, Elevation)) }
Interactively recursively partition a dataset to demonstrate regression trees.
tree.demo(x, y)
tree.demo(x, y)
x |
The predictor variable. |
y |
The response variable. |
This function first creates a scatterplot of x
and y
and
shows the residual sum of squares from fitting a horizontal line to
the y-values.
Clicking anywhere on the graph will show an updated graph where the data is partitioned into 2 groups based on the x-value where you clicked with a horizontal line fit to each group (including showing the updated residual sum of squares). Clicking again will move the partitioning value based on the new click.
When you have found a good partitioning (reduces the RSS), right click and choose 'stop' and that partition will become fixed. Now you can click to do a second set of partions (breaking the data into 3 groups).
To finish the demo, right click and choose 'stop', then right click again and choose 'stop' again.
A vector with the x-values of the cut points that you selected (sorted).
Greg Snow [email protected]
The rpart and tree packages
if(interactive()){ data('ethanol', package='lattice') print(with(ethanol, tree.demo(E,NOx))) }
if(interactive()){ data('ethanol', package='lattice') print(with(ethanol, tree.demo(E,NOx))) }
Create (or add to) a trilinear plot of 3 proportions that sum to 1.
triplot(x, y = NULL, z = NULL, labels = dimnames(x)[[2]], txt = dimnames(x)[[1]], legend = NULL, legend.split = NULL, inner = TRUE, inner.col = c("lightblue", "pink"), inner.lty = c(2, 3), add = FALSE, main = "", ...)
triplot(x, y = NULL, z = NULL, labels = dimnames(x)[[2]], txt = dimnames(x)[[1]], legend = NULL, legend.split = NULL, inner = TRUE, inner.col = c("lightblue", "pink"), inner.lty = c(2, 3), add = FALSE, main = "", ...)
x |
Vector or matrix of up to 3 columns. |
y |
Vector (if |
z |
Vector (if |
labels |
Labels for the 3 components (printed at corners). |
txt |
Vector of text strings to be plotted instead of points. |
legend |
Labels for the data points |
legend.split |
What proportion of the labels will go on the left. |
inner |
Logical, should the inner reference lines be plotted. |
inner.col |
Colors for the 2 inner triangles. |
inner.lty |
Line types for the 2 inner triangles. |
add |
Add points to existing plot (TRUE), or create a new plot (FALSE). |
main |
Main title for the plot. |
... |
Additional arguments passed on to |
Trilinear plots are useful for visualizing membership in 3 groups by plotting sets of 3 proportions that sum to 1 within each set.
The data can be passed to the function as a matrix with either 2 or 3
columns, or as seperate vectors to x
, y
, and optionaly
z
. If 2 columns are passed in, then they must be between 0 and
1 and the 3rd column will be created by subtracting both from 1. If 3
columns of data are given to the function then each will be divided by
the sum of the 3 columns (they don't need to sum to 1 before being
passed in).
An invisible matrix with 2 columns and the same number of rows as x
corresponding to the points plotted (after transforming to 2
dimensions).
The return matrix can be passed to identify
for labeling of
individual points.
Using type='n'
and add=FALSE
will return the transformed
points without doing any plotting.
Greg Snow [email protected]
Allen, Terry. Using and Interpreting the Trilinear Plot. Chance. 15 (Summer 2002).
triangle.plot
in package ade4,
ternaryplot
in package vcd, tri
in package cwhtool,
soil.texture
and triax.plot
in package plotrix.
triplot(USArrests[c(1,4,2)]) tmp <- triplot(USArrests[c(1,4,2)],txt=NULL) if(interactive()){ identify(tmp, lab=rownames(USArrests) ) } tmp <- rbind( HairEyeColor[,,'Male'], HairEyeColor[,,'Female']) tmp[,3] <- tmp[,3] + tmp[,4] tmp <- tmp[,1:3] triplot(tmp, legend=rep(c('Male','Femal'),each=4), col=rep(c('black','brown','red','yellow'),2))
triplot(USArrests[c(1,4,2)]) tmp <- triplot(USArrests[c(1,4,2)],txt=NULL) if(interactive()){ identify(tmp, lab=rownames(USArrests) ) } tmp <- rbind( HairEyeColor[,,'Male'], HairEyeColor[,,'Female']) tmp[,3] <- tmp[,3] + tmp[,4] tmp <- tmp[,1:3] triplot(tmp, legend=rep(c('Male','Femal'),each=4), col=rep(c('black','brown','red','yellow'),2))
These functions save a transcript of your commands and their output to a script file, possibly for later processing with the "enscript" or "pandoc" program.
They work as a combinations of sink
and history
with a
couple extra bells and whistles.
txtStart(file, commands=TRUE, results=TRUE, append=FALSE, cmdfile, visible.only=TRUE) txtStop() txtComment(txt,cmdtxt) txtSkip(expr) etxtStart(dir = tempfile("etxt"), file = "transcript.txt", commands = TRUE, results = TRUE, append = FALSE, cmdbg = "white", cmdcol = "red", resbg = "white", rescol = "navy", combg = "cyan", comcol = "black", cmdfile, visible.only = TRUE) etxtStop() etxtComment(txt, cmdtxt) etxtSkip(expr) etxtPlot(file=paste(tempfile('plot',R2txt.vars$dir),'.eps',sep=''), width=4, height=4) wdtxtStart(commands=TRUE, results=TRUE, fontsize=9, cmdfile, visible.only=TRUE) wdtxtStop() wdtxtComment(txt,cmdtxt) wdtxtSkip(expr) wdtxtPlot(height=5, width=5, pointsize=10) mdtxtStart(dir=tempfile('mdtxt'), file='transcript.md', commands=TRUE, results=TRUE, append=FALSE, cmdfile, visible.only=TRUE) mdtxtStop() mdtxtComment(txt,cmdtxt) mdtxtSkip(expr) mdtxtPlot(file=tempfile('plot',R2txt.vars$dir,'.png'), width=4, height=4)
txtStart(file, commands=TRUE, results=TRUE, append=FALSE, cmdfile, visible.only=TRUE) txtStop() txtComment(txt,cmdtxt) txtSkip(expr) etxtStart(dir = tempfile("etxt"), file = "transcript.txt", commands = TRUE, results = TRUE, append = FALSE, cmdbg = "white", cmdcol = "red", resbg = "white", rescol = "navy", combg = "cyan", comcol = "black", cmdfile, visible.only = TRUE) etxtStop() etxtComment(txt, cmdtxt) etxtSkip(expr) etxtPlot(file=paste(tempfile('plot',R2txt.vars$dir),'.eps',sep=''), width=4, height=4) wdtxtStart(commands=TRUE, results=TRUE, fontsize=9, cmdfile, visible.only=TRUE) wdtxtStop() wdtxtComment(txt,cmdtxt) wdtxtSkip(expr) wdtxtPlot(height=5, width=5, pointsize=10) mdtxtStart(dir=tempfile('mdtxt'), file='transcript.md', commands=TRUE, results=TRUE, append=FALSE, cmdfile, visible.only=TRUE) mdtxtStop() mdtxtComment(txt,cmdtxt) mdtxtSkip(expr) mdtxtPlot(file=tempfile('plot',R2txt.vars$dir,'.png'), width=4, height=4)
dir |
Directory to store transcript file and any graphics file in |
file |
Text file to save transcript in |
commands |
Logical, should the commands be echoed to the transcript file |
results |
Logical, should the results be saved in the transcript file |
append |
Logical, should we append to |
cmdbg |
Background color for command lines in |
cmdcol |
Color of text for command lines in |
resbg |
Background color for results sections in |
rescol |
Text color of results sections in |
combg |
Background color for comments in |
comcol |
Text color of comments in |
cmdfile |
A filename to store commands such that it can be
|
visible.only |
Should non-printed output be included, not currently implemented. |
txt |
Text of a comment to be inserted into |
cmdtxt |
Text of a comment to be inserted into |
expr |
An expression to be executed without being included in
|
width |
Width of plot, passed to |
height |
Height of plot, passed to |
fontsize |
Size of font to use in MSWord |
pointsize |
passed to |
These functions are used to create transcript/command files of your R session. There are 4 sets of functions, those starting with "txt",those starting with "etxt", and those starting with "wdtxt" and those starting with "mdtxt". The "txt" functions create a plain text transcript while the "etxt" functions create a text file with extra escapes and commands so that it can be post processed with enscript (an external program) to create a postscript file and can include graphics as well. The postscript file can be converted to pdf or other format file.
The "wdtxt" functions will insert the commands and results into a Microsoft Word document.
The "mdtxt" functions create a text file but with MarkDown escapes so that it can be post processed with "pandoc" (an external program) to create other formats such as html, pdf, MS Word documents, etc. If the command starts with the string "pander" or "pandoc" (after optional whitespace) then the results will be inserted directly, without escapes, into the transcript file. This assumes that you are using code from the "pander" package which generates markdown formatted output. This will create nicer looking tables and other output.
If results
is TRUE and commands
is FALSE then the result
is similar to the results of sink
. If commands
is true as well then the transcript file will show both the commands
and results similar to the output on the screen. If both
commands
and results
are FALSE then pretty much
the only thing these functions will accomplish is to waste some
computing time.
If cmdfile
is
specified then an additional file is created with the commands
used (similar to the history
command), this file can be
used with source
or copied and
pasted to the terminal.
The Start functions specify the file/directory to create and
start the transcript, wdtxtStart
will open Word if it is
not already open or create a connection to an open word window.
The prompts are changed to remind you
that the commands/results are being copied to the transcript.
The Stop functions stop the recording and reset the prompts.
The R parser strips comments and does some reformatting so the
transcript file may not match exactly with the terminal
output. Use the txtComment
, etxtComment
,
wdtxtComment
, or mdtxtComment
functions to add a comment. This will show up as a line offset
by whitespace in the transcript file, highlighted in the
etxt version, and the default font in Word.
If cmdtxt
is specified then that line
will be inserted into cmdfile
preceded by a # so it
will be skipped if sourced or copied.
The txtSkip
, etxtSkip
, wdtxtSkip
, and
mdtxtSkip
functions will run the code in expr
but
will not include the commands or results in the transcript file
(this can be used for side computations, or requests for help,
etc.).
The etxtPlot
function calls dev.copy2eps
to
create a copy of the current plot and inserts the proper command
into the transcript file so that the eps file will be included
in the final postscript file after processing.
The wdtxtPlot
function calls wdPlot
to send a
copy of the current graph to MS Word.
The mdtxtPlot
function calls dev.copy
to create a
copy of the current plot as a .png file and inserts the proper
command into the transcript file so that the .png file will be
included when processing with pandoc.
Most of these commands do not return anything of use. The exceptions are:
etxtStop
returns the name of the transcript file (including the
directory path).
txtSkip
, etxtSkip
, wdtxtSkip
, and
mdtxtSkip
return the value of expr
.
These commands do not do any fancy formatting of output, just what you
see in the regular terminal window. If you want more formatted output
then you should look into Sweave
, knitr
, or the R2HTML package.
The MS word functions will insert into the current word document at the location of the cursor. This means that if you look at the document and move the current location to somewhere in the middle (or have another word document open with the location in the middle), when you go back to R, the new transcript will be inserted into the middle of the document. So be careful to position the cursor at the end of the correct document before going back to R. Note that the "wdtxt" functions depend on the "R2wd" package which in turn depends on tools that are not free.
Do not use these functions in combination with R2HTML or sink
.
Only one of these sets of functions will work at a time.
Greg Snow, [email protected]
sink
, history
,
Sweave
, the odfWeave package, the R2HTML package, the
R2wd package, the pander package
## Not run: etxtStart() etxtComment('This is todays transcript') date() x <- rnorm(25) summary(x) stem(x) etxtSkip(?hist) hist(x) etxtPlot() Sys.Date() Sys.time() my.file <- etxtStop() # assumes enscript and ps2pdf are on your path system(paste('enscript -e -B -p transcript.ps ', my.file) ) system('ps2pdf transcript.ps') # if the above commands used mdtxt instead of etxt and the pandoc # program is installed and on your path (and dependent programs) then use: system(paste('pandoc -o transcript.docx ', my.file)) ## End(Not run)
## Not run: etxtStart() etxtComment('This is todays transcript') date() x <- rnorm(25) summary(x) stem(x) etxtSkip(?hist) hist(x) etxtPlot() Sys.Date() Sys.time() my.file <- etxtStop() # assumes enscript and ps2pdf are on your path system(paste('enscript -e -B -p transcript.ps ', my.file) ) system('ps2pdf transcript.ps') # if the above commands used mdtxt instead of etxt and the pandoc # program is installed and on your path (and dependent programs) then use: system(paste('pandoc -o transcript.docx ', my.file)) ## End(Not run)
For a traditional graphics plot this function will update the 'usr' coordinates by transforming a pair of points from the current usr coordinates to those specified.
updateusr(x1, y1 = NULL, x2, y2 = NULL)
updateusr(x1, y1 = NULL, x2, y2 = NULL)
x1 |
The x-coords of 2 points in the current 'usr' coordianates,
or anything that can be passed to |
y1 |
The y-coords of 2 points in the current 'usr' coordinates, or an object representing the points in the new 'usr' coordinates. |
x2 |
The x-coords for the 2 points in the new coordinates. |
y2 |
The y-coords for the 2 points in the new coordinates. |
Sometimes graphs (in the traditional graphing scheme) end up with usr
coordinates different from expected for adding to the plot (for
example barplot
does not center the bars at integers). This
function will take 2 points in the current 'usr' coordinates and the
desired 'usr' coordinates of the 2 points and transform the user
coordinates to make this happen. The updating only shifts and scales
the coordinates, it does not do any rotation or warping transforms.
If x1
and y1
are lists or matricies and x2
and
y2
are not specified, then x1
is taken to be the
coordinates in the current system and y1
is the coordinates in
the new system.
Currently you need to give the function exactly 2 points in each system. The 2 points cannot have the same x values or y values in either system.
An invisible list with the previous 'usr' coordinates from par
.
Currently you need to give coordinates for exactly 2 points without missing values. Future versions of the function will allow missing values or multiple points.
Greg Snow, [email protected]
tmp <- barplot(1:4) updateusr(tmp[1:2], 0:1, 1:2, 0:1) lines(1:4, c(1,3,2,2), lwd=3, type='b',col='red') # update the y-axis to put a reference distribution line in the bottom # quarter tmp <- rnorm(100) hist(tmp) tmp2 <- par('usr') xx <- seq(min(tmp), max(tmp), length.out=250) yy <- dnorm(xx, mean(tmp), sd(tmp)) updateusr( tmp2[1:2], tmp2[3:4], tmp2[1:2], c(0, max(yy)*4) ) lines(xx,yy)
tmp <- barplot(1:4) updateusr(tmp[1:2], 0:1, 1:2, 0:1) lines(1:4, c(1,3,2,2), lwd=3, type='b',col='red') # update the y-axis to put a reference distribution line in the bottom # quarter tmp <- rnorm(100) hist(tmp) tmp2 <- par('usr') xx <- seq(min(tmp), max(tmp), length.out=250) yy <- dnorm(xx, mean(tmp), sd(tmp)) updateusr( tmp2[1:2], tmp2[3:4], tmp2[1:2], c(0, max(yy)*4) ) lines(xx,yy)
This is a 3 dimensional Array of the US crime statistics downloaded from the "Uniform Crime Reporting Statistics" of the US government. It comprises the years 1960 through 2010 for all 50 states, Washington DC, and a total for the country.
data(USCrimes)
data(USCrimes)
The format is: num [1:52, 1:51, 1:19] 3266740 226167 1302161 1786272 15717204 ... - attr(*, "dimnames")=List of 3 ..$ State: chr [1:52] "Alabama" "Alaska" "Arizona" "Arkansas" ... ..$ : chr [1:51] "1960" "1961" "1962" "1963" ... ..$ : chr [1:19] "Population" "ViolentCrimeRate" "MurderRate" "RapeRate" ...
The first dimension is the state, the dimnames match the variable
state.name
with the exception of including "District of Columbia"
in the 9th position (alphabetically) and "United States-Total" in
position 45 (alphabetical).
The second dimension is the year, ranging from 1960 to 2010. If indexing by year, remember to put the year in quotes.
The third dimension is the variable:
Total number of residents
The total of the violent crimes (Murder, Rape, Robbery, Assault) per 100,000 population
The number of Murders and Nonnegligent Manslaughters per 100,000 population
Forcible Rapes per 100,000 population
Robberies per 100,000 population
Aggravated Assults per 100,000
The total of the property crimes (Burglary, Theft, Vehicle Theft) per 100,000 population
Burglaries per 100,000 population
Larceny-Thefts per 100,000 population
Motor Vehicle Thefts per 100,000 population
The total of the violent crimes (Murder, Rape, Robbery, Assault
The number of Murders and Nonnegligent Manslaughters
Forcible Rapes
Robberies
Aggravated Assults
The total of the property crimes (Burglary, Theft, Vehicle Theft)
Burglaries
Larceny-Thefts
Motor Vehicle Thefts
Originally: "https://ucrdatatool.gov/"", but that site does not work any more. Likely source for similar data: "https://www.fbi.gov/how-we-can-help-you/more-fbi-services-and-information/ucr/publications"
data(USCrimes) ## maybe str(USCrimes) # plot time series/sparkline for each state if(require(spData) && interactive()) { data(state.vbm) plot(state.vbm) tmp.x <- state.vbm$center_x tmp.x <- c( tmp.x[1:8], 147, tmp.x[9:43], 83, tmp.x[44:50] ) tmp.y <- state.vbm$center_y tmp.y <- c( tmp.y[1:8], 45, tmp.y[9:43], -18, tmp.y[44:50] ) tmp.r <- range( USCrimes[,,'ViolentCrimeRate'], na.rm=TRUE) for(i in 1:52) { subplot( plot(1960:2010, USCrimes[i,,'ViolentCrimeRate'], ann=FALSE, bty='n', type='l', axes=FALSE), tmp.x[i], tmp.y[i], size=c(0.2,0.2) ) } } ## Gapminder style animation over time if( interactive() ) { x.r <- range( USCrimes[-c(9,45),,'Population'], na.rm=TRUE ) y.r <- range( USCrimes[-c(9,45),,'PropertyCrimeRate'], na.rm=TRUE ) tmpfun <- function(Year=1960, ... ) { y <- as.character(Year) plot( USCrimes[-c(9,45),y,'Population'], USCrimes[-c(9,45),y,'PropertyCrimeRate'], type='n', xlab='log Population', ylab='Property Crime Rate', main=y, xlim=x.r, ylim=y.r, log='x' ) text( USCrimes[-c(9,45),y,'Population'], USCrimes[-c(9,45),y,'PropertyCrimeRate'], state.abb, ... ) } tmp.list <- list( Year=list('animate', from=1960, to=2010, delay=250) ) tmpcol <- c('blue','darkgreen','red','purple')[state.region] tkexamp( tmpfun(col=tmpcol), tmp.list ) }
data(USCrimes) ## maybe str(USCrimes) # plot time series/sparkline for each state if(require(spData) && interactive()) { data(state.vbm) plot(state.vbm) tmp.x <- state.vbm$center_x tmp.x <- c( tmp.x[1:8], 147, tmp.x[9:43], 83, tmp.x[44:50] ) tmp.y <- state.vbm$center_y tmp.y <- c( tmp.y[1:8], 45, tmp.y[9:43], -18, tmp.y[44:50] ) tmp.r <- range( USCrimes[,,'ViolentCrimeRate'], na.rm=TRUE) for(i in 1:52) { subplot( plot(1960:2010, USCrimes[i,,'ViolentCrimeRate'], ann=FALSE, bty='n', type='l', axes=FALSE), tmp.x[i], tmp.y[i], size=c(0.2,0.2) ) } } ## Gapminder style animation over time if( interactive() ) { x.r <- range( USCrimes[-c(9,45),,'Population'], na.rm=TRUE ) y.r <- range( USCrimes[-c(9,45),,'PropertyCrimeRate'], na.rm=TRUE ) tmpfun <- function(Year=1960, ... ) { y <- as.character(Year) plot( USCrimes[-c(9,45),y,'Population'], USCrimes[-c(9,45),y,'PropertyCrimeRate'], type='n', xlab='log Population', ylab='Property Crime Rate', main=y, xlim=x.r, ylim=y.r, log='x' ) text( USCrimes[-c(9,45),y,'Population'], USCrimes[-c(9,45),y,'PropertyCrimeRate'], state.abb, ... ) } tmp.list <- list( Year=list('animate', from=1960, to=2010, delay=250) ) tmpcol <- c('blue','darkgreen','red','purple')[state.region] tkexamp( tmpfun(col=tmpcol), tmp.list ) }
Plot a curve of a distribution, then using a Tk slider window adjust the parameters and see how the distribution changes. Optionally also plots reference distributions.
vis.binom() vis.gamma() vis.normal() vis.t()
vis.binom() vis.gamma() vis.normal() vis.t()
These functions plot a distribution, then create a Tk slider box that allows you to adjust the parameters of the distribution to see how the curve changes.
Check boxes are available in some cases to also show reference distributions (normal and poisson for the binomial, exponential and chi-squared for gamma, and normal for t). The exponential and chi-squared distributions are those with the same mean as the plotted gamma.
If you change the plotting ranges then you need to click on the 'refresh' button to update the plot.
These functions are run for their side effects and do not return anything meaningful.
Greg Snow [email protected]
if(interactive()){ vis.binom() vis.normal() vis.gamma() vis.t() }
if(interactive()){ vis.binom() vis.normal() vis.gamma() vis.t() }
Explore the Box-Cox family of distributions by plotting data transformed and untransformed and interactively choose values for lambda.
vis.boxcox(lambda = sample(c(-1,-0.5,0,1/3,1/2,1,2), 1), hscale=1.5, vscale=1.5, wait=FALSE) vis.boxcoxu(lambda = sample( c(-1,-0.5,0,1/3,1/2,1,2), 1), y, xlab=deparse(substitute(y)), hscale=1.5, vscale=1.5, wait=FALSE) vis.boxcox.old(lambda = sample(c(-1, -0.5, 0, 1/3, 1/2, 1, 2), 1)) vis.boxcoxu.old(lambda = sample(c(-1, -0.5, 0, 1/3, 1/2, 1, 2), 1))
vis.boxcox(lambda = sample(c(-1,-0.5,0,1/3,1/2,1,2), 1), hscale=1.5, vscale=1.5, wait=FALSE) vis.boxcoxu(lambda = sample( c(-1,-0.5,0,1/3,1/2,1,2), 1), y, xlab=deparse(substitute(y)), hscale=1.5, vscale=1.5, wait=FALSE) vis.boxcox.old(lambda = sample(c(-1, -0.5, 0, 1/3, 1/2, 1, 2), 1)) vis.boxcoxu.old(lambda = sample(c(-1, -0.5, 0, 1/3, 1/2, 1, 2), 1))
lambda |
The true value of lambda to use. |
y |
Optional data to use in the transform. |
xlab |
Label for x-axis. |
hscale |
The horizontal scale, passed to |
vscale |
The vertical scale, passed to |
wait |
Should R wait for the demo window to close. |
These functions will generate a sample of data and plot the
untrasformed data (left panels) and the transformed data (right
panels). Initially the value of lambda
is 1 and the 2 sets of
plots will be identical.
You then adjust the transformation parameter lambda
to see how
the right panels change.
The function vis.boxcox
shows the effect of transforming the
y-variable in a simple linear regression.
The function vis.boxcoxu
shows a single variable compared to
the normal distribution.
The old versions have no useful return value. If wait
is FALSE
then they will return an invisible NULL, if wait
is TRUE then
the return value will be a list with the final value of lamda
,
the original data, and the transformed y (at the final lamda
value).
Greg Snow [email protected]
GEP Box; DR Cox. An Analysis of Transformations. Journal of the Royal Statitical Society. Series B, Vol. 26, No. 2 (1964) 211-252
bct
, boxcox
in package MASS
if(interactive()) { vis.boxcoxu() vis.boxcox() }
if(interactive()) { vis.boxcoxu() vis.boxcox() }
These functions help in creating a set of plots based on the real data and some modification that makes the null hypothesis true. The user then tries to choose which graph represents the real data.
vis.test(..., FUN, nrow=3, ncol=3, npage=3, data.name = "", alternative) vt.qqnorm(x, orig=TRUE) vt.normhist(x, ..., orig=TRUE) vt.scatterpermute(x, y, ..., orig=TRUE) vt.tspermute(x, type='l', ..., orig=TRUE) vt.residpermute(model, ..., orig=TRUE) vt.residsim(model, ..., orig=TRUE)
vis.test(..., FUN, nrow=3, ncol=3, npage=3, data.name = "", alternative) vt.qqnorm(x, orig=TRUE) vt.normhist(x, ..., orig=TRUE) vt.scatterpermute(x, y, ..., orig=TRUE) vt.tspermute(x, type='l', ..., orig=TRUE) vt.residpermute(model, ..., orig=TRUE) vt.residsim(model, ..., orig=TRUE)
... |
data and arguments to be passed on to |
FUN |
The function to create the plots on the original or null hypothesis data |
nrow |
The number of rows of graphs per page |
ncol |
The number of columns of graphs per page |
npage |
The number of pages to use in the testing |
data.name |
Optional character string for the name of the data in the output |
alternative |
Optional character string for the alternative hypothesis in the output |
orig |
Logical, should the original data be plotted, or data based on the null hypothesis |
x |
data or x-coordinates of the data |
y |
y-coordinates of the data |
type |
type of plot, passed on to plot function (use 'p' for points) |
model |
An |
The vis.test
function will create a nrow
by ncol
grid of plots, one
of which is based on the real (original) data and the others which
are based on a null hypothesis simulation (a statistical "lineup").
The real plot is placed at
random within the set. The user then clicks on their best guess
of which plot is the real one (the most different from the others).
If the null hypothesis is true for the real data, then this will be a
guess with a 1/(nrow
*ncol
) probability of success. This
process is then
repeated for a total of npage
times. A p-value is then
constructed based on the
number of correct guesses and the null hypothesis that
there is a 1/(nrow
*ncol
) chance of guessing correct each
time (this will work
best if the person doing the choosing has not already seen
plots/summaries of the data).
If the plotting function (FUN
) is not passed as a named
argument, then the first argument (in the ...) that is a function
will be used. If no functions are passed then the function will stop
with an error.
The plotting function (FUN
) can be an existing function or a
user supplied function. The function must have an argument named
"orig" which indicates whether to plot the original data or the null
hypothesis data. A new seed will be set before each call to
FUN
except when orig
is TRUE
. Inside the
function if orig
is TRUE
then the function should plot
the original data. When orig
is FALSE
then the function
should do some form of simulation based on the data with the null
hypothesis true and plot the simulated data (making sure to give no
signs that it is different from the original plot).
The return object includes a list with the seeds set before each of
the plots (NA
for the original data plot) and a vector of the
plots selected by the user. This information can be used to recreate
the simulated plots by setting the seed then calling FUN
.
The vt.qqnorm
function tests the null hypothesis that a vector
of data comes from a normal distribution (or at least pretty close) by
creating a qqnorm
plot of the original data, or the same plot
of random data from a normal distribution with the same mean and
standard deviation as the original data.
The vt.normhist
function tests the null hypothesis that a
vector of data comes from a normal distribution (or at least pretty
close) by plotting a histogram with a reference line representing a
normal distribution of either the original data or a set of random
data from a normal distribution with the same mean and standard
deviation as the original.
The vt.scatterpermute
function tests the null hypothesis of "no
relationship" between 2 vectors of data. When orig
is TRUE
the
function creates a scatterplot of the 2 variables, when orig
is
FALSE
the function first permutes the y variable randomly
(making no relationship) then creates a scatter plot with the original
x and permuted y variables.
The vt.tspermute
function creates a time series type plot of a
single vector against its index. When orig
is false, the
vector is permuted before plotting.
The vt.residpermute
function takes a regression object (class
lm, or any model type object for which fitted
and resid
return vectors) and does a residual plot of the fitted values on the x
axis and residuals on the y axis. The loess smooth curve
(scatter.smooth
is the plotting function) and a reference line
at 0 are included. When orig
is FALSE
the residuals are
randomly permuted before being plotted.
The vt.residsim
function takes a regression object (class lm,
or any model type object for which fitted
and resid
return vectors) and does a residual plot of the fitted values on the x
axis and residuals on the y axis. The loess smooth curve
(scatter.smooth
is the plotting function) and a reference line
at 0 are included. When orig
is FALSE
the residuals are
simulate from a normal distribution with mean 0 and standard deviation
the same as the residuals.
The vis.test
function returns an object of class htest
with the following components:
method |
The string "Visual Test" |
data.name |
The name of the data passed to the function |
statistic |
The number of correct "guesses" |
p.value |
The p-value based on the number of correct "guesses" |
nrow |
The number of rows per page |
ncol |
The number of columns per page |
npage |
The number of pages |
seeds |
A list with 3 vectors containing the seeds set before
calling |
selected |
A vector of length |
The other functions are run for their side effects and do not return anything meaningful.
The p-value is based on the assumption that under the
null hypothesis there is a 1/(nrow
*ncol
) chance of
picking the correct plot
and that the npage
choices are independent of each other. This
may not be
true if the user is familiar with the data or remembers details of the
plot between picks.
Greg Snow [email protected]
Buja, A., Cook, D. Hofmann, H., Lawrence, M. Lee, E.-K., Swayne, D.F and Wickham, H. (2009) Statistical Inference for exploratory data analysis and model diagnostics Phil. Trans. R. Soc. A 2009 367, 4361-4383 doi: 10.1098/rsta.2009.0120
if(interactive()) { x <- rexp(25, 1/3) vis.test(x, vt.qqnorm) x <- rnorm(100, 50, 3) vis.test(x, vt.normhist) }
if(interactive()) { x <- rexp(25, 1/3) vis.test(x, vt.qqnorm) x <- rnorm(100, 50, 3) vis.test(x, vt.normhist) }
Compute the test of hypothesis and compute confidence interval on the mean of a population when the standard deviation of the population is known.
z.test(x, mu = 0, stdev, alternative = c("two.sided", "less", "greater"), sd = stdev, n=length(x), conf.level = 0.95, ...)
z.test(x, mu = 0, stdev, alternative = c("two.sided", "less", "greater"), sd = stdev, n=length(x), conf.level = 0.95, ...)
x |
Vector of data values or the mean of the data. |
mu |
Hypothesized mean of the population. |
stdev |
Known standard deviation of the population. |
alternative |
Direction of the alternative hypothesis. |
sd |
Alternative to |
n |
The sample size if |
conf.level |
Confidence level for the interval computation. |
... |
Additional arguments are silently ignored. |
Many introductory statistical texts introduce inference by using the Z
test and Z based confidence intervals based on knowing the population
standard deviation. Most statistical packages do not include
functions to do Z tests since the T test is usually more appropriate
for real world situations. This function is meant to be used during
that short period of learning when the student is learning about
inference using Z procedures, but has not learned the T based
procedures yet. Once the student has learned about the T
distribution the t.test
function should be used instead of this
one (but the syntax is very similar, so this function should be an
appropriate introductory step to learning t.test
).
An object of class htest
containing the results
This function should be used for learning only, real data should
generally use t.test
.
Greg Snow [email protected]
x <- rnorm(25, 100, 5) z.test(x, 99, 5)
x <- rnorm(25, 100, 5) z.test(x, 99, 5)
This function allows you to change the x and y ranges of the plot that is currently in the plot window. This has the effect of zooming into a section of the plot, or zooming out (unzooming) to show a larger region than is currently shown.
zoomplot(xlim, ylim=NULL ) oldzoomplot(xlim, ylim=NULL )
zoomplot(xlim, ylim=NULL ) oldzoomplot(xlim, ylim=NULL )
xlim , ylim
|
The new x and y limits of the plot. These can be
passed in in any form understood by |
This function recreates the current plot in the graphics window but with different xlim, ylim arguments. This gives the effect of zooming or unzooming the plot.
This only works with traditional graphics (not lattice/trellis).
This function is a quick hack that should only be used for quick exploring of data. For any serious work you should create a script with the plotting commands and adjust the xlim and ylim parameters to give the plot that you want.
Only the x and y ranges are changed, the size of the plotting characters and text will stay the same.
The oldzoomplot
function is the version that worked for 2.15
and earlier, zoomplot
should be used for R 3.0.
This function is run for its side effects and does not return anything meaningful.
For any serious projects it is best to put your code into a script to begin with and edit the original script rather than using this function.
This function works with the standard plot
function and some
others, but may not work for more complicated plots.
This function depends on the recordPlot
function which can
change in any version. Therefore this function should not be
considered stable.
Greg Snow [email protected]
plot.default
, par
,
matplot
, plot2script
,
source
if(interactive()){ with(iris, plot(Sepal.Length, Petal.Width, col=c('red','green','blue')[Species])) text( 6.5, 1.5, 'test' ) zoomplot( locator(2) ) # now click on 2 points in the plot to zoom in plot( 1:10, rnorm(10) ) tmp <- rnorm(10,1,3) lines( (1:10) + 0.5, tmp, col='red' ) zoomplot( c(0,11), range(tmp) ) }
if(interactive()){ with(iris, plot(Sepal.Length, Petal.Width, col=c('red','green','blue')[Species])) text( 6.5, 1.5, 'test' ) zoomplot( locator(2) ) # now click on 2 points in the plot to zoom in plot( 1:10, rnorm(10) ) tmp <- rnorm(10,1,3) lines( (1:10) + 0.5, tmp, col='red' ) zoomplot( c(0,11), range(tmp) ) }