cdplot {graphics} | R Documentation |

Computes and plots conditional densities describing how the
conditional distribution of a categorical variable `y` changes over a
numerical variable `x`.

    cdplot(x, ...)

    ## Default S3 method:
    cdplot(x, y, plot = TRUE, tol.ylab = 0.05, bw = "nrd0", n = 512,
           from = NULL, to = NULL, col = NULL, border = 1, main = "",
           xlab = NULL, ylab = NULL, yaxlabels = NULL,
           xlim = NULL, ylim = c(0, 1), ...)

    ## S3 method for class 'formula':
    cdplot(formula, data = list(), plot = TRUE, tol.ylab = 0.05,
           bw = "nrd0", n = 512, from = NULL, to = NULL, col = NULL,
           border = 1, main = "", xlab = NULL, ylab = NULL,
           yaxlabels = NULL, xlim = NULL, ylim = c(0, 1), ...,
           subset = NULL)

`x` |
an object; the default method expects a single numerical variable. |

`y` |
a `"factor"` interpreted to be the dependent variable |

`formula` |
a `"formula"` of type `y ~ x` with a single dependent
`"factor"` and a single numerical explanatory variable. |

`data` |
an optional data frame. |

`plot` |
logical. Should the computed conditional densities be plotted? |

`tol.ylab` |
convenience tolerance parameter for y-axis annotation. If the distance between two labels drops under this threshold, they are plotted equidistantly. |

`bw, n, from, to, ...` |
arguments passed to `density` |

`col` |
a vector of fill colors of the same length as `levels(y)`.
The default is to call `gray.colors`. |

`border` |
border color of shaded polygons. |

`main, xlab, ylab` |
character strings for annotation |

`yaxlabels` |
character vector for annotation of y axis, defaults to
`levels(y)`. |

`xlim, ylim` |
the range of x and y values with sensible defaults. |

`subset` |
an optional vector specifying a subset of observations to be used for plotting. |

`cdplot` computes the conditional densities of `x` given
the levels of `y` weighted by the marginal distribution of `y`.
The densities are derived cumulatively over the levels of `y`.
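The weighting described above can be sketched by hand with `density`. This is a minimal illustration using Bayes' rule, *P(y | x) = f(x | y) P(y) / f(x)*, on the o-ring data from the examples; it is not claimed to reproduce the internals of `cdplot`. The names `dx`, `d_no`, `p_marg` and `p_no` are introduced here for illustration only.

```r
## NASA space shuttle o-ring failures (same data as in the examples)
fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1,
                 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1),
               levels = 1:2, labels = c("no", "yes"))
temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                 70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

## marginal density of x, which also fixes the bandwidth and the grid
dx <- density(temperature, bw = "nrd0", n = 512)

## density of x within the first level of y, on the same grid
d_no <- density(temperature[fail == "no"], bw = dx$bw, n = 512,
                from = min(dx$x), to = max(dx$x))

## marginal probability of the first level
p_marg <- mean(fail == "no")

## Bayes' rule, interpolated to a function of x
## (rule = 2: constant extrapolation outside the grid)
p_no <- approxfun(dx$x, d_no$y / dx$y * p_marg, rule = 2)

## estimated P(fail = "no" | temperature) at a few temperatures
p_no(c(55, 70, 80))
```

With more than two levels, the same calculation would pool the observations of levels 1 to *k* to obtain the cumulative probability for level *k*.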

This visualization technique is similar to spinograms (see `spineplot`)
and plots *P(y | x)* against *x*. The conditional probabilities
are not derived by discretization (as in the spinogram), but using a smoothing
approach via `density`.
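For contrast, the discretization approach can be sketched as binned proportions. The example below is an illustration only: it uses the built-in `iris` data and arbitrarily chosen breaks, and the names `bins` and `tab` are hypothetical. Each row gives the share of each species within one interval of `Sepal.Length`, which is the step-function analogue of the smooth *P(y | x)* curve.

```r
## discretize x into intervals (breaks chosen arbitrarily for illustration)
bins <- cut(iris$Sepal.Length, breaks = 4:8)

## proportion of each level of y within each interval of x
tab <- prop.table(table(bins, iris$Species), margin = 1)
round(tab, 2)
```

The result depends on the choice of breaks, whereas the smoothing approach depends on the bandwidth `bw` instead.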

Note that the estimates of the conditional densities are more reliable for
high-density regions of *x*. Conversely, they are less reliable in regions
with only few *x* observations.

The conditional density functions (cumulative over the levels of `y`)
are returned invisibly.
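Because the returned functions are cumulative, per-level probabilities can be recovered by differencing. The sketch below uses the built-in `iris` data and assumes the return value is a list with one function per level except the last, which is consistent with the use of `cdens[[1]]` in the examples. The names `cd`, `x0` and `probs` are illustrative.

```r
## conditional densities without plotting
cd <- cdplot(Species ~ Sepal.Length, data = iris, plot = FALSE)

## points at which to evaluate
x0 <- c(5, 6, 7)

## difference the cumulative functions to get one column per level
probs <- cbind(
  setosa     = cd[[1]](x0),
  versicolor = cd[[2]](x0) - cd[[1]](x0),
  virginica  = 1 - cd[[2]](x0)
)
probs
```

Each row of `probs` sums to one by construction.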

Achim Zeileis Achim.Zeileis@R-project.org

Hofmann, H., Theus, M. (2005), *Interactive graphics for visualizing
conditional distributions*, Unpublished Manuscript.

    ## NASA space shuttle o-ring failures
    fail <- factor(c(2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 2, 1,
                     2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1),
                   levels = 1:2, labels = c("no", "yes"))
    temperature <- c(53, 57, 58, 63, 66, 67, 67, 67, 68, 69, 70, 70,
                     70, 70, 72, 73, 75, 75, 76, 76, 78, 79, 81)

    ## CD plot
    cdplot(fail ~ temperature)
    cdplot(fail ~ temperature, bw = 2)
    cdplot(fail ~ temperature, bw = "SJ")

    ## compare with spinogram
    (spineplot(fail ~ temperature, breaks = 3))

    ## scatter plot with conditional density
    cdens <- cdplot(fail ~ temperature, plot = FALSE)
    plot(I(as.numeric(fail) - 1) ~ jitter(temperature, factor = 2),
         xlab = "Temperature", ylab = "Conditional failure probability")
    lines(53:81, 1 - cdens[[1]](53:81), col = 2)

[Package *graphics* version 2.5.0 Index]