
The Hyades & surrounding stars
- Acquire a VOTable from the Hipparcos Main Catalogue with stars
in and around the Hyades.
It has been made available as a local copy.
In general you can get this or any other catalog from
Vizier by following the steps below.
http://vizier.u-strasbg.fr/cgi-bin/VizieR?-source=I/239/hip_main
Choose Query setup: 9999 maximum, XML-VOTable(DTD) layout
Output preferences: Compute nothing, Sort by Position
Query: Show 4 columns -- Vmag, pmRA, pmDE, and B-V
Constraints -- RA(ICRS) = 0 .. 120, Plx = 18 .. 34
This generates a table (locally called hyades_hip.vot)
of 2212 bright stars with 0<RA<8
hours and distance 29<d<56 parsecs. The columns
give the magnitude, color, and proper motion vector for each star.
Note that the B-V value is missing for ~1% of the stars. Note also
that Vizier adds two identifying columns after the 4 requested
columns, recno & HIP, which are identical. They can be considered
to be the rank in RA.
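
The link between the query constraints and the quoted ranges can be
checked with a short sketch. It assumes only the standard relations
d[pc] = 1000 / Plx[mas] and 15 degrees of RA per hour; the function
names are illustrative and not part of VOStat or Vizier.

```python
def parallax_to_distance_pc(plx_mas):
    """Distance in parsecs from a parallax in milliarcseconds."""
    if plx_mas <= 0:
        raise ValueError("parallax must be positive")
    return 1000.0 / plx_mas


def ra_degrees_to_hours(ra_deg):
    """Right ascension in hours from degrees (15 degrees per hour)."""
    return ra_deg / 15.0


# Constraint limits taken from the query above: Plx = 18 .. 34, RA = 0 .. 120.
nearest_pc = parallax_to_distance_pc(34.0)   # largest parallax -> nearest star
farthest_pc = parallax_to_distance_pc(18.0)  # smallest parallax -> farthest star
ra_max_hours = ra_degrees_to_hours(120.0)

print(f"{nearest_pc:.1f} to {farthest_pc:.1f} pc, RA up to {ra_max_hours:.0f} h")
```

The rounded limits agree with the 29<d<56 parsec and 0<RA<8 hour
ranges quoted above.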
- V magnitude descriptive statistics
Process: Mean, median & Shapiro-Wilk's test
Select 2 columns with similar units, Vmag & B-V, and process boxplot
These quickly show some basic characteristics of a univariate
dataset. We find the mean +/- s.d. = 8.22 +/- 1.89 and the
quartiles are 6.99 (25%), 8.22 (50%, median), and 9.56 (75%). But
even though the mean and median are the same, the distribution is
not at all consistent with a Gaussian (P=10^-7). The boxplot
shows a few bright-star outliers with V<3 magnitudes.
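
As a rough illustration of what these summaries compute, the sketch
below uses only the Python standard library. The magnitudes are
hypothetical values invented for the example, not the Hipparcos data,
and the 1.5 x IQR outlier fence is the usual boxplot convention,
assumed here rather than taken from VOStat. The Shapiro-Wilk test is
not in the standard library; scipy.stats.shapiro is one available
implementation.

```python
import statistics

# Hypothetical V magnitudes, for illustration only (not the Hipparcos values).
vmag = [1.6, 5.1, 6.4, 7.0, 7.9, 8.2, 8.6, 9.1, 9.6, 10.3, 11.0]

mean = statistics.mean(vmag)
sd = statistics.stdev(vmag)      # sample standard deviation (n - 1 denominator)
median = statistics.median(vmag)
q1, q2, q3 = statistics.quantiles(vmag, n=4, method="inclusive")

# Common boxplot convention (an assumption here): points farther than
# 1.5 * IQR from the quartiles are drawn as outliers.
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [v for v in vmag if v < low_fence or v > high_fence]

print(f"mean +/- s.d. = {mean:.2f} +/- {sd:.2f}")
print(f"quartiles = {q1:.2f}, {q2:.2f}, {q3:.2f}")
print("outliers:", outliers)
```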
- V magnitude density estimation
Process: Histogram
The "density" is the statistician's term for the differential
distribution function. Looking at the histogram, one can now
clearly see the non-Gaussian asymmetry in the V distribution.
Compare the histogram with Sturges' and Scott's rules for bin
width.
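
For reference, the two rules can be written down in a few lines. This
is a sketch using the textbook forms, Sturges' k = ceil(log2 n) + 1
bins and Scott's h = 3.49 s n^(-1/3) bin width; VOStat's own defaults
may differ in detail. The inputs n = 2212 and s = 1.89 are the values
quoted above.

```python
import math


def sturges_bin_count(n):
    """Sturges' rule: number of bins k = ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1


def scott_bin_width(n, sd):
    """Scott's rule: bin width h = 3.49 * sd * n**(-1/3)."""
    return 3.49 * sd * n ** (-1.0 / 3.0)


n_stars = 2212   # sample size quoted in the text
sd_vmag = 1.89   # standard deviation of Vmag quoted in the text

bins = sturges_bin_count(n_stars)
width = scott_bin_width(n_stars, sd_vmag)
print(f"Sturges: {bins} bins; Scott: bin width {width:.2f} mag")
```

Sturges' rule gives a bin count and Scott's rule a bin width, so
comparing them directly requires the range of the data.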
Process: Kernel smoothing
Try Gaussian and rectangular (i.e. boxcar) convolution functions;
note the latter is noisier.
The help file provides details on the convolution method.
The user can also choose bandwidths.
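
A minimal sketch of kernel smoothing with the two kernels follows. It
is not VOStat's implementation: the magnitudes are hypothetical, the
bandwidth of 1.0 mag is an arbitrary choice, and the function names
are illustrative.

```python
import math


def gaussian_kernel(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def rectangular_kernel(u):
    """Boxcar of half-width 1 and unit area."""
    return 0.5 if abs(u) <= 1.0 else 0.0


def kernel_density(data, x, bandwidth, kernel):
    """Density estimate at x: average of kernel((x - xi) / h) / h over the data."""
    total = sum(kernel((x - xi) / bandwidth) for xi in data)
    return total / (len(data) * bandwidth)


# Hypothetical V magnitudes, for illustration only.
vmag = [1.6, 5.1, 6.4, 7.0, 7.9, 8.2, 8.6, 9.1, 9.6, 10.3, 11.0]
bandwidth = 1.0                             # user-chosen, in magnitudes
step = 0.1
grid = [step * i for i in range(-50, 201)]  # -5.0 to 20.0 mag

smooth = [kernel_density(vmag, x, bandwidth, gaussian_kernel) for x in grid]
boxy = [kernel_density(vmag, x, bandwidth, rectangular_kernel) for x in grid]

# Both estimates should integrate to roughly 1 over the grid.
print(sum(smooth) * step, sum(boxy) * step)
```

The rectangular estimate changes in discrete jumps as data points
enter and leave the window, which is why it looks noisier than the
Gaussian one at the same bandwidth.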
- Bivariate correlations
Select: 5 columns (Vmag, pmRA, pmDE, B-V, recno)
Process: Correlation matrix
This gives the strength of correlation between pairs of variables
using Pearson's linear correlation coefficient and Spearman's rho.
For each statistic, the output gives the statistic values, the
number of points (excluding the missing B-V entries), and the
probability that no correlation is present. Note the dataset is
dominated by correlations. Note also that Pearson's statistic is
insensitive to the subtle link between the proper motion
components.
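
The difference between the two coefficients can be seen in a small
sketch. The data are hypothetical and the helper names are
illustrative; dropping pairs with a missing value is assumed here to
mirror the treatment of missing B-V entries described above.

```python
import math


def drop_missing(x, y):
    """Keep only the pairs where both values are present."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def pearson_r(x, y):
    """Pearson's linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's coefficient applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: a monotonic but nonlinear relation, one missing value.
x_raw = [1.0, 2.0, 3.0, 4.0, 5.0, None]
y_raw = [1.0, 8.0, 27.0, 64.0, 125.0, 3.0]
x, y = drop_missing(x_raw, y_raw)

print(len(x), pearson_r(x, y), spearman_rho(x, y))
```

Spearman's rho reaches 1 for any strictly increasing relation, while
Pearson's coefficient falls below 1 when the relation is not linear.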
Select: 2 columns (Vmag and B-V)
Process: X-Y plot
Here we display the most famous structure in the
dataset: the color-magnitude diagram of bright stars showing the main
sequence, giant branch (with red clump stars), and a few Hyades
white dwarfs.
Select: B-V and pmDE
Process: X-Y plot
Here we explore a less familiar scatter plot for subtler effects.
This plot of color vs. proper motion shows
non-Gaussian distributions with strong heteroscedasticity (i.e.
the scatter, but not the mean, of the proper motion depends on
color). This is because the hotter stars are kinematically
younger. A plot of B-V vs. recno (i.e. right ascension) shows a
small clump of hot Hyades members.
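
One simple way to quantify heteroscedasticity is to compare the mean
and scatter of the proper motion in separate color bins. The sketch
below does this for two bins; the (B-V, pmDE) pairs and the split at
B-V = 0.5 are hypothetical choices made only for illustration.

```python
import statistics

# Hypothetical (B-V, pmDE in mas/yr) pairs, for illustration only.
stars = [
    (0.10, -5.0), (0.15, 4.0), (0.20, -2.0), (0.25, 6.0),
    (0.90, -60.0), (1.00, 45.0), (1.10, -50.0), (1.20, 68.0),
]


def mean_and_scatter(values):
    """Mean and sample standard deviation of a list of values."""
    return statistics.mean(values), statistics.stdev(values)


def summarize_by_color(stars, split):
    """Summaries of pmDE for stars bluer and redder than the split color."""
    blue = [pm for bv, pm in stars if bv < split]
    red = [pm for bv, pm in stars if bv >= split]
    return mean_and_scatter(blue), mean_and_scatter(red)


(blue_mean, blue_sd), (red_mean, red_sd) = summarize_by_color(stars, split=0.5)
print(f"blue: {blue_mean:.2f} +/- {blue_sd:.2f} mas/yr")
print(f"red:  {red_mean:.2f} +/- {red_sd:.2f} mas/yr")
```

In these made-up data the two bins share the same mean but differ
strongly in scatter, which is the pattern described above.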
- Multivariate clustering
Select: 4 columns (Vmag, B-V, pmRA, pmDE)
Process: Agglomerative nesting (agnes)
This is one of many classification techniques, several of which
should be implemented into VOStat in the future. Here the stars
"closest" to each other in the 4-dimensional space are
sequentially merged into progressively larger groups. The output
is a dendrogram showing the "leaves" merging into "branches" and
finally a "trunk". The scientist must decide whether "closeness"
is measured in a Euclidean (root of summed squared differences) or
Manhattan (summed absolute differences) metric, and how groups are
defined. Single linkage,
which astronomers call the "friends-of-friends" algorithm,
produces stringy clusters, while complete linkage produces
hyperspherical clusters and average linkage (a common choice) is
in between. Ward's criterion (another common choice) produces the
maximum-likelihood cluster discrimination under the hypothesis of
multinormal structures.
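
The merging procedure can be sketched in plain Python. This is a
naive illustration of agglomerative clustering, not the agnes
implementation itself; the points are hypothetical 2-dimensional
values, and Ward's criterion is omitted for brevity.

```python
import math


def euclidean(a, b):
    """Root of the summed squared coordinate differences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def manhattan(a, b):
    """Sum of the absolute coordinate differences."""
    return sum(abs(p - q) for p, q in zip(a, b))


def linkage_distance(c1, c2, points, metric, linkage):
    """Distance between two clusters (tuples of point indices)."""
    d = [metric(points[i], points[j]) for i in c1 for j in c2]
    if linkage == "single":      # nearest pair ("friends-of-friends")
        return min(d)
    if linkage == "complete":    # farthest pair
        return max(d)
    if linkage == "average":     # mean over all pairs
        return sum(d) / len(d)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(points, metric=euclidean, linkage="average"):
    """Merge the two closest clusters until one remains.

    Returns the merges in order as (cluster_a, cluster_b, distance);
    this list is the information a dendrogram displays.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage_distance(clusters[a], clusters[b],
                                     points, metric, linkage)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Hypothetical points forming two well-separated pairs.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
single = agglomerate(points, euclidean, "single")
complete = agglomerate(points, euclidean, "complete")
for cluster_a, cluster_b, distance in single:
    print(cluster_a, cluster_b, round(distance, 2))
```

The leaves merge in the same order for both linkages here, but the
final "trunk" merge occurs at a smaller distance under single linkage
than under complete linkage.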
Training set: One can explore the relationship between the
Hipparcos dataset of 2212 stars, which includes non-Hyades stars,
and a more carefully selected sample of Hyades members.
The training-set VOTable contains ~450 Hyades
members discussed by I.N. Reid (1992) with Vmag, B-V, pmRA and
pmDE. Beware that the columns are in different order, and the
proper motion units differ by a factor of 10, from the Hipparcos
VOTable dataset. It is useful to note that these confirmed
Hyades members have kinematics in the range pmRA = 110 +/- 40
mas/yr and pmDE = -30 +/- 40 mas/yr.
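
The kinematic range above can be turned into a simple candidate
filter. This is only a sketch: the function names, the rectangular
selection window, and the example rows are illustrative assumptions.
The text does not say which table carries the larger proper motion
unit, so the scale factor is left as a parameter, and the value of 10
used below is only an example.

```python
def is_hyades_candidate(pm_ra, pm_de, center=(110.0, -30.0), half_width=40.0):
    """True if both proper motion components (mas/yr) lie in the quoted range."""
    return (abs(pm_ra - center[0]) <= half_width
            and abs(pm_de - center[1]) <= half_width)


def select_candidates(rows, scale=1.0):
    """Filter (pmRA, pmDE) rows after converting them to mas/yr.

    `scale` is the factor that brings the stored values to mas/yr; which
    table needs rescaling is not stated in the text and must be checked.
    """
    converted = [(ra * scale, de * scale) for ra, de in rows]
    return [row for row in converted if is_hyades_candidate(*row)]


# Hypothetical rows, assumed for illustration to be stored in units of 10 mas/yr.
rows = [(11.2, -2.8), (0.3, 1.5), (9.0, -5.5)]
candidates = select_candidates(rows, scale=10.0)
print(len(candidates), "of", len(rows), "rows fall in the Hyades range")
```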