by Nikolai V. Shokhirev
In science and technology, systems (objects) are characterized by a finite set of parameters $x_i$, $i = 1, \ldots, N$. Consequently, the measurements of these parameters can be arranged into a rectangular matrix:
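$$X = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,M} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N,1} & x_{N,2} & \cdots & x_{N,M} \end{pmatrix} \tag{1}$$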
Here $M$ is the number of experiments, and $x_{i,m}$ is the value of the $i$-th parameter in the $m$-th experiment. Each experiment corresponds to the measurement of one system, sample, individual, etc.; these terms are used interchangeably.
Examples
| Systems | Parameters |
|---|---|
| Human individuals | Age, sex, education, income, weight, height, etc. |
| Chemical solutions | Spectral intensities at selected wavelengths |
| Microchips in a control sample | Voltage and current at certain pins |
| Clinical test participants | Lab test results |
The sample (population) mean vector of parameters is defined as:
$$\mu_i = \frac{1}{M} \sum_{m=1}^{M} x_{i,m}, \qquad i = 1, \ldots, N \tag{2}$$
For each measurement the vector of deviations can be defined as:
$$d_{i,m} = x_{i,m} - \mu_i, \qquad i = 1, \ldots, N, \quad m = 1, \ldots, M \tag{3}$$
In the case of clinical research, one of the components of μ could be the average patient temperature in a hospital. The deviation of an individual patient from this average is obviously more informative than the average itself.
The vectors of deviations form the matrix D similar to the initial matrix X:
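$$D = \begin{pmatrix} d_{1,1} & d_{1,2} & \cdots & d_{1,M} \\ d_{2,1} & d_{2,2} & \cdots & d_{2,M} \\ \vdots & \vdots & \ddots & \vdots \\ d_{N,1} & d_{N,2} & \cdots & d_{N,M} \end{pmatrix} \tag{4}$$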
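As a minimal sketch of Eqs. (2)-(4), the following pure-Python snippet computes the mean vector and the deviation matrix for a small made-up data set. It assumes the layout used in this tutorial (rows are parameters, columns are experiments); the function names and the numbers are illustrative only.

```python
def mean_vector(X):
    """Mean of each parameter (row) over the M experiments (columns)."""
    return [sum(row) / len(row) for row in X]


def deviation_matrix(X):
    """Subtract each row's mean from every element of that row."""
    mu = mean_vector(X)
    return [[x - mu_i for x in row] for row, mu_i in zip(X, mu)]


# Hypothetical data: N = 2 parameters, M = 4 experiments.
X = [
    [1.0, 2.0, 3.0, 4.0],  # parameter 1
    [2.0, 4.0, 6.0, 8.0],  # parameter 2
]

mu = mean_vector(X)      # [2.5, 5.0]
D = deviation_matrix(X)  # [[-1.5, -0.5, 0.5, 1.5], [-3.0, -1.0, 1.0, 3.0]]
```

Each row of `D` sums to zero, which is a quick sanity check on the centering step.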
The sample covariance matrix is defined as averaged products of the deviation vector components:
$$C_{i,j} = \frac{1}{M} \sum_{m=1}^{M} d_{i,m}\, d_{j,m} \tag{5}$$
Here $d_{i,m}$ is the deviation of the $i$-th parameter of the $m$-th system from its mean.
Eq. (5) can be rewritten in the following matrix form:
$$C = \frac{1}{M}\, D D^{T} \tag{6}$$
The superscript "T" denotes the matrix transposition.
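The averaged-product definition can be sketched in a few lines of pure Python. This is an illustration under the tutorial's layout (rows are parameters, columns are experiments) with the $1/M$ normalization; the function name `covariance` and the data are made up for the example.

```python
def covariance(X):
    """Covariance matrix with 1/M normalization.

    X has N rows (parameters) and M columns (experiments);
    the result is an N x N matrix.
    """
    M = len(X[0])
    # Deviations of each row from its own mean.
    D = []
    for row in X:
        mu = sum(row) / M
        D.append([x - mu for x in row])
    # Element (i, j) is the averaged product of deviations of rows i and j.
    return [[sum(a * b for a, b in zip(di, dj)) / M for dj in D] for di in D]


# Hypothetical data: N = 2 parameters, M = 4 experiments.
X = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 1.0, 4.0, 3.0],
]
C = covariance(X)  # [[1.25, 0.75], [0.75, 1.25]]
```

The result is symmetric by construction, since the product of deviations does not depend on the order of the two parameters.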
The unbiased (Bessel-corrected) covariance matrix, denoted $C_{ML}$ here, differs by the factor $M/(M-1)$ from the above definition:
$$C_{ML} = \frac{M}{M-1}\, C = \frac{1}{M-1}\, D D^{T} \tag{7}$$
The advantage of this definition is that its $i$-th diagonal element is an unbiased estimate of the variance of the $i$-th parameter:
$$\sigma_i^2 = (C_{ML})_{i,i} = \frac{1}{M-1} \sum_{m=1}^{M} d_{i,m}^2 \tag{8}$$
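The effect of the two normalizations can be checked numerically. The sketch below is an illustration only: the `unbiased` flag is a hypothetical parameter introduced here, and the comparison uses `statistics.pvariance` (divides by $M$) and `statistics.variance` (divides by $M-1$) from the Python standard library.

```python
import math
import statistics


def covariance(X, unbiased=False):
    """Covariance of the rows of X; divides by M - 1 if unbiased, else by M."""
    M = len(X[0])
    D = []
    for row in X:
        mu = sum(row) / M
        D.append([x - mu for x in row])
    norm = M - 1 if unbiased else M
    return [[sum(a * b for a, b in zip(di, dj)) / norm for dj in D] for di in D]


# Hypothetical data: N = 2 parameters, M = 4 experiments.
X = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 1.0, 4.0, 3.0],
]
C = covariance(X)                     # 1/M normalization
C_unb = covariance(X, unbiased=True)  # 1/(M - 1) normalization

# The diagonals agree with the standard-library variance functions.
diag_matches_pvariance = math.isclose(C[0][0], statistics.pvariance(X[0]))
diag_matches_variance = math.isclose(C_unb[0][0], statistics.variance(X[0]))
```

With only four experiments the two estimates differ by a factor of 4/3; the difference becomes negligible as $M$ grows.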
Regardless of the covariance definition, the correlation coefficients are:
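$$R_{i,j} = \frac{C_{i,j}}{\sqrt{C_{i,i}\, C_{j,j}}} \tag{9}$$
or in matrix form:
$$R = S^{-1}\, C\, S^{-1} \tag{10}$$
Here $S$ is a diagonal matrix with the following matrix elements:
$$S_{i,j} = \delta_{i,j} \sqrt{C_{i,i}} \tag{11}$$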
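A short sketch of this normalization follows. It takes a covariance matrix as a list of lists and divides each element by the square roots of the two corresponding diagonal elements; the function name and the input matrix are made up for the illustration, and every parameter is assumed to have a nonzero variance.

```python
import math


def correlation_from_covariance(C):
    """Scale a covariance matrix to correlation coefficients.

    Assumes every diagonal element (variance) is strictly positive.
    """
    n = len(C)
    s = [math.sqrt(C[i][i]) for i in range(n)]  # standard deviations
    return [[C[i][j] / (s[i] * s[j]) for j in range(n)] for i in range(n)]


# Hypothetical covariance matrix for two parameters.
C = [[1.25, 0.75], [0.75, 1.25]]
R = correlation_from_covariance(C)  # off-diagonal elements are about 0.6

# Rescaling C by a positive factor such as M / (M - 1) cancels out,
# so either covariance definition gives the same correlations.
C_scaled = [[c * 4 / 3 for c in row] for row in C]
R_scaled = correlation_from_covariance(C_scaled)
```

The diagonal of the result is 1 (up to rounding), and the off-diagonal elements lie between -1 and 1.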
The correlation coefficient is a measure of the quality of a linear least squares fit to the original data. It lies between -1 and 1, and the closer its absolute value is to 1, the better the linear fit.
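The link to least squares can be made concrete for a pair of parameters: the squared correlation coefficient equals the coefficient of determination of the fitted straight line. The sketch below checks this on made-up data; the function names are illustrative only.

```python
import math


def pearson_r(x, y):
    """Correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared_of_line_fit(x, y):
    """Coefficient of determination of the least squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical measurements of two parameters in four experiments.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 1.0, 4.0, 3.0]

r = pearson_r(x, y)               # about 0.6
r2 = r_squared_of_line_fit(x, y)  # about 0.36, i.e. r squared
```

Here the line explains about 36% of the variance of `y`, which corresponds to a fairly weak linear relationship.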
This approach is implemented in a program called "Correlations". This program is available in the Download section.
Remark: In "Correlations" the meaning of the columns and rows is opposite to that of the tutorial.
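Data arranged as in this tutorial (rows are parameters, columns are experiments) would therefore need to be transposed before being passed to a tool that expects the opposite layout. A minimal sketch of the transposition is given below; the input file format of "Correlations" is not described here, so only the in-memory step is shown, with made-up data.

```python
def transpose(X):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(col) for col in zip(*X)]


# Tutorial layout: N = 2 parameters (rows), M = 4 experiments (columns).
X = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 1.0, 4.0, 3.0],
]

# Opposite layout: one row per experiment, one column per parameter.
X_t = transpose(X)  # [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
```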