Annotations on the Relationship Among Discriminant Functions

Different forms of discriminant functions, and the reasons for their emergence, were considered in this study. Various forms of classification problems were also examined; in each of the cases mentioned, classification was made from simple functions of the observational vector rather than from complicated regions in the higher-dimensional space of the original vector. Violation of the assumption of a common variance-covariance matrix for the Linear Discriminant Function (LDF) results in the Quadratic Discriminant Function (QDF). The relationships among the classification statistics examined were established: Anderson's (W) and Rao's (R) statistics are equivalent when the two sample sizes are equal, and when a constant is equal to 1, the W, R and John-Kudo's (Z) classification statistics are asymptotically comparable. A linear relationship is also established between the W and Z classification statistics.


Introduction
Discriminant analysis is a statistical method used for classifying objects into mutually exclusive and exhaustive groups on the basis of a set of independent variables. The method handles both two-group and multiple-group problems. It derives linear combinations of the independent variables that discriminate between the a priori defined groups, such that the error rates of misclassification are minimized as much as possible [1]. Thus, discriminant analysis provides a means of classifying objects into groups with accuracy and also determines the dimensions on which the groups differ [2]. Suppose $Y \sim N_N(\mu, \Sigma)$, where $\Sigma$ is positive definite. Then the probability density function (pdf) of $Y$ is expressed as
$$f(y) = (2\pi)^{-N/2}\,|\Sigma|^{-1/2}\exp\left[-\tfrac{1}{2}(y-\mu)'\Sigma^{-1}(y-\mu)\right].$$
Several reasons have been given for the emergence of different types of discriminant functions; notable among them are: contraventions of the assumptions underlying Fisher's Linear Discriminant Function (FLDF); efforts to reduce, as much as possible, the rates of errors of misclassification; efforts to obtain admissible methods that minimize the probabilities of misclassification; and, as an underlying issue, the testing of hypotheses.
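As a numerical illustration of the density above, the following Python sketch evaluates the multivariate normal pdf directly from the formula. The function name `mvn_pdf` and the test point are our own illustrative choices, not from the original paper.

```python
import math
import numpy as np

def mvn_pdf(y, mu, Sigma):
    """Density of Y ~ N_p(mu, Sigma), Sigma positive definite."""
    p = len(mu)
    d = y - mu
    # log|Sigma| via a numerically stable decomposition
    sign, logdet = np.linalg.slogdet(Sigma)
    quad = d @ np.linalg.solve(Sigma, d)  # (y-mu)' Sigma^{-1} (y-mu)
    return math.exp(-0.5 * (p * math.log(2 * math.pi) + logdet + quad))

mu = np.zeros(2)
Sigma = np.eye(2)
val = mvn_pdf(mu, mu, Sigma)  # density at the mean equals 1/(2*pi) when Sigma = I
```

At the mean with an identity covariance, the exponent vanishes and the density reduces to $(2\pi)^{-p/2}$, which gives a quick sanity check on the implementation.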
This study therefore considers different forms of classification statistics, instances of classification problems, and the relationships existing among the discriminant functions.

Examples of Classification Problems
(i) A large international air carrier collects data on employees in three different job classifications (customer service personnel, mechanic and dispatcher). The Director of Human Resources may wish to know if these three job classifications appeal to different personality types [3]. (ii) A number of variables are measured at five weather stations. Based on these variables, we may wish to predict the ceiling at a particular airport in 2 hours. The ceiling categories are closed, low instrument, high instrument, low open and high open. (iii) In a brand-switching study, one may wish to detect fast and slow consumers of a newly introduced product on the basis of consumer characteristics such as education, income, family size and amount of previous brand-switching. (iv) A nutritionist may desire to classify different classes of food into distinct categories of food nutrients, such as carbohydrates, fats and oils, vitamins, proteins, minerals, etc., on the basis of measurements of the amounts of different nutrients in the food. (v) Astronomers have been cataloguing distant objects in the sky using long-exposure CCD images. The objects need to be labeled as star, galaxy, nebula, etc. The data are highly noisy, the images are very faint, and the cataloguing can take decades to complete. Can an automated cataloguing process be designed to improve its effectiveness and efficiency? (vi) In a hospital, a patient is admitted with a diagnosis of myocardial infarction, and systolic blood pressure, heart rate, stroke index and mean arterial pressure are obtained by the doctor. Is it possible to predict whether the patient will survive? On the basis of these measurements, can we compute a probability of survival for the patient? [4,5]. (vii) In an anthropological study, an archeologist wishes to classify a jawbone excavated from a burial ground as having belonged to a male or a female.
Can an assignment be made on the basis of measurements, such as circumference and volume, made on jawbones from the two sets of people? [6]. (viii) A geologist obtains the mean, variance, skewness and kurtosis of the sizes of particles deposited on a beach. How can these statistics be used to determine whether the beach is wave-laid or aeolian in origin? Are there differences in particle size distribution? (ix) An emergency room in a hospital measures a number of variables, such as blood pressure, age, etc., of newly admitted patients. A decision has to be taken on whether to put a patient in an Intensive Care Unit (ICU). Because of the high cost of the ICU, those patients who may survive a month or more are given higher priority. The problem is to predict high-risk patients and discriminate them from low-risk patients [3]. (x) A credit card company receives hundreds of thousands of applications for new cards. Each application contains information on several different attributes, such as annual salary, outstanding debts, age, etc. The problem is to categorize applicants into those who have good credit, those who have bad credit, and those who fall into a gray area [3]. (xi) African or "killer" bees cannot be distinguished visually from ordinary domestic honey bees. What kind of variables based on chromatograph peaks can be used to readily identify them? [7]. (xii) A meteorologist wants to predict the cloud ceiling at time $t_1$ on the basis of physical measurements acquired at time $t_0$, where $t_0 < t_1$. In this case, it is assumed that historical data are readily available to assist in determining an assignment rule [1]. In each of the cases mentioned, we wish to classify from simple functions of the observational vector rather than from complicated regions in the higher-dimensional space of the original vector.

Methods of Classification Statistics
Different types of discriminant functions and some of their properties are appraised in this study:

Linear Discriminant Function (LDF)
The Linear Discriminant Function (LDF) is constructed as
$$L(x) = (\bar{x}_1 - \bar{x}_2)' S^{-1} x,$$
where $\bar{x}_1$, $\bar{x}_2$ and $S$ are estimates of $\mu_1$, $\mu_2$ and $\Sigma$, respectively. Provided the assumptions specified above are satisfied, the Linear Discriminant Function (LDF) provides the optimal assignment rule, in that it cannot be improved upon and the errors of misclassification are minimized. However, when some or all of the assumptions are violated, it is of interest to researchers to determine the effects of the violation on procedures that use the LDF. Fisher [8], Welch [9] and Wald [10] established the optimal properties of the LDF for two-group classification when the populations are multivariate normally distributed.
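A minimal sketch of the LDF rule with sample estimates (pooled covariance, midpoint cutoff) might look as follows; the function name, the synthetic data and the seed are illustrative assumptions, not from the paper.

```python
import numpy as np

def ldf_coefficients(X1, X2):
    """Estimate LDF coefficients from two training samples.

    X1, X2: (n_i, p) arrays from groups 1 and 2.
    Returns (a, cutoff): assign x to group 1 when a @ x >= cutoff.
    Assumes a common covariance matrix, estimated by pooling.
    """
    n1, n2 = len(X1), len(X2)
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Pooled within-group covariance estimate S
    S = ((n1 - 1) * np.cov(X1, rowvar=False)
         + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
    a = np.linalg.solve(S, m1 - m2)      # S^{-1}(xbar1 - xbar2)
    cutoff = 0.5 * a @ (m1 + m2)         # midpoint between group means
    return a, cutoff

# Two well-separated synthetic groups
rng = np.random.default_rng(0)
X1 = rng.normal(loc=[2.0, 0.0], scale=1.0, size=(50, 2))
X2 = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2))
a, c = ldf_coefficients(X1, X2)
x_new = np.array([1.8, 0.1])             # clearly closer to group 1
group = 1 if a @ x_new >= c else 2
```

The midpoint cutoff corresponds to equal prior probabilities and equal misclassification costs; other cutoffs shift the boundary toward one group.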

Quadratic Discriminant Function (QDF)
When the assumption of equal covariance matrices in the two populations is violated, the QDF arises, and its derivation is established using the likelihood ratio rule. If the parameters are known, the classification statistic is expressed as
$$Q(x) = \tfrac{1}{2}\ln\frac{|\Sigma_2|}{|\Sigma_1|} + \tfrac{1}{2}\left[(x-\mu_2)'\Sigma_2^{-1}(x-\mu_2) - (x-\mu_1)'\Sigma_1^{-1}(x-\mu_1)\right].$$
The robustness of the QDF was studied by Gilbert [11] for the case $\Sigma_1 = \Sigma$ and $\Sigma_2 = m\Sigma$ ($m$ a constant), and also by Lachenbruch, et al. [12] in respect of non-normality. The statistic (QDF) is optimal when the population parameters are known and $\Sigma_1 \neq \Sigma_2$.
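The known-parameter QDF rule above can be sketched as follows; the parameter values (including the proportional case $\Sigma_2 = 4\Sigma_1$, in the spirit of the Gilbert setup) are illustrative assumptions.

```python
import numpy as np

def qdf_score(x, mu1, Sigma1, mu2, Sigma2):
    """Quadratic discriminant score for known parameters.

    Assign x to population 1 when the score is positive
    (likelihood ratio rule for two multivariate normals
    with unequal covariance matrices).
    """
    d1, d2 = x - mu1, x - mu2
    _, logdet1 = np.linalg.slogdet(Sigma1)
    _, logdet2 = np.linalg.slogdet(Sigma2)
    q = 0.5 * (logdet2 - logdet1)
    q += 0.5 * (d2 @ np.linalg.solve(Sigma2, d2)
                - d1 @ np.linalg.solve(Sigma1, d1))
    return q

mu1, mu2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
S1 = np.eye(2)
S2 = 4.0 * np.eye(2)        # proportional covariances, m = 4
x = np.array([0.9, 0.1])    # near the mean of population 1
score = qdf_score(x, mu1, S1, mu2, S2)
assign = 1 if score > 0 else 2
```

When $\Sigma_1 = \Sigma_2$, the log-determinant term vanishes and the quadratic terms in $x$ cancel, so the score reduces to a linear function of $x$, which is exactly how the QDF collapses to the LDF.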

Anderson's Statistic (W)
The statistic arose from Anderson's derivation for two populations that are multivariate normally distributed with different means and a common covariance matrix [13,14]. When the population parameters are known, W is defined as
$$W = (\mu_1 - \mu_2)'\Sigma^{-1}\left[x - \tfrac{1}{2}(\mu_1 + \mu_2)\right].$$
The statistic (W) differs from the LDF by a constant, $\tfrac{1}{2}(\mu_1 - \mu_2)'\Sigma^{-1}(\mu_1 + \mu_2)$.
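A short sketch of the W statistic in its known-parameter form follows; the function name and the numerical values are illustrative assumptions.

```python
import numpy as np

def anderson_w(x, mu1, mu2, Sigma):
    """Anderson's W statistic with known parameters.

    W(x) = (mu1 - mu2)' Sigma^{-1} [x - (mu1 + mu2)/2];
    assign x to population 1 when W >= 0.
    """
    a = np.linalg.solve(Sigma, mu1 - mu2)
    return a @ (x - 0.5 * (mu1 + mu2))

mu1 = np.array([2.0, 0.0])
mu2 = np.array([0.0, 0.0])
Sigma = np.eye(2)
x = np.array([1.5, 0.0])   # halfway-plus toward population 1
w = anderson_w(x, mu1, mu2, Sigma)  # (mu1-mu2)'[x - (1,0)'] = 2 * 0.5 = 1.0
```

Note that `anderson_w` and the LDF score $(\mu_1-\mu_2)'\Sigma^{-1}x$ differ only by the constant $\tfrac{1}{2}(\mu_1-\mu_2)'\Sigma^{-1}(\mu_1+\mu_2)$, so they induce the same ordering of observations and differ only in where the cutoff sits.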

Best Linear Classification Statistic (BLCS)
The statistic was introduced by Clunies-Ross and Riffenburgh [15] and Anderson and Bahadur [16] under the same conditions as the QDF (unequal covariance matrices). The statistic is expressed as
$$b'x, \qquad b = \left[t\Sigma_1 + (1-t)\Sigma_2\right]^{-1}(\mu_1 - \mu_2), \quad 0 < t < 1,$$
where the constant $t$ is chosen to optimize the probabilities of misclassification.
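For a fixed $t$, the direction vector of the best linear rule can be computed as below; the function name `blcs_direction`, the fixed-$t$ interface and the parameter values are illustrative assumptions (in practice $t$ is tuned to the misclassification criterion).

```python
import numpy as np

def blcs_direction(mu1, Sigma1, mu2, Sigma2, t):
    """Direction vector b of an Anderson-Bahadur-type linear rule.

    For t in (0, 1), b = [t*Sigma1 + (1-t)*Sigma2]^{-1} (mu1 - mu2);
    varying t traces out the admissible family of linear rules.
    """
    M = t * Sigma1 + (1.0 - t) * Sigma2
    return np.linalg.solve(M, mu1 - mu2)

mu1, mu2 = np.array([1.0, 0.0]), np.array([0.0, 0.0])
S1, S2 = np.eye(2), 2.0 * np.eye(2)
b = blcs_direction(mu1, S1, mu2, S2, t=0.5)
# M = 1.5*I, so b = (2/3, 0)
```

When $\Sigma_1 = \Sigma_2 = \Sigma$, the weighted combination reduces to $\Sigma$ for every $t$, and $b$ coincides with the LDF direction $\Sigma^{-1}(\mu_1 - \mu_2)$.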