The Use of Classification Methods in Market Research
Classification is one of the methods most often used in machine learning and statistical analysis. It assigns observations to groups according to their similarities and is applied in many fields, including direct marketing, e-commerce, market segmentation, medical and social sciences, and environmental research. In direct marketing, for instance, where customers are acquired and retained through direct, interactive communication, classification is useful for sorting products into meaningful categories, which helps to reveal consumers' buying behavior. This in turn aids the development of strategies for an effective marketing plan that promotes a business's products or services.
Apart from classification, identifying the relationship between a response variable and a set of explanatory variables is a common research problem in data mining. In most situations, multiple regression is used, especially when the variables involved are quantitative and measured on a continuous scale. However, when the variables involved are categorical, such as nominal or ordinal variables, multiple regression may be inappropriate because the assumptions required for a valid regression are not met. Market researchers often face this situation, since they frequently work with categorical variables such as gender, education level, and race. For example, a direct marketer who sells magazine subscriptions would like to maximize profit by identifying the household segments to which potential customers belong. Doing so reveals the characteristics of the potential customers who are most likely to respond when a subscription promotion is sent. In this case, a method called CHAID (Chi-squared Automatic Interaction Detector) can be applied to perform segmentation modelling, in which the population is divided into segments that differ with respect to a designated criterion. In data mining, CHAID serves as a predictive model for drawing conclusions about a target (dependent) variable from a set of predictors, and it also helps to uncover relationships between variables.
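The core idea behind a CHAID split can be illustrated with a small Python sketch. This is a simplified illustration, not a full CHAID implementation: it only computes the Pearson chi-squared statistic between each categorical predictor and the response and picks the predictor with the strongest association, whereas CHAID proper also merges similar categories and compares adjusted p-values. The household records, the column names (`owns_home`, `gender`, `responded`) and the helper names are all hypothetical.

```python
import math
from collections import Counter

def chi_squared(rows, predictor, target):
    """Pearson chi-squared statistic for the predictor-by-target table."""
    n = len(rows)
    observed = Counter((r[predictor], r[target]) for r in rows)
    row_totals = Counter(r[predictor] for r in rows)
    col_totals = Counter(r[target] for r in rows)
    stat = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n  # count expected under independence
            stat += (observed[(a, b)] - expected) ** 2 / expected
    return stat

def p_value_2x2(stat):
    """Upper-tail p-value for 1 degree of freedom (valid for 2x2 tables only)."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical data: (owns_home, responded, number of women, number of men)
counts = [("yes", 1, 8, 7), ("yes", 0, 2, 3), ("no", 1, 2, 3), ("no", 0, 8, 7)]
rows = []
for owns_home, responded, women, men in counts:
    base = {"owns_home": owns_home, "responded": responded}
    rows += [dict(base, gender="F")] * women
    rows += [dict(base, gender="M")] * men

# Score each candidate predictor against the response variable.
scores = {p: chi_squared(rows, p, "responded") for p in ("owns_home", "gender")}

# Both tables are 2x2, so the largest statistic is also the smallest p-value.
best = max(scores, key=scores.get)
print(scores, best, p_value_2x2(scores[best]))
```

In this made-up sample, home ownership is associated with the response while gender is not, so the first split would segment households by home ownership. Comparing raw statistics is only safe here because both predictors have the same number of categories; with differing numbers of categories the p-values would have to be compared instead.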
In this study, the CHAID classification tree is compared with Logistic Regression (LR) to see which method best describes the bank direct marketing data. Both methods are used because they are comparable in meeting the objectives of the study. LR is a statistical method for analyzing how a set of independent variables determines a binary outcome, that is, an outcome with only two possible values. The dependent variable is dichotomous and is coded as 1 for success (occurrence of an event) or 0 for failure (nonoccurrence of an event). The coefficients of the fitted logistic regression equation predict a logit transformation of the probability that the characteristic of interest is present. Because the bank direct marketing data have a binary dependent variable, both LR and the CHAID classification tree are appropriate for analyzing them.
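The logit transformation mentioned above can be made concrete with a short Python sketch. It assumes the standard form of the model, in which the log-odds of success are a linear function of the predictors: logit(p) = ln(p / (1 - p)) = b0 + b1*x1 + ... + bk*xk. The intercept, coefficients and feature values below are hypothetical and are not estimated from the bank data.

```python
import math

def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def predict_probability(intercept, coefficients, features):
    """Probability of the outcome coded 1, given a linear predictor."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))  # inverse of the logit transformation

# Hypothetical coefficients for two predictors (illustrative values only).
intercept = -1.0
coefficients = [0.8, 0.5]
features = [1, 2]

p = predict_probability(intercept, coefficients, features)
print(p, logit(p))  # logit(p) recovers the linear predictor
```

Applying `logit` to the predicted probability returns the linear combination of the coefficients and predictors, which is why each coefficient can be read as the change in log-odds for a one-unit change in its predictor.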