Data Mining Principles

Assignment 3

Part 1

Use the GermanCredit data, available in R from the caret package or from the UCI Machine Learning Repository website.

Latent Class Analysis Homework

1. Perform latent class analysis of only the categorical variables for market segmentation, using the poLCA function in the poLCA package. Remember: local optima are a serious problem for all clustering and latent class methods. The data for analysis should include only the variables that you think have business relevance for market segmentation.

2. Fit solutions with 2, 3, ..., K classes/clusters. Remember to run each solution from multiple random starts. Use the AIC criterion and graph-based interpretation to interpret the LCA solutions.

3. Perform test validation of the LCA solution.

a. For the test set, pass the class-conditional probabilities (probs) from the training-set LCA solution as input to probs.start, so that the training solution is the starting point for the test fit. Use the similarity of relative class sizes and of class-conditional probabilities between training and test as measures of stability.

4. Look at the marginal distributions in the plot. Try to give a name to each of the classes.

5. Was naming the classes easy? What was the difficulty? State in a few sentences.

Each of the 5 parts is worth 1 point. TOTAL: 5 points.
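
Steps 2 and 3 above rest on two small calculations: the AIC used to compare class solutions, and a comparison of relative class sizes between training and test. The following is a minimal Python sketch of that arithmetic only, assuming an LCA without covariates, so that the number of free parameters is (K - 1) + K * sum(c_j - 1) for K classes and c_j categories in variable j. The function names and the category counts in the usage lines are hypothetical, and the model itself would still be fitted with poLCA in R.

```python
def lca_num_params(n_classes, categories_per_variable):
    """Free parameters of an LCA without covariates (assumption).

    (K - 1) class-size parameters plus, for each class, (c_j - 1)
    conditional probabilities per categorical variable j.
    """
    class_size_params = n_classes - 1
    conditional_params = n_classes * sum(c - 1 for c in categories_per_variable)
    return class_size_params + conditional_params


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values are preferred."""
    return -2.0 * log_likelihood + 2.0 * n_params


def max_class_size_gap(train_sizes, test_sizes):
    """Largest absolute difference in relative class sizes.

    Assumes class k in training corresponds to class k in test, which is
    the intent of starting the test fit from the training probabilities.
    """
    return max(abs(a - b) for a, b in zip(train_sizes, test_sizes))


# Hypothetical category counts for three segmentation variables.
categories = [4, 5, 3]
for k in range(2, 5):
    print(k, "classes ->", lca_num_params(k, categories), "free parameters")
```

Because the parameter count grows linearly in K, the AIC penalty grows with every added class; a solution with more classes is preferred only if its best log-likelihood across random starts improves by more than that penalty.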

Part 2

Please use the Boston Housing data for this assignment.

Principal Components Homework

1. Split the sample into two random samples of sizes 70% (training) and 30% (test).

2. Perform principal components analysis of the numeric variables from the Boston Housing data on the training sample.

3. Generate scree plots and select the number of components you would retain.

4. Plot Component 1 loadings (x-axis) versus Component 2 loadings (y-axis). Use this plot to interpret and name the components. Repeat this by plotting Component 1 separately versus each of the other components you decided to retain in Step 3 (Component 3, Component 4, etc.). Can you interpret each of the components you decided to retain? If a component is not interpretable, note that.

5. Perform the following:

a. Show that component loadings are orthogonal.

b. Show that component scores are orthogonal.

c. Perform test validation of the principal components solution.

i. For test validation, you will have to

1. Predict the component scores in the test sample (using the predict() function in R or the transform function in Python).

2. Matrix-multiply the predicted component scores from (1) above by the transpose of the component loadings you derived from the training data set in Step 2 above. Refer to Page 52 of the Class Lecture for Session 4 for details.

d. Compute the Variance Accounted For (R²) in the test sample. That yields a measure of test performance.

e. [OPTIONAL] Rotate the component loadings using varimax rotation. Look at the loadings from the varimax rotation. Do they yield a different interpretation of the principal components? Python users: the current package is a month old and buggy.

f. [OPTIONAL] Plot rotated loadings (1) versus rotated loadings (2) and (3). Do you think principal components reduced this data a lot? Do you like the solution?
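
Steps 1, 2 and 5a to 5d above can be sketched in Python with NumPy as follows. This is an illustrative outline under stated assumptions, not the course's reference solution: the data are synthetic stand-ins for the numeric Boston Housing columns, "loadings" are taken to be the unit-length eigenvectors (what R's prcomp calls rotation), the variables are standardized with training means and standard deviations, k = 2 is an arbitrary choice, and R² is computed as 1 - SS_residual / SS_total on the standardized test data, which is only one of several possible definitions of Variance Accounted For.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the numeric columns (assumption, not real data).
n, p = 200, 5
latent = rng.normal(size=(n, 2))
mixing = rng.normal(size=(2, p))
X = latent @ mixing + 0.3 * rng.normal(size=(n, p))

# Step 1: random 70% / 30% split.
idx = rng.permutation(n)
n_train = int(0.7 * n)
train, test = X[idx[:n_train]], X[idx[n_train:]]

# Standardize both samples with TRAINING statistics only.
mu = train.mean(axis=0)
sd = train.std(axis=0, ddof=1)
Z_train = (train - mu) / sd
Z_test = (test - mu) / sd

# Step 2: PCA on the training sample via the singular value decomposition.
U, s, Vt = np.linalg.svd(Z_train, full_matrices=False)
k = 2                            # number of retained components (assumption)
loadings = Vt[:k].T              # p x k, columns are unit eigenvectors
scores_train = Z_train @ loadings

# Step 5a: loadings are orthonormal, so this is (numerically) the identity.
loadings_gram = loadings.T @ loadings

# Step 5b: training scores are orthogonal, so off-diagonals are ~0.
scores_gram = scores_train.T @ scores_train

# Step 5c: predict test scores, then map back with the transposed loadings.
scores_test = Z_test @ loadings
Z_test_hat = scores_test @ loadings.T

# Step 5d: variance accounted for in the test sample.
ss_res = np.sum((Z_test - Z_test_hat) ** 2)
ss_tot = np.sum(Z_test ** 2)
r2_test = 1.0 - ss_res / ss_tot

print("loadings orthonormal:", np.allclose(loadings_gram, np.eye(k)))
print("test R^2:", round(float(r2_test), 3))
```

The test scores are computed with the training means, standard deviations and loadings, so no information from the test sample leaks into the fitted solution; comparing the test R² with the share of variance explained by the same k components in training gives a sense of how well the solution generalizes.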

