美国PYTHON代写-Financial analysi代写-data science代写
美国PYTHON代写

美国PYTHON代写-Financial analysi代写-data science代写

Case analysis

美国PYTHON代写 Family is a discrete variable, with number ranged in [1,2,3,4]. All the other variables are continuous variables.

Part (a) Data Cleaning and Basic Data Exploration

I have imported the excel data using pandas library in python. The basic descriptive statistics for columns with numerical data is shown in the table below:

Age Experience Income Family CCAvg Mortgage Personal Loan Brokerage Account GIC Account Online CreditCard
count 4499 4499 4499 4499 4499 4499 4499 4499 4499 4499 4499
mean 47.2683 77.2683 74.3034 2.39809 1.95183 56.0856 0.10669 0.108691 0.0649033 0.592132 0.296733
std 11.4547 11.4547 46.3106 1.14704 1.75648 101.788 0.308754 0.311286 0.246383 0.491493 0.456868
min 25 55 8 1 0 0 0 0 0 0 0
25% 37 67 39 1 0.7 0 0 0 0 0 0
50% 47 77 64 2 1.6 0 0 0 0 1 0
75% 57 87 99 3 2.6 99.5 0 0 0 1 1
max 69 99 224 4 10 617 1 1 1 1 1

From the descriptive statistics, we can see that there are a few binary data:

  • Personal Loan
  • Brokerage Account
  • GIC Account
  • Online
  • Credit Card

Family is a discrete variable, with number ranged in [1,2,3,4]. All the other variables are continuous variables. From the distribution of the data, we observed that little clients have mortgages, and their income are more likely to be in power-law distribution (skewed) rather than normal.

Part (b) visualization  美国PYTHON代写

Visualization 1: Correlations between features

The first visualization is the pairwise scatter plot and histogram for the numerical and boolean features.

The above is a pairwise scatter plot of some numerical and boolean features. We can see that:

1.Experience and Age are basically correlated perfectly, we will drop one of them.

2.The Income level doesn’t correlate to Mortgage, Family Size and use of other financial products in the bank including credit card.

Visualization 2: features grouped by result

Since we want to see which type of clients are more likely to accept personal loan promotions, we therefore grouped by the result and plotted the distribution for each features above. Note that we saw Experience are highly correlated with Age, then we removed this feature.

美国PYTHON代写
美国PYTHON代写

Above is the overlayed histogram, grouped by Personal Loan

We can see that:

1.People with higher income are more likely to use Personal Loan.

2.The likelihood of getting a personal loan increases with the family size .

3.People with higher CCAvg are more likely to use Personal Loan.

Visualization 3: Does bank branch make a difference

We are also interested in whether some branches are better at promoting personal loans to their clients. Therefore we showed each branch’s total mortgage outstanding, as well as the percentage of clients who accepted the personal loan promotion:

we can see that Branches with larger business volume don’t typically have a higher conversion rate. In fact, the conversion rate is unrelated to the size of their business.

Visualization 4: Does adviser make a difference

The same visualization then applied to advisors.

美国PYTHON代写
美国PYTHON代写

Similarly, advisors with most clients (in terms of mortgage demand) didn’t have the highest conversion rate as well.

Part (c) Business analytics  美国PYTHON代写

We will group by Advisor name and then calculate how successful each advisor was in terms of:

  • selling most Personal Loan products
  • highest conversion rate

And below are the 10 most successful advisors by each of the standard

Sold most personal loan products

Advisor Name Personal Loan
Gita Pinelli 16
Kathaleen Horgan 15
Prudence Masters 15
Jacqueline Leveque 14
Corazon Eastin 13
Erik Clinard 12
Vicki Sowers 12
Val Sauceda 12
Siobhan Flaugher 12
Carylon Race 12

Highest conversion rate

Advisor Name Personal Loan
Prudence Masters 0.182927
Kathaleen Horgan 0.159574
Gita Pinelli 0.152381
Eulah Kicklighter 0.148148
Jaquelyn Cubbage 0.144578
Weston Jeon 0.142857
Jacqueline Leveque 0.142857
Val Sauceda 0.139535
Corazon Eastin 0.138298
Carylon Race 0.136364

Part (d) Artificial Intelligence  美国PYTHON代写

In this part we will try to fit 3 different models to the data and predict whether a customer will respond positively to a promotion activity and apply for a Personal Loan (Personal Loan).

The features we used are based on historical behavior of the customers:

  • Age
  • Experience
  • Income
  • Family
  • CCAvg
  • Mortgage
  • Brokerage Account
  • GIC Account
  • Online
  • CreditCard

And we will classify the customers into Personal Loan = 1 and Personal Loan = 0. The dataset will be randomly split into 80% training set and 20% testing set. Because this result is binary, so we tried 3 different models:

Logistic Regression

The logistic regression model got the following classification result:

precision    recall  f1-score   support

0       0.94      0.98      0.96       802
1       0.70      0.47      0.56        98

accuracy                           0.92       900
macro avg       0.82      0.72      0.76       900
weighted avg       0.91      0.92      0.91       900

It has an average accuracy of 92%, which is not bad, however we can see that the recall rate for people who would want a personal loan product is merely 47%. That means we will fail to locate more than half of selling opportunities.  美国PYTHON代写

Support Vector Machine

The SVM model got the following results:

precision    recall  f1-score   support

0       0.90      0.99      0.94       802
1       0.67      0.10      0.18        98

accuracy                           0.90       900
macro avg       0.78      0.55      0.56       900
weighted avg       0.88      0.90      0.86       900

The performance of SVC is not as good as logistic regression. It had a strong bias towards personal_loan=0 result.

Random forest

Finally, we tried the random forest model and it appeared to be the best model among the 3:

precision    recall  f1-score   support

0       0.96      0.99      0.97       802
1       0.92      0.62      0.74        98

accuracy                           0.95       900
macro avg       0.94      0.81      0.86       900
weighted avg       0.95      0.95      0.95       900

The random forest model performed the best, with on average 95% accuracy and most importantly, 62% recall on people who “would” actually apply for Personal Loan, which is the highest of 3 candidate models.

 

更多代写:澳大利亚网课代考价格  sat替考  软件工程专业Assignment代写  美国report论文代写  美国大学论文字数  英国大学网课

合作平台:essay代写 论文代写 写手招聘 英国留学生代写

美国PYTHON代写
美国PYTHON代写

发表回复