Module title: FIN30290 Recent Research Topics in Finance
Group assignment title: Bank telemarketing and machine learning
Instructions
You should work in a group of circa 5 students, and each group should nominate one individual to submit a single project report. The maximum word count for the report is 3,000 words.¹ The front page of the submitted assignment should detail the module title, the membership of the group (i.e., students’ names and numbers), the assignment title and the assignment word count. Please submit this project by Friday, May 6 via Brightspace ‘Assessment’ and the ‘Assignment’ Project. In addition, email your report and program code to: [email protected] and cc: [email protected]. State ‘BSc Group Assignment: Bank telemarketing and machine learning’ in the subject line of the email, and cc all group members. Please do not collaborate across groups on this assignment.
Assessment and grades
You will be assessed on your ability to respond to questions raised below, i.e., to use machine learning informed alert model algorithms, critically evaluate the performance of these methods, and coherently report your findings. This project counts for 40% of your overall module grade.
Assignment Context
An important source of income for banks is the term deposit, i.e., a deposit made by a customer at a fixed rate for a fixed time. The bank can use this capital to disburse loans at a higher interest rate. Banks therefore use marketing techniques (for example email, advertisement, telephonic and digital marketing) to encourage customers to save via term deposits. Telephonic marketing (i.e., phone calls) remains an effective way to acquire term deposit customers, especially if enabled with machine learning. Banks can use data and machine learning informed alert models to identify customers who are more likely to save via a term deposit, and to inform a telephonic marketing campaign accordingly.
The dataset in this assignment relates to the direct telemarketing campaigns (phone calls) of a European banking institution. You can find the data for the project on Brightspace (MyLearning \ Group Assignment) and the variable descriptions below.
The classification goal is to predict if the customer will subscribe to a term deposit (Term Deposit = 1). Tapping into the repertoire of your Machine Learning modelling, evaluation and deployment knowledge, provide recommendations to the bank’s Retail Marketing department to achieve its goal.
¹ Word count includes an assignment’s references section.
Questions
1.(a) Fit a logistic regression model on the dataset. Choose probability thresholds of 10%, 20%, 35% and 50% to assign an observation to the Term Deposit = 1 class. Compute a confusion matrix for each of the models. How do the True Positive and False Positive rates vary over these models? Which model would you choose?
(b) Divide the dataset into training (70%) and test (30%) sets, repeat the above question and report the performance of these models on the test set.
(c) Plot the ROC for a logistic model on a graph and compute the AUC. Explain the information conveyed by the ROC and the AUC metrics. [8 marks]
2.(a) Fit classification tree, bagging and random forest models on the dataset and comment on the performance of these models. Do you think we are overestimating the performance of these models by fitting them to the whole dataset? If so, state your reasons.
(b) Split the dataset in two parts: training (70%) and test (30%) sets. Fit the models on the training dataset and evaluate their performance on the test set. Which model would you choose and why?
(c) For the best model chosen, rank and plot predictors according to their predictive power.
(d) How do these models perform compared to the model in question 1? [9 marks]
3.(a) Standardize your predictors and fit a KNN classifier with K equal to 1, 3, 5 and 10, respectively. Evaluate the performance of these models on the test set.
(b) How do these models perform compared to the tree-based models in question 2 and the logistic model of question 1? [8 marks]
4.(a) Fit at least one other binary classifier (e.g. a linear probability model or a Support Vector Machine classifier) to the dataset. Describe its performance relative to the classifiers highlighted above.
(b) Is your training dataset balanced? Comment on the drawbacks of fitting a Machine Learning technique on an unbalanced dataset. Can you identify and deploy a technique to address this concern? If so, why do you think that the method(s) could work? Hint: It is up to each student group to search for a systematic understanding and solution to the phenomenon of imbalanced data. [15 marks]
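The threshold comparison in question 1 can be illustrated with a minimal sketch. It assumes predicted probabilities are already available from some fitted classifier; the labels and scores below are made up for illustration and are not taken from the assignment dataset, and the function names (`confusion_counts`, `tpr_fpr`, `roc_auc`) are hypothetical helpers, not part of any library.

```python
def confusion_counts(y_true, p_hat, threshold):
    """Count (TP, FP, TN, FN), predicting class 1 when p_hat >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, p_hat):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def tpr_fpr(y_true, p_hat, threshold):
    """True Positive rate and False Positive rate at one threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, p_hat, threshold)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


def roc_auc(y_true, p_hat):
    """Area under the ROC curve via the trapezoid rule.

    One ROC point is computed per distinct score, from the strictest
    threshold to the most lenient, starting from the origin (0, 0).
    """
    points = [(0.0, 0.0)]
    for t in sorted(set(p_hat), reverse=True):
        tpr, fpr = tpr_fpr(y_true, p_hat, t)
        points.append((fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up example: 3 subscribers (1) and 7 non-subscribers (0).
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
p_hat = [0.62, 0.08, 0.41, 0.27, 0.12, 0.18, 0.55, 0.05, 0.31, 0.15]

for threshold in (0.10, 0.20, 0.35, 0.50):
    tpr, fpr = tpr_fpr(y_true, p_hat, threshold)
    print(f"threshold={threshold:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
print(f"AUC={roc_auc(y_true, p_hat):.3f}")
```

Lowering the threshold can only keep or increase both rates, which is the trade-off the ROC curve traces out. With real model output the same functions would be applied to the test-set probabilities rather than to these toy values.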
Description of the Dataset
Variable Name | Description | Category |
Term deposit | Has the client subscribed to a term deposit? 1 if yes, 0 if no. | Binary (‘1’, ‘0’) |
Age | Age of Customer in Years | Numeric |
Job | Job Status | Categorical |
Marital | Marital Status | Categorical |
Education | Level of education | Categorical |
Default | Default Status – is there a history of default | Categorical |
Housing | Has availed Housing Loan? | Categorical |
Loan | Has availed Personal Loan? | Categorical |
Contact | Contact communication type (landline vs mobile phone) | Categorical |
Month | Last contact month of year | Categorical |
Day_of_week | Last contact day of the week | Categorical |
Duration | Last contact duration, in seconds # | Numeric |
Campaign | # of contacts performed during this campaign and for this client (includes last contact) | Numeric |
Pdays | # of days that passed by after the client was last contacted from a previous campaign (999 means client was not contacted in a previous campaign) | Numeric |
Previous | # of contacts performed before this campaign and for this client | Numeric |
Ethnicity | Is the customer of African ethnicity? 1=Yes, 0=No (Caucasian is the reference ethnic category) | Numeric |
Poutcome | Outcome of the previous marketing campaign | Categorical |
Emp.var.rate | Employment variation rate: quarterly indicator | Numeric |
Cons.price.idx | Consumer price index: monthly indicator | Numeric |
Cons.conf.idx | Consumer confidence index: monthly indicator | Numeric |
Euribor (3m) | Euribor 3 month rate | Numeric |
Nr.employed | # of employees: quarterly indicator | Numeric |
# Duration: telephone call duration of the Term Deposit outcome call, in seconds (numeric). Important note: this attribute can strongly impact the outcome variable (e.g., if duration=0 then Term Deposit=‘0’). Thus, this input should only be included for benchmark purposes and should be discarded if the intention is to derive a pragmatic predictive model.
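The note on Duration and the 999 code in Pdays can be handled before modelling. The sketch below is one possible way to do this, assuming each customer record is held as a Python dictionary keyed by the variable names in the table. The function `prepare_row`, the `benchmark` flag and the added `Contacted_before` indicator are hypothetical names introduced for illustration only.

```python
def prepare_row(row, benchmark=False):
    """Return a modified copy of one customer record.

    benchmark=False drops Duration, since call length is only known
    once the outcome call has taken place.
    benchmark=True keeps Duration for benchmark comparisons.
    """
    out = dict(row)  # copy, so the original record is left unchanged
    if not benchmark:
        out.pop("Duration", None)
    # Pdays == 999 is a code for "not contacted in a previous campaign",
    # not a real number of days, so expose it as a 0/1 indicator
    # (assumed name: Contacted_before).
    out["Contacted_before"] = 0 if out["Pdays"] == 999 else 1
    return out


# Made-up record for illustration; values are not from the dataset.
example = {"Age": 41, "Duration": 210, "Pdays": 999, "Term deposit": 0}
print(prepare_row(example))
print(prepare_row(example, benchmark=True))
```

Keeping the benchmark switch explicit makes it easy to report both versions of a model side by side, with Duration included only in the benchmark run.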