MS6711 Data Mining

 

 

Assignment 1

  • This is an individual assignment. You are expected to finish the work by yourself. Discussions between classmates are encouraged, but you must not cross the line from discussion to collusion.
  • This assignment must be done with SAS EM 14.2 (or later) in the 64-bit Windows environment. The diagram of the project must be encoded in the ‘wlatin1 western’ language. If your submitted file cannot be unzipped or the EM project cannot be reopened for any reason, your marks for the assignment will be 0.
  • This assignment is due on 24 February 2019 at 11:55pm. See the submission steps at the end of this document. Late submissions will not be accepted for any reason.

Assignment Description

The CityU Bookstore offers its customers books in a number of categories. CityU runs a customer loyalty program through which registered customers are eligible for discounts and occasional gifts. CityU has 50,000 registered customers. To stimulate sales, CityU launched a sales campaign in which 5,000 customers, selected at random, received $100 gift vouchers. The vouchers were valid for a specific period of 10 days. The following simplifying assumptions are made for this assignment.

  • All purchases made by customers who received the voucher during the campaign period (20-31 December 2018) are assumed to be direct results of the campaign.
  • The value of the purchases is ignored in determining the response behavior; that is, it is immaterial whether the purchases exceeded the $100 value during the campaign period.

The following SAS data sets are available for this assignment:

  • Customer.sas7bdat contains the details (city of residence, gender, customer ID, date of birth (DOB), and loyalty program enrollment date (EnrolDate)) of the 50,000 registered customers.
  • PurchaseRecords.sas7bdat holds the aggregated purchase records of each customer in the year 2018. Each record has the following variables:

    Variable            Description
    CustID              Customer ID.
    CompBooks           Number of computer books purchased before 20 December 2018.
    ChildBooks          Number of children's books purchased before 20 December 2018.
    TravelBooks         Number of travel books purchased before 20 December 2018.
    FictionBooks        Number of fiction books purchased before 20 December 2018.
    DiyBooks            Number of DIY books purchased before 20 December 2018.
    Otherbooks          Number of other books purchased before 20 December 2018.
    CompSpnd            Amount spent on computer books before 20 December 2018.
    ChildSpnd           Amount spent on children's books before 20 December 2018.
    TravelSpnd          Amount spent on travel books before 20 December 2018.
    FictionSpnd         Amount spent on fiction books before 20 December 2018.
    DiySpnd             Amount spent on DIY books before 20 December 2018.
    OtherSpnd           Amount spent on other books before 20 December 2018.
    PayMtd1 – PayMtd5   Number of times each payment method (1-5) was used for books purchased before 20 December 2018.
    PaySum1 – PaySum5   Amount paid with each payment method (1-5) for books purchased before 20 December 2018.
    SpndPrm             Amount spent during 20-31 December 2018.

  • Campaign.sas7bdat contains the 5,000 IDs of the customers who received the $100 voucher.


You are asked to perform the following tasks and run all involved nodes:

  1. Create an EM project with the following settings:
    a. Name the project Assignment1.
    b. Create a file folder named Rawdata within the project folder Assignment1.
    c. Put the data sets Customer.sas7bdat, PurchaseRecords.sas7bdat, and Campaign.sas7bdat into the folder Rawdata.
    d. Write and save appropriate statements in the Start Code property of the project to assign a SAS library named Datasets to the folder Rawdata.
  2. Import the above 3 data sets into the project via the library Datasets. Assign an appropriate measurement level and role to each variable according to the nature of the variable.
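
For the Start Code property described above, a single LIBNAME statement is usually enough. The sketch below assumes the project folder sits under C:\EMProjects, which is a hypothetical location; substitute the actual path of your own Assignment1 folder.

```sas
/* Start Code sketch: assign the library Datasets to the Rawdata folder.  */
/* ASSUMPTION: the project folder is C:\EMProjects\Assignment1. This path */
/* is illustrative only and must be replaced with your own.               */
libname Datasets "C:\EMProjects\Assignment1\Rawdata";
```

A SAS library reference can be at most 8 characters long, so Datasets is exactly at the limit.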


  3. Create a diagram named Diagram1 in the project. Then drag the three data sets into the diagram and run each data node. All subsequent tasks must be completed inside this diagram.
  4. Perform the following activities for the Customers Input Data node.
    a. Connect the Customers Input Data node to a Replacement node. Use the Replacement node to replace every value of the variable Gender other than Male and Female with Unknown. Do not change the values of other variables. How many records were replaced? Type your answer in a SAS Code node. {Hints: There is no need to connect the SAS Code node to other nodes. Type your answer in its Training Code window. Confirm the save before closing the node. Do not run the SAS Code node.}
    b. Connect the Replacement node in Task 4a to a Transform Variables node. Use the Transform Variables node to create the following numeric variables. Do not hide or reject the original variables in the output.
      i. Variable Tenure, where Tenure = number of years (to 2 decimal places) from the enrollment date of the loyalty program to 31 December 2018. What are the minimum and maximum values of Tenure? Type your answer in the SAS Code node created in Task 4a. {Hint: Use the following formula to create the required variable: Tenure = Round(('31Dec2018'd - EnrolDate) / 365.25, 0.01). The SAS function Round rounds a numeric value; a rounding unit of 0.01 gives 2 decimal places.}
      ii. Variable JoinAge, where JoinAge = the customer's age in years (to 2 decimal places) when joining the loyalty program. What are the minimum and maximum values of JoinAge? Type your answer in the SAS Code node created in Task 4a.
    c. No customer younger than 16 is allowed to participate in the loyalty program. Therefore, JoinAge values less than 16 are treated as data entry errors and must be set to 16. Connect the Transform Variables node in Task 4b to a Replacement node. Use the Replacement node to replace the values of JoinAge as appropriate. How many records were replaced? Type your answer in the SAS Code node created in Task 4a.
    d. Connect the Replacement node in Task 4c to a Transform Variables node. Use the Transform Variables node to create a numeric variable Age, where Age = Rep_JoinAge + Tenure. What are the minimum and maximum values of Age? Type your answer in the SAS Code node created in Task 4a.
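
The derived customer variables can be cross-checked outside the diagram with an ordinary DATA step. The sketch below is not part of the required EM flow. It assumes that the library Datasets is already assigned, that EnrolDate and DOB are stored as SAS date values, and that JoinAge follows the same pattern as the Tenure hint (the task description does not give a JoinAge formula). The output name work.cust_check is hypothetical.

```sas
/* Cross-check sketch, run in a SAS session outside the EM diagram.       */
data work.cust_check;                 /* hypothetical output data set     */
    set Datasets.Customer;

    /* Tenure: formula taken from the hint in the task description.      */
    Tenure = round(('31Dec2018'd - EnrolDate) / 365.25, 0.01);

    /* JoinAge: ASSUMED analogue of the Tenure formula (not given above). */
    JoinAge = round((EnrolDate - DOB) / 365.25, 0.01);

    /* Raise ages below 16 to 16; missing values are left missing.       */
    if not missing(JoinAge) and JoinAge < 16 then Rep_JoinAge = 16;
    else Rep_JoinAge = JoinAge;

    Age = Rep_JoinAge + Tenure;
run;

/* Minimum and maximum of each derived variable.                          */
proc means data=work.cust_check n min max;
    var Tenure JoinAge Rep_JoinAge Age;
run;
```

Comparing the PROC MEANS output with the statistics reported by the EM nodes is a quick way to spot a mistyped formula.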


  5. Connect the PurchaseRecords Input Data node to a Transform Variables node. Use the Transform Variables node to create two numeric variables, TotalBooks and TotalSpnd, where TotalBooks equals the total number of books purchased in all categories before 20 December 2018, and TotalSpnd equals the total amount spent in all categories before 20 December 2018.
  6. Perform the following activities for the Campaign Input Data node:
    a. Change the Role and Level of CustID in the Campaign Input Data node to Input and Interval respectively. Rerun the node.
    b. Connect the Campaign Input Data node to a Transform Variables node. Use the Transform Variables node to create a numeric variable named Mail, where Mail equals 1 for all records in Campaign.
    c. Connect the Transform Variables node in Task 6b to a Metadata node. Use the Metadata node to change the Role and Level of CustID back to ID and Nominal respectively.
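
The totals in Task 5 and the flag in Task 6 amount to simple assignment statements. The following DATA step sketch shows one way to write them, assuming the library Datasets is assigned; the output names work.purch_check and work.camp_check are hypothetical and exist only for cross-checking outside the diagram.

```sas
/* Task 5 sketch: totals across the six book categories.                  */
data work.purch_check;                /* hypothetical output data set     */
    set Datasets.PurchaseRecords;
    /* SUM() skips missing arguments, whereas chaining the + operator     */
    /* returns missing as soon as any single term is missing.             */
    TotalBooks = sum(CompBooks, ChildBooks, TravelBooks,
                     FictionBooks, DiyBooks, Otherbooks);
    TotalSpnd  = sum(CompSpnd, ChildSpnd, TravelSpnd,
                     FictionSpnd, DiySpnd, OtherSpnd);
run;

/* Task 6b sketch: flag every record in Campaign as mailed.               */
data work.camp_check;                 /* hypothetical output data set     */
    set Datasets.Campaign;
    Mail = 1;
run;
```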


  7. Merge the data set exported from the Transform Variables node in Task 4d, the data set exported from the Transform Variables node in Task 5, and the data set exported from the Metadata node in Task 6c by the value of CustID. {Hint: Set the following properties of the Merge node: set the Merge Role value of CustID in the Variable property to 'By'; set the Merging property to 'Match'; select CustID as the By Ordering variable.}
  8. Connect the Merge node in Task 7 to a Filter node. Use the Filter node to remove records where Mail is not equal to 1. No other records are to be removed. How many records are there in the exported data set of the Filter node? Type your answer in the SAS Code node created in Task 4a.
  9. Connect the Filter node in Task 8 to another Transform Variables node. Use the Transform Variables node to create a numeric variable named Respond such that Respond equals 1 if a customer's SpndPrm value > 0, and 0 otherwise. {Hint: In the SAS Code property of the Transform Variables node, type these two statements: If SpndPrm > 0 then Respond=1; else Respond=0;}
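
The merge, filter, and flag steps correspond to a match-merge in a DATA step. The sketch below is a rough equivalent for cross-checking only. The inputs work.cust, work.purch, and work.camp are hypothetical copies of the three prepared tables; each is assumed to contain CustID with the same type, and work.camp is assumed to carry Mail = 1.

```sas
/* A match-merge needs every input sorted by the BY variable.             */
proc sort data=work.cust;  by CustID; run;
proc sort data=work.purch; by CustID; run;
proc sort data=work.camp;  by CustID; run;

data work.mailed;                     /* hypothetical output data set     */
    merge work.cust work.purch work.camp;
    by CustID;

    /* Subsetting IF: keep only customers who received the voucher.       */
    /* Customers absent from work.camp have Mail missing and are dropped. */
    if Mail = 1;

    /* Statements from the hint in the task description.                  */
    if SpndPrm > 0 then Respond = 1;
    else Respond = 0;
run;
```

Because SAS orders a missing value below every number, a missing SpndPrm fails the SpndPrm > 0 test and yields Respond = 0.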


  10. Connect the Transform Variables node in Task 9 to a Metadata node. Set the New Role, New Level, and New Order of Respond to 'Target', 'Binary', and 'Descending' respectively. Briefly describe the distribution of Respond. Type your answer in the SAS Code node created in Task 4a.
  11. Connect the Metadata node in Task 10 to a Data Partition node. Divide the data set exported from the Metadata node into training, validation, and testing partitions. Use 60% of the data for training, 20% for validation, and 20% for testing. Ensure that the distribution of the variable Respond is similar in each data subset.
  12. Connect the Data Partition node in Task 11 to a Variable Selection node. Set the value of the Use AOV16 Variables property in the R-Square Options section of the node to 'Yes'. Leave the other properties at their current settings. Run the node. Which variables does the node find to be associated with the variable Respond? Type your answer in the SAS Code node created in Task 4a.
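
To describe the distribution of Respond, a one-way frequency table is a reasonable starting point. The sketch below assumes a hypothetical table work.mailed that already contains the Respond variable; it runs outside the EM diagram and only supplements what the EM nodes report.

```sas
/* Counts and percentages of responders (1) and non-responders (0).       */
/* ASSUMPTION: work.mailed is a hypothetical table holding Respond.       */
proc freq data=work.mailed;
    tables Respond;
run;
```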

Steps to submit your project:

  1. Close the project and quit the SAS EM application.
  2. Zip the entire project folder, i.e. the Assignment1 folder.
  3. Submit the zipped file through the Assignment 1 link in the Assignments section of Canvas. As the file is rather large, uploading may take a while depending on the speed of the network, so please be patient. The link will close at 11:55pm on 24 February 2019. Allow yourself sufficient time for the submission process. Late submissions will not be accepted. If you make more than one submission, only the last submitted file will be marked.

 
