留学生计算机作业代写 – Basic Chatbot代写 – 建模代写 – COMP5046
留学生计算机作业代写

留学生计算机作业代写 – Basic Chatbot代写 – 建模代写 – COMP5046

COMP5046 Assignment 1 [20 marks]

Basic Chatbot

 

 

留学生计算机作业代写 Chatbots (Chat-oriented Conversational Agent) are designed to handle full conversations, mimicking the unstructured flow of a human ···

 

Chatbots (Chat-oriented Conversational Agent) are designed to handle full conversations, mimicking the unstructured flow of a human to human conversation. In this assignment, you are to implement a basic (chit-chat) chatbot using Sequence to Sequence (seq2seq) model and Word Embeddings. The detailed information for each implementation step was specified in the following sections. Note that lab exercises would be a good starting point for the assignment. The useful lab exercises are specified in each section.

1. Data Preprocessing for Chatbot [3 Marks]  留学生计算机作业代写

In this assignment, you are asked to use Microsoft BotBuilder Personality Chat Datasets, which includes three different personality chat (professional, friend, comics) datasets. With these datasets, you are to build our social chatbot and implement changing personality function – refer to the Section 3. In this Data Preprocessing section, you are required to implement the following functions:

 

Figure1. Microsoft BotBuilder Personality Chat Datasets

  • Preprocess data: You are asked to pre-process the chatbot training data byintegrating several text pre-processing techniques (tokenisation, removing numbers, converting to lowercase, removing stop words, stemming, etc.) – Please refer to Lab 5. You should justify the reason why you apply the specific preprocessing  [Justify your decision]

 

2. Model Implementation [7 Marks]  留学生计算机作业代写

In the ‘Model Implementation’ section, you are to implement two models, word embedding model and Sequence model. While the model is being trained, you are required to display the Training Loss and the Number of Epochs. You are free to choose hyperparameters (size of vector for embeddings, learning rate, etc.)

1) Word Embeddings [3 marks]

First, you are asked to build the word embeddings model for your chatbot as you will use word embedding (vector represents of words – such as word2vec-CBOW, word2vec-Skip gram, fastText) as input for seq2seq model. Note that we used one-hot vector as input for seq2seq model in the Lab4. In order to build the word embedding model, you are required to implement the following functions:

  • Download Data for Word Embeddings: This can be different from the datayou used in the section 1 (‘Data Preprocessing for Chatbot’). You can download and use any datasets to train the word embedding model: i.e. Microsoft BotBuilder chat datasetsCornell Movie Dialog Corpusetc. [Justify your decision]
留学生计算机作业代写
留学生计算机作业代写
  • Preprocess data for word embeddings: You are to preprocess data forword embeddings – refer to lab2 and lab3. This can be different from the preprocessing technique that you used in the section 1 (‘Data Preprocessing for Chatbot’) [Justify your decision]
  • Build training model for word embeddings: You are to build the trainingmodel for word  You are required to describe how hyperparameters (size of vector for embeddings, learning rate, etc.) Note that any word embeddings model (e.g. word2vec-CBOW, word2vec-Skip gram, fastText) can be applied.  refer to Lab02 (Gensim) Lab03 (Tensorflow) [Justify your decision]
  • Train model: You can use gensim package or implement with While the model is being trained, you are required to display the Training Loss and the Number of Epochs.
  • Save model: You are to save the trained word embedding model to yourGoogle Drive refer to lab5. Note that your assignment 1 will not be marked if you modify the model after the 
  • Load model:You are to implement a function to load the model, saved in your Google Drive.

 

2) Sequence Modelling [4 marks]  留学生计算机作业代写

Secondly, you are asked to build the Many-to-One (N to 1) Sequence model in order to train a chatbot. You must train three different individual sequence models (using three different personality datasets – refer to section 1) and those models should be individually applied. Note that your chatbot users should be able to change the personality (trained model) of chatbot – refer to section 3 ‘Evaluation’. You are required to implement the following functions:

  • Apply/Import Word Embedding model: You are to apply the trained wordembedding model to the sequence model
  • Build training sequence model: You are to build the Many-to-One (N to 1)sequence model –refer to lab4. You are required to describe how hyperparameters (the Number of Epochs, learning rate, etc.) were decided. [Justify your decision]

 

  • Trainmodel: While the model is being trained, you are required to display the

Training Loss and the Number of Epochs.

  • Save model: You are to save the trained sequence model to your GoogleDrive refer to lab5. Note that your assignment 1 will not be marked if you modify the model after the 
  • Load model: You are to implement a function to load the model, saved inyour Google Drive.

 

3. Evaluation (Running chatbot) [4 Marks]  留学生计算机作业代写

After completing all model training – refer to the section 1 and 2, you should apply the model to run the chatbot. Users should be able to chat with the trained chatbot. All chat log (chat history) needs to be printed on the console and saved in the ‘chat_log.txt’ file on your Google Drive. The following functions need to be implemented:

  • Start chat: You are to implement the function to execute the chat. Thisfunction should include all required functions to load the trained chatbot (download and load sequence model that you built in section 2).
  • Change personality: Users should be able to change chatbot personalityduring the conversation (runtime). The default personality should be “Professional”. The commands for personality change should be defined by you and implemented by using handcrafted rules (i.e. regular expression or exact string matching). The commands need to be written in the description
  • Figure2. Chat log format  exampleSave chat log: All chat log (chat history) needs to be printed on the consoleand saved in the ‘chat_log.txt’ file on your Google Drive. The final chatbotrequires to test more than more than 50 conversations. The chatlog file should be submitted. The following figure shows how the chat log has to be formatted.

  • End chatting: You should define the ‘end conversation’ command, and thecommand should be implemented by using handcrafted rules. The commands need to be written in the description

 

4. Documentation [4 Marks]  留学生计算机作业代写

In the section 1,2, and 3, you are required to describe justify any decisions you made for the final implementation. You can find the tag ‘[Justify your decision]’ for the point that you should justify the purpose of applying the specific technique/model.

For example, for section 1 (preprocess data), you need to describe which pre-processing techniques (removing numbers, converting to lowercase, removing stop words, stemming, etc.) were conducted and justify your decision (the purpose of choosing a specific pre-processing techniques, and benefit of using that technique or the integration of techniques for your chatbot) in your ipynb file

Figure3. The position of writing justification in ipynb file

  

 5. Programming (coding) styles [2 Marks]  留学生计算机作业代写

Your program needs to be easily readable and well commented. The followings are expected to be satisfied:

  • Readability:Easy to read and maintain
  • Consistency& Naming: Names are consistent in style
  • CodingComments: Comments clarify meaning where needed
  • Robustness:Handles erroneous or unexpected input

 

Assignment 1 Submission Method  留学生计算机作业代写

Submit a ipynb file that contains all above (1,2,3,4 and 5) contents.The ipynb template can be found in the Assignment 1 template. You also need to submit chatlog (txt file), word embedding model (zip file), and sequence model (zip file).

Due date: 5:00PM Monday 29 April 2019 Submission: Canvas Assignment 1 Submission Box Submission Files:

  • ipynbfile  (file name: your_unikeyipynb)
  • txtfile  chatlog (file name: your_unikeytxt)
  • zipfile  word embedding model (file name: your_unikeyzip)
  • zipfile  sequence model (file name: your_unikeyzip)

 

更多代写:美国CS代写 生物学代考 ANTH代写 2000字essay代写  paper代写 留学生数学代写

合作平台:天才代写 幽灵代  写手招聘  paper代写

发表回复