
EE 660

Homework 4 (Week 12)


1. In this problem, you will test the subspace alignment (SA) algorithm against supervised learning (SL) methods on 2D synthetic data.

Data in the source domain Xs is a mixture of Gaussians:

Ps(x|y = 1) is Gaussian with mean μ+ and covariance Σ+; Ps(x|y = −1) is Gaussian with mean μ− and covariance Σ− (the specific means and covariances are given in the assignment material);

Ps(y = 1) = Ps(y = −1) = 0.5.

Data in the target domain XT follows the same distribution, except that the axes are rotated from the source domain by θ. An example of the source and target data with θ = 30° is visualized below.

[Figure: example of source data Ds and target data DT with θ = 30°.]

The sampled source domain data Ds has Ns = 1000 points, and the sampled target domain data DT has NT = 1000 points. The testing data DTest in the target domain has NTest = 2000 points. Throughout this problem, we keep all feature dimensions in SA.

Many of the functions needed have been provided to you in the assignment material. You will implement the TO-DO parts in the code and answer the questions below. Note that you will always call the experiment function to get results for a certain setting, but feel free to add your own sections for plotting and so on. Do not remove or change the random seed setting in that function, so that your answers align with our solution.

You may want to read through the whole problem before you read the code. Try to get a good understanding of the provided code before implementing your own parts.

(a) Use the provided generate_data function to draw the datasets for θ from 0° to 180° with a step size of 30°. For each θ, visualize Ds and DT on the same plot. You may use the same or similar notation as in the picture shown above. Include the plots in your answer.

Note: visualize the data before doing standardization on each feature dimension, i.e., in the original feature domain.
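As a reference for part (a), here is a minimal sketch of the θ sweep. It uses an illustrative stand-in for generate_data; the real signature, Gaussian parameters, and random-seed handling are those of the assignment material, not what is shown here.

import numpy as np
import matplotlib.pyplot as plt

def rotate(X, theta_deg):
    """Rotate 2D points counterclockwise by theta_deg degrees."""
    t = np.deg2rad(theta_deg)
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    return X @ R.T

def generate_data(n, theta_deg, rng):
    """Illustrative stand-in for the provided generate_data: draws an
    equiprobable 2-class Gaussian mixture, then rotates the axes by theta.
    The means and covariances below are placeholders; use the assignment's."""
    mu_pos, mu_neg = np.array([1.0, 1.0]), np.array([-1.0, -1.0])  # placeholders
    cov = np.eye(2)                                                # placeholder
    y = rng.choice([1, -1], size=n)
    pos = rng.multivariate_normal(mu_pos, cov, n)
    neg = rng.multivariate_normal(mu_neg, cov, n)
    X = np.where((y == 1)[:, None], pos, neg)  # pick the component per label
    return rotate(X, theta_deg), y

rng = np.random.default_rng(0)  # demo seed only; the assignment's seed setting governs
for theta in range(0, 181, 30):
    Xs, ys = generate_data(1000, 0, rng)      # source domain (no rotation)
    Xt, yt = generate_data(1000, theta, rng)  # target domain, rotated by theta
    plt.figure()
    plt.scatter(Xs[:, 0], Xs[:, 1], s=5, label="source Ds")
    plt.scatter(Xt[:, 0], Xt[:, 1], s=5, label="target DT")
    plt.title(f"theta = {theta} deg (original feature domain)")
    plt.legend()
plt.show()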

(b) The default steps of SA are as follows. [The step-by-step listing of the SA algorithm is given in the assignment material.]
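For reference, below is a minimal sketch of the standard subspace alignment procedure (Fernando et al., 2013), keeping all feature dimensions as this problem specifies. The function names and the use of sklearn's PCA and StandardScaler are illustrative assumptions, not the assignment's API.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def sa_fit(Xs, ys, Xt, clf, d=2):
    """Sketch of SA training: standardize each domain, take PCA bases,
    align the source basis to the target basis with M = Vs^T Vt, project
    the source data into the aligned subspace, and train the base classifier."""
    sc_s, sc_t = StandardScaler().fit(Xs), StandardScaler().fit(Xt)
    Xs_std, Xt_std = sc_s.transform(Xs), sc_t.transform(Xt)
    Vs = PCA(n_components=d).fit(Xs_std).components_.T  # (n_features, d)
    Vt = PCA(n_components=d).fit(Xt_std).components_.T
    M = Vs.T @ Vt                                       # alignment matrix
    clf.fit(Xs_std @ Vs @ M, ys)                        # source, aligned to target
    return clf, sc_t, Vt

def sa_predict(clf, sc_t, Vt, X_target):
    """Sketch of SA testing: target data stays in its own PCA subspace."""
    return clf.predict(sc_t.transform(X_target) @ Vt)

Here Q can be, for example, sklearn's LinearDiscriminantAnalysis passed in as clf; the SL baseline simply fits the same base classifier on the standardized source data and predicts on standardized test data.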

(i) Use an LDA classifier as Q. For θ from 0° to 180° with a step size of 30° (the datasets you drew in (a)), do SA training (as above) and also supervised training (in the original source domain, using standardized data) with the same base classifier. Test your trained SA and SL classifiers on DTest. Draw a plot of testing accuracy vs. θ for SA and SL on the same figure.

(ii) Repeat (i) for Q = linear SVM.

(iii) Repeat (i) for Q = kNN with k = 31.

(iv) Comment on your results. Which performs better, SA or SL?


Tip for (i)-(iv): your answers might not be what you expected; that doesn’t mean your results are wrong.

Hint: Search for TO-DO in the provided code and fill in those parts. Feel free to use the provided plotting function or create your own. Note that the SL algorithm should also operate on standardized data.

(c) You may have found that, when SA gives a testing accuracy lower than 0.5, the classifier has learned something but is predicting the labels backwards. We can improve the performance by considering the PCA step of the SA algorithm, as applied to the target domain. This PCA step results in a new coordinate system XT, defined by eigenvectors of the target data. These eigenvectors have a sign ambiguity: if vi is an eigenvector, then −vi is also an eigenvector. Which sign is used for a basis vector can depend on the implementation of PCA, and can affect the resulting classification accuracy.

We can remove this ambiguity if we have a small amount of labeled target domain data DTL. To determine whether to flip the sign of each axis of XT, the algorithm can check the resulting accuracy on the labeled target data. Use this "sign flip" feature with the algorithm for the parts below.

The number of labeled target domain data points is NTL = 20.
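One plausible implementation of the sign flip, following the sketch in (b): with the source-trained classifier fixed, try flipping each column of the target basis Vt and keep a flip only if it improves accuracy on the labeled target set. The helper names are illustrative.

import numpy as np

def sign_flip(clf, sc_t, Vt, X_tl, y_tl):
    """Greedy per-axis sign selection for the target PCA basis: flip the
    sign of axis i if doing so raises accuracy on the labeled target data."""
    Vt = Vt.copy()
    X_std = sc_t.transform(X_tl)
    for i in range(Vt.shape[1]):
        acc_before = (clf.predict(X_std @ Vt) == y_tl).mean()
        Vt[:, i] *= -1                    # tentatively flip axis i
        if (clf.predict(X_std @ Vt) == y_tl).mean() < acc_before:
            Vt[:, i] *= -1                # revert: the flip made things worse
    return Vt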

(i) Use SA with the sign flip feature. Train and test the SA method on different θ.

(ii) For a fair comparison, the training data of your SL counterpart should also contain DTL. Train and test the SL classifier on different θ. (All in the original feature space, using standardized data.)

(iii) Draw plots of testing accuracy vs. θ for SA and SL on the same figure. Try the three Qs as in (b).

(iv) Which performs better, SA or SL? Does sign flipping and/or adding labeled target data help improve the performance of SL or SA? Why?

Note: For SA, use DTL only for sign flipping; the base classifier is still trained on source domain data only. For SL, add DTL to the training set. What if we also add DTL to the training set for Q in SA? Try it yourself; there is no need to report the results here.

(d) Use the algorithm of (c) and try NTL in {0, 6, 20, 50, 100} for Q = LDA.

(i) Draw the plot of SA testing accuracy vs. θ for different NTL on the same plot.

(ii) Draw the plot of SL testing accuracy vs. θ for different NTL on the same plot.

(iii) How does NTL affect the final performance of SA and SL? Why?

(e) Suppose θ = 30°, NTL = 20, Q = LDA. Now we would like to see whether standardization makes a difference. For sign flipping, use the algorithm of (c); without sign flipping, use the algorithm of (b); in both cases, implement with and without standardization.

Report the testing accuracies of SA and SL, and put them in the table below. How does standardization affect the accuracy? Why?

Sign flipping | Standardization | SA | SL
✓ | ✓ |  |
✓ | × |  |
× | ✓ |  |
× | × |  |

2. This problem concerns the generalization error bound in a transfer learning problem, as given in Lecture 14, Eq. (6).

In this problem you will study the effects of varying Ns, NT, and α on the cross-domain generalization error bound.

Throughout this problem, let εαβ be everything in the cross-domain generalization-error bound (the RHS of Lecture 14 Eq. (6)) except e*S,T, which is omitted. Note that e*S,T does not depend on the parameters we will be varying.

Also throughout this problem, use the values dvc = 10, δ = 0.1, dHΔH = 0.1. However, leave them as variables until you are ready to plot, or until you are asked for a number.

(a) Give the simplified numerical value (to two decimal places) of εαβ for the following cases:

(i) NT = 1, Ns = 100, α = 0.1, 0.5, 0.9

(ii) NT = 10, Ns = 1000, α = 0.1, 0.5, 0.9

(iii) NT = 100, Ns = 10000, α = 0.1, 0.5, 0.9

(iv) NT = 1000, Ns = 100000, α = 0.1, 0.5, 0.9

Tip: put these in a table for easy viewing.

(v) Do any of these sets of numbers assure some degree of generalization (i.e., εαβ < 0.5, assuming e*S,T ≈ 0)? If so, which?

Comment: As in the supervised learning case, these bounds can be very loose, but evidence indicates that the functional dependence of εαβ on its variables still generally applies.
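The exact RHS of Lecture 14 Eq. (6) is not reproduced in this handout, so the helper below uses the α-weighted error bound of Ben-David et al. (2010) as an assumed stand-in: a VC complexity term weighted by √(α²/β + (1−α)²/(1−β)) plus a (1−α)·dHΔH term, with β = NT/(Ns+NT). The constants are approximate; substitute your lecture's exact expression before reporting numbers.

import numpy as np

def eps_alpha_beta(alpha, n_s, n_t, dvc=10, delta=0.1, d_hdh=0.1):
    """ASSUMED stand-in for the RHS of Lecture 14 Eq. (6), omitting e*_{S,T}.
    Shaped like the alpha-weighted bound of Ben-David et al. (2010); the
    exact constants belong to Eq. (6) and should be substituted in."""
    n = n_s + n_t
    beta = n_t / n  # fraction of training data drawn from the target domain
    weight = np.sqrt(alpha**2 / beta + (1 - alpha)**2 / (1 - beta))
    vc = np.sqrt((2 * dvc * np.log(2 * (n + 1)) + 2 * np.log(8 / delta)) / n)
    return 2 * weight * vc + (1 - alpha) * d_hdh

# Table for part (a): one row per (NT, Ns) pair, one column per alpha.
for n_t, n_s in [(1, 100), (10, 1000), (100, 10000), (1000, 100000)]:
    vals = [eps_alpha_beta(a, n_s, n_t) for a in (0.1, 0.5, 0.9)]
    print(f"NT={n_t:5d}, Ns={n_s:6d}:", ["%.2f" % v for v in vals])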

(b) For this part, let Ns = 1000 and plot εαβ vs. α for NT = 10, 100, 1000, 10000 (4 curves on one plot), over 0 ≤ α ≤ 1. Answer: what approximate value of α is optimal for each value of NT? Try to explain the dependence of εαβ on α for different values of NT, and any difference in the optimal values of α.

(c) For this part, let NT = 100 and plot εαβ vs. α for Ns = 10, 100, 1000, 10000 (4 curves on one plot), over 0 ≤ α ≤ 1. Answer: what approximate value of α is optimal for each value of Ns? Try to explain the dependence of εαβ on α for different values of Ns, and any difference in the optimal values of α.
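Parts (b) and (c) can then be produced with a short sweep over α, reusing the (assumed) helper sketched after part (a):

import numpy as np
import matplotlib.pyplot as plt

alphas = np.linspace(0.0, 1.0, 201)

# Part (b): Ns fixed at 1000, one curve per NT.
plt.figure()
for n_t in (10, 100, 1000, 10000):
    plt.plot(alphas, [eps_alpha_beta(a, 1000, n_t) for a in alphas],
             label=f"NT = {n_t}")
plt.xlabel("alpha")
plt.ylabel("eps_alpha_beta")
plt.title("Bound vs. alpha (Ns = 1000)")
plt.legend()
plt.show()

# Part (c) is the same sweep with NT fixed at 100 and Ns in (10, 100, 1000, 10000).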

(d) Common default values for α are α = 0.5 and α = β.

(i) In terms of minimizing the cross-domain generalization-error bound, which default choice looks better (based on your answers to (b) and (c) above)? Is that choice reasonably consistent with your results of (b) and (c)?

(ii) Give algebraic expressions for εαβ(α = 0.5) and εαβ(α = β). Compare them algebraically: can you draw any conclusions about which is lower?

(iii) Plot εαβ(α = 0.5) vs. N for β = 0.01, 0.1, 0.5, for 1000 ≤ N ≤ 100000 (3 curves on one plot). Repeat for εαβ(α = β). What conclusions can you draw from the plots?
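A sketch for (iii), again reusing the assumed helper from part (a), taking N as the total sample size with NT = βN and Ns = (1 − β)N:

import numpy as np
import matplotlib.pyplot as plt

Ns_total = np.logspace(3, 5, 100)  # 1000 <= N <= 100000
plt.figure()
for beta in (0.01, 0.1, 0.5):
    ys = [eps_alpha_beta(0.5, (1 - beta) * N, beta * N) for N in Ns_total]
    plt.plot(Ns_total, ys, label=f"beta = {beta}")
plt.xscale("log")
plt.xlabel("N (total)")
plt.ylabel("eps_alpha_beta (alpha = 0.5)")
plt.legend()
plt.show()
# For the alpha = beta variant, pass beta as the first argument instead of 0.5.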

(e) In this problem, do the generalization-error bounds you have calculated typically relate to training set error or test set error? Briefly justify your answer.

 
