Setting sample weights during network training to control the contribution of each sample to the network outcome

What I need to do is train a classification network (like Pattern Recognition Tool) where each sample would have a different weight. The contribution of a sample to the network error would be proportional to its weight.
For example, given samples with higher and lower weights, after training the network would classify the higher-weighted samples more successfully while sacrificing some correct classifications of the lower-weighted samples.
Does anyone know how to do this?
Currently my only idea on how to achieve this goal would be, for each iteration of a loop:
1. Randomly assemble a subset of samples, with the chance of picking a sample proportional to its weight.
2. Train for 1 epoch.
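A minimal MATLAB sketch of that loop, assuming illustrative names (inputs, targets, w, Q are not from the thread) and that randsample from the Statistics Toolbox is available:

```matlab
% Assumed data: inputs is R x Q, targets is c x Q one-hot, w is 1 x Q weights.
net = patternnet(10);
net.divideFcn = 'dividetrain';        % use every drawn sample for training
net.trainParam.epochs = 1;            % one epoch per loop iteration
p = w / sum(w);                       % sampling probabilities
for iter = 1:200
    idx = randsample(Q, Q, true, p);  % draw with replacement, proportional to weight
    net = train(net, inputs(:, idx), targets(:, idx));
end
```

Higher-weighted samples are drawn more often, so they contribute proportionally more to the accumulated training error.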

Accepted Answer

Greg Heath
Greg Heath on 28 Apr 2013
You will have to go through those 5 BioID threads; I can't remember the details.
However, if the ordinary classification scheme is to have columns of eye(c) for targets, then multiplying the target for a single vector by a weight greater than 1 will improve its correct classification performance. In addition, if logsig or softmax is used, the estimated posterior will always be less than 1.
I haven't weighted single vectors, just classes.
Greg.
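As a sketch, the per-vector target weighting suggested above could look like the following (names such as hardIdx are assumptions, not from the thread):

```matlab
targets = full(ind2vec(classind));              % c x Q one-hot target matrix
wt = ones(1, size(targets, 2));
wt(hardIdx) = 3;                                % weight selected samples 3x
weightedTargets = bsxfun(@times, targets, wt);  % scale each target column
net = train(net, inputs, weightedTargets);
```

Scaling a target column above 1 increases the squared-error penalty for that vector, pushing training to fit it more closely.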

More Answers (3)

Greg Heath
Greg Heath on 27 Apr 2013
Notation: The term sample implies a group of data, not a single case or measurement.
Use patternnet with 'logsig' or 'softmax' as the output transfer function
For c classes use a target matrix that has columns of the c-dimensional unit matrix eye(c).
The relationships between the target matrix, the integer (1:c) class-index row vector, the integer assigned-class row vector, the {0,1} error vector, etc. are:
target = ind2vec(classind);        % c x Q one-hot target matrix
classind = vec2ind(target);        % integers 1:c
net = train(net, input, target);
output = net(input);
assigned = vec2ind(output);        % assigned class for each vector
errors = (assigned ~= classind);   % {0,1} error vector
Nerr = sum(errors)
Individual class performances are obtained using unique vector (NOT class) indices (1:N). If class performances are unsatisfactory, several measures can be used. For example:
1. Weight the input matrix
2. Weight the target matrix
3. Weight the output matrix
4. Add noisy duplicates of poorly classified vectors to the input matrix.
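Measure 4 might be sketched like this, assuming badIdx holds the indices of the poorly classified vectors and the 5% noise scale is an arbitrary choice:

```matlab
sigma = 0.05 * std(inputs, 0, 2);    % per-feature noise scale (5% of feature std)
noise = bsxfun(@times, sigma, randn(size(inputs, 1), numel(badIdx)));
inputs2  = [inputs,  inputs(:, badIdx) + noise];   % append noisy copies
targets2 = [targets, targets(:, badIdx)];          % duplicate their targets
net = train(net, inputs2, targets2);
```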
I've forgotten the details. However, in Mar-May 2009 I did post results (in 5 threads) comparing my choice of the duplication method with others for BioID classification.
Search the Newsgroup using the search word BioID.
Hope this helps.
Thank you for formally choosing my answer
Greg
  1 Comment
Ferenc Raksi
Ferenc Raksi on 27 Apr 2013
Edited: John Kelly on 24 Jun 2021
The situation here is slightly different than with BioID. The issue is not one of underrepresented classes; the problem is that every "measurement" has a different and significantly varying cost of classification. In fact, with 2 classes the optimal classification may be such that the correct classifications are below 50%.
Using 'logsig' or 'softmax' as the output transfer function does not help, as each measurement still has the same classification cost.
How would you weight the input matrix? Can you elaborate on 1. 2. and 3.? I hope you're not referring to nnproperty.net_inputWeights as that does not discriminate between measurements.
4. is what I am trying to avoid :)
Thanks, Ferenc



Greg Heath
Greg Heath on 29 Apr 2013
I don't really believe in twiddling the net to accommodate a few isolated inputs that cannot be classified correctly.
BEFORE considering a neural network
1. Plot the data
2. Check the data for
a. Errors
b. Outliers
3. a. Correct and/or remove errors
b. Modify and/or remove outliers
c. Weight inputs and/or add noisy duplicates to equalize training priors
4. Design and test a Linear (SLASH) Model
5. Design and test NN Models
6. Apply class output weighting to optimize risk based on operational priors and misclassification costs
  2 Comments
Ferenc Raksi
Ferenc Raksi on 30 Apr 2013
We're not talking about a few isolated inputs, and I am not twiddling outcomes; I am setting up a system to correctly solve for what I am looking for. The reason for the weights is that they mathematically give the most accurate solution to what I am trying to solve for.
I use a classification-type neural network primarily because of the binary nature of the decisions made. Solving directly for the decision made, and not for some irrelevant intermediate result, will yield more accurate results with this type of data. In fact it takes the twiddling out and makes the process of finding a correct solution much more systematic. The weights assigned correspond to the reward if the right decision is made.
I have implemented the single epoch training with new input data built every loop based on the weights. Performing it this way may even help shake the training out of the local minima, not to mention how easy it is to add noise every epoch.
Preliminary results are good, and computation time stayed fairly good.
preksha pareek
preksha pareek on 27 Oct 2019
Considering my neural network weights as the mean of each feature, how can I change the size of my weight matrix?
For example: I have 60 features, and I compute the mean over the samples for each of these 60 values, so the matrix is 1*60, which I want to use as the weight matrix for initialization.
However, for a neural network the weight matrix will have a size corresponding to (input*hidden layer), so how can I reshape my matrix to fit as the weight matrix?



Greg Heath
Greg Heath on 1 May 2013
I have found (not only with BioID) that the best way to approach the problem is to weight and/or duplicate so that training priors are balanced. Then you can apply the Bayesian Risk Formula to satisfy any combination of misclassification costs and operational priors.
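One hedged sketch of that second step: with balanced training priors, the network's softmax outputs can be reweighted by the operational priors and a cost matrix before assigning classes (the priors and costs below are purely illustrative):

```matlab
post = net(inputs);                       % c x Q estimated posteriors (softmax output)
opPriors = [0.9; 0.1];                    % assumed operational class priors
cost = [0 5; 1 0];                        % cost(i,j): deciding class i when truth is j
adjPost = bsxfun(@times, post, opPriors); % correct the balanced-training posteriors
risk = cost * adjPost;                    % expected risk of each possible decision
[~, assigned] = min(risk, [], 1);         % minimum-risk class assignment
```

The multiplication by opPriors works because training on balanced priors makes the estimated posteriors proportional to the class likelihoods, up to a common normalization that does not affect the argmin.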
Hope this helps.
Greg
