What should be the correct format of the feature vector matrix for feeding into neural networks?

I am using MATLAB R2012b. I am working on speech emotion classification. I have used MFCC for feature extraction and the Neural Network Toolbox for classification, but I am getting very high error rates (training error 23%, validation error 60%, testing error 80%). I tried various combinations of input matrix and target matrix, but none helped. A portion of my code for generating the feature vector matrix is here:
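% One column per file. Named "features" rather than "mfcc" so that it
% does not shadow the mfcc() function called inside the loop.
features = zeros(6000*13, size(filesToRead,1));
for j = 1:size(filesToRead,1)
    % Read speech samples, sampling rate and precision from file
    [ speech, fs, nbits ] = wavread( filesToRead{j} );
    % Feature extraction (feature vectors as columns)
    [ MFCCs, FBEs, frames ] = mfcc( speech, fs, Tw, Ts, alpha, hamming, R, M, C, L );
    % Stack the 13 coefficient rows end to end into column j
    for i = 1:13
        features((i-1)*size(MFCCs,2)+1:i*size(MFCCs,2), j) = MFCCs(i,:);
    end
    clearvars MFCCs
end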
I have a total of 160 speech samples in eight different classes (20 samples each). Extracting MFCCs gives me a 13x5000 matrix for one sample. I want to feed these features for all 160 samples into a neural network and then classify them into 8 classes. Please tell me, step by step:
(1) In which format should I store the feature vector matrix?
(2) How should I arrange the extracted feature vectors (in rows or columns)?
(3) Do I need to create one single matrix for the features of all 160 samples?
(4) How do I feed this matrix to the NN, and how many input neurons should I have?
(5) Which divide parameter should be used for dividing my data set into training, validation and testing sets? (I used dividerand with a 70-15-15 split and also tried 60-20-20 and 70-20-10.)
(6) What should my hidden layer function be (sigmoid, linear, etc.)?
(7) What should my target matrix be?
  1 Comment
sai susmitha on 29 Dec 2016
What does 160 samples mean, 160 audio files? Did you get 6000 frames from these 160 files? And did you finally get the answer?

Accepted Answer

Greg Heath on 17 Feb 2014
For classification into c classes, use N pairs of I-dimensional column vector inputs and O-dimensional column vector targets, with O = c. The targets should be c-dimensional unit column vectors, i.e. columns of the unit matrix eye(c).
For example:
[ x, t ] = simpleclass_dataset;
[ I, N ] = size(x)   % I = input dimension, N = number of samples
[ O, N ] = size(t)   % O = number of classes
whos
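As a rough sketch of the same layout for the question's setup (8 classes, 20 samples each), the target matrix can be built by indexing into eye(c). The feature length of 26, the random placeholder inputs, and the class-by-class ordering of the samples are assumptions made for illustration, not details from the thread.

```matlab
% Sizes taken from the question: 8 classes, 20 samples per class.
c = 8;                  % number of classes
nPerClass = 20;         % samples per class
N = c * nPerClass;      % 160 samples in total
I = 26;                 % ASSUMED feature length per sample (illustrative only)

% Placeholder inputs: one I-dimensional column per sample.
x = rand(I, N);

% Class index of every sample, ASSUMING samples are ordered class by class
% (samples 1..20 belong to class 1, 21..40 to class 2, and so on).
classIdx = reshape(repmat(1:c, nPerClass, 1), 1, N);

% Targets: column k of t is column classIdx(k) of eye(c).
unitMatrix = eye(c);
t = unitMatrix(:, classIdx);

fprintf('x is %d x %d, t is %d x %d\n', size(x, 1), size(x, 2), size(t, 1), size(t, 2));
```

With real data, x would hold the extracted features and classIdx would come from the file labels. The point is only that x and t have the same number of columns and that each column of t contains a single 1.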
Use patternnet and accept all defaults.
help patternnet
doc patternnet
If results are satisfactory, try reducing the number of hidden nodes to increase robustness with respect to unseen data.
If results are unsatisfactory, run 9 more times to vary the random initial weights.
If results are still unsatisfactory, increase the number of hidden nodes and obtain 10 more designs with different initial weights.
Repeat until the best of 10 results stabilizes.
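The repeated-trials procedure above might be sketched as follows for one value of H. This is a minimal illustration on the demo data set and assumes the Neural Network Toolbox is installed. The variable names (H, Ntrials, valErr, testErr) and the choice to pick the best design by validation error are assumptions, not something the answer spells out.

```matlab
[x, t] = simpleclass_dataset;        % demo data shipped with the toolbox
[~, trueClass] = max(t, [], 1);      % class index of every sample

H = 10;           % number of hidden nodes (the patternnet default)
Ntrials = 10;     % number of designs with different random initial weights
valErr  = zeros(1, Ntrials);
testErr = zeros(1, Ntrials);

for k = 1:Ntrials
    net = patternnet(H);                  % fresh net, all other settings default
    net.trainParam.showWindow = false;    % suppress the training GUI
    [net, tr] = train(net, x, t);         % new random weights and data split
    [~, predClass] = max(net(x), [], 1);  % predicted class of every sample
    wrong = (predClass ~= trueClass);
    valErr(k)  = mean(wrong(tr.valInd));  % error rate on the validation subset
    testErr(k) = mean(wrong(tr.testInd)); % error rate on the test subset
end

[bestVal, best] = min(valErr);
fprintf('best of %d: trial %d, val error %.3f, test error %.3f\n', ...
    Ntrials, best, bestVal, testErr(best));
```

To follow the rest of the procedure, the loop would be rerun with a smaller or larger H and the best-of-10 results compared across runs.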
For examples search
greg patternnet Ntrials
Hope this helps
Thank you for formally accepting my answer
Greg
  4 Comments
Greg Heath on 19 Feb 2014
simpleclass_dataset (NOT CAPITALIZED) is a MATLAB data set used for demos and debugging code. To find all such data sets, enter
help nndatasets
[ x, t ] doesn't apply to you because you only have x stored.
(1) In which format should I store the feature vector matrix?
Store them any way you want. Just convert to format long for the program.
(2) How should I arrange the extracted feature vectors (in rows or columns)?
See my answer.
(3) Do I need to create one single matrix for the features of all 160 samples?
Yep.
(4) How do I feed this matrix to the NN, and how many input neurons should I have?
help patternnet
See my answer.
(5) Which divide parameter should be used for dividing my data set into training, validation and testing sets? (I used dividerand with a 70-15-15 split and also tried 60-20-20 and 70-20-10.)
OK, but reread my answer.
(6) What should my hidden layer function be (sigmoid, linear, etc.)?
Use the defaults.
(7) What should my target matrix be?
See above.
Your original use of abbreviations is not appreciated in this forum.
Neither is asking questions that have already been answered.
kunwar on 23 Feb 2014
Thanks so much for your kind support and help. I will take care in future not to violate the decorum of this forum. Regards

More Answers (2)

primrose khaleed on 22 Apr 2014
How can I use SIFT to extract features for my project? My project is an under-vehicle scanner that looks for foreign objects under a vehicle. Please help me. How do I know how many hidden layers to use? How do I feed the extracted features to the neural network? Please help me.
  4 Comments
primrose khaleed on 8 May 2014
Hi, I have a dataset that consists of 20 images of two different cars. I want to feed these images into a neural network. How can I do that? I resized the images to 200x200. How do I start? How do I create the input and target matrices? Please help me.
sai susmitha on 3 Jan 2017
I wanted to ask: when we extract MFCCs for a wave (audio) file, we get a matrix of size [no. of frames x 13] if each frame has 13 MFCCs. To train a classifier on these (here, an ANN), what should the input to the BPNN classifier be? Should I convert the matrix I got after extraction into 1 x 13, so that each wav file finally has only this one representation (1 x 13)? For training, the target vector would then be [1 0 0] if I am classifying into 3 classes: for each wav file I put 1 in the desired class and 0 in the others. If the input is not 1 x 13 but [no. of frames x 13], what is the target vector format?


Asif Khan on 12 Jul 2014
Hi all,
I am working on speech recognition using MFCC and get an output of 12 columns (coefficients) and multiple rows (frames). I have 80 speech samples of isolated words, and the number of rows (frames) differs from sample to sample. How can I deal with the different numbers of rows?
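This question is not answered in the thread. One possible way to get inputs of equal length, offered only as a sketch, is to summarize each file's frames with per-coefficient statistics so that every file becomes one fixed-length column. The cell array mfccPerFile, its random contents, and the choice of mean and standard deviation are all assumptions made for illustration.

```matlab
% HYPOTHETICAL stand-ins for per-file MFCC matrices (frames x 12 coefficients),
% each with a different number of frames.
mfccPerFile = { rand(57, 12), rand(81, 12), rand(40, 12) };

nFiles = numel(mfccPerFile);
nCoef  = size(mfccPerFile{1}, 2);
x = zeros(2 * nCoef, nFiles);      % one fixed-length column per file

for k = 1:nFiles
    F = mfccPerFile{k};                         % frames x coefficients
    % Mean and standard deviation of each coefficient over all frames.
    x(:, k) = [ mean(F, 1), std(F, 0, 1) ].';
end

fprintf('input matrix is %d x %d\n', size(x, 1), size(x, 2));
```

The resulting x has one column per file regardless of how many frames each file had, which is the layout the accepted answer describes. Whether such a summary keeps enough information for a given task would have to be checked on the actual data.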
