How does one create an array of strings in a loop? In a better way.

So I have an array of characters, separated by whitespace, and obviously MATLAB treats whitespace as a character too. I wanted to separate the words into strings and put them into one array, so that indexing it would return an entire word. This was my code to do so.
File1 = fopen('History2.txt');
words = fscanf(File1, '%c');
num_words = 1;
i = 1;
new_words = char(100,1);
while(1)
    if words(i) == ' '
        num_words = num_words + 1;
    else
        new_words(num_words, i) = words(i);
    end
    if words(i+1) == '9'
        num_words = num_words - 2;
        break;
    end
    i = i + 1;
end
Now, this does the job, in that I can refer to a word with new_words(1, :). But this is how it displays:
OUTPUT: Hello, history tells us that history ... ... and so on.
Moreover, when I want to compare two of the strings using strfind, like this:
if ~(isempty(strfind(new_words(2,:), new_words(6,:))))
    disp('yoyo');
else
    disp('nono');
end
It always displays 'nono'. What is a better way to accomplish this task so that I can perform the comparison between the strings to find the unique words in the paragraph?
Maybe using cell arrays? But how would I do that, given the words?
I even tried using strsplit, but this wouldn't work on the array words.
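One plausible reading of the 'nono' result: the loop writes each character at column i, the position in the whole text, so every row of new_words keeps its word at its original offset and is padded with char(0) everywhere else. Two rows holding the same word then differ in their padding, and strfind cannot match them. The sketch below reproduces that effect on a short sample string (the sample text and the names first, fifth, first_clean and fifth_clean are assumptions for illustration, not taken from History2.txt).

```matlab
% Hypothetical sample text standing in for the contents of History2.txt.
words = 'history tells us that history';

% Preallocate a char(0)-padded matrix: one row per word, one column per character.
new_words = char(zeros(5, numel(words)));

row = 1;
for i = 1:numel(words)
    if words(i) == ' '
        row = row + 1;                    % a space starts the next word
    else
        new_words(row, i) = words(i);     % same indexing as in the question
    end
end

first = new_words(1, :);                  % 'history' at columns 1 to 7
fifth = new_words(5, :);                  % 'history' at the last 7 columns

% The padded rows do not match, although both hold the same word.
disp(isempty(strfind(first, fifth)))

% Dropping the char(0) padding recovers the plain words, which do match.
first_clean = first(first ~= 0);
fifth_clean = fifth(fifth ~= 0);
disp(strcmp(first_clean, fifth_clean))
```

If this reading is right, keeping a separate column counter per word, or storing each word in a cell array, would avoid the padding altogether.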
  9 Comments
Marc on 19 Oct 2013
Cedric
If you type edit strsplit in 2013a/b, check lines 80 and 83. It looks like this:
if ~isString(str)
    error(message('MATLAB:strsplit:InvalidStringType'));
end
if isString(aDelim)
    aDelim = {aDelim};
elseif ~isCellString(aDelim)
    error(message('MATLAB:strsplit:InvalidDelimiterType'));
end
This is straight from the MATLAB toolbox (Matlab/toolbox/Matlab/strfun), so I am not sure what you are talking about with respect to isString. Although I agree that if I type edit isString, I get an error that it is not on any path. So you've got me as to why this function works. I chalk it up to MATLAB magic.
Lines 119 and 120 call regexp, so I am not sure why you think strsplit would be faster, since it calls regexp. This line was also how I ended up going the regexp route.
% Split.
[c, matches] = regexp(str, aDelim, 'split', 'match');
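As a rough check of that relationship, the sketch below splits the same text with strsplit and with regexp's 'split' option and compares the results. The sample string and the names c1 and c2 are assumptions for illustration.

```matlab
% Hypothetical sample text; single spaces between words.
str = 'history tells us that history';

c1 = strsplit(str, ' ');            % cell array of words via strsplit
c2 = regexp(str, ' ', 'split');     % the same split done directly with regexp

disp(isequal(c1, c2))               % both should hold the same five words
```

The two calls can differ on runs of consecutive spaces, because strsplit collapses repeated delimiters by default while the single-space pattern above does not.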
Cedric on 19 Oct 2013
Edited: Cedric on 19 Oct 2013
Ah! My mistake, 2013a/b have STRSPLIT, but the 2012b on my laptop does not. Yet ISSTRING doesn't seem to be a built-in. Isn't it an internal function in STRSPLIT? I can't check now, but I could check on Monday.
STRFUN is not a regular/base MATLAB toolbox (available separately here).
Your last point about STRSPLIT calling REGEXP is why I wrote "good implementation" in my previous comment. Well, "good" is a question of point of view, but what I had in mind was that an easy implementation is to write STRSPLIT as a wrapper for REGEXP (which is what you discovered), and that a "good" implementation would be more specific and efficient. I mentioned the difference in efficiency between STRREP and REGEXPREP to illustrate this point: an easy implementation of STRREP would be a wrapper for REGEXPREP, but the MATLAB implementation of STRREP is more specific and efficient:
>> testStr = repmat('AB ', 1, 1e5) ;
>> tic ; strrep(testStr, 'A', 'CC') ; toc
Elapsed time is 0.002794 seconds.
>> tic ; regexprep(testStr, 'A', 'CC') ; toc
Elapsed time is 4.176688 seconds.
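The timing comparison only makes sense if both calls return the same string. A small sanity check on a shorter input (the names out1 and out2 are assumptions for illustration; timings are left out because they vary by machine and release):

```matlab
% Shorter version of the test string used in the timings above.
testStr = repmat('AB ', 1, 1e3);

out1 = strrep(testStr, 'A', 'CC');       % literal replacement
out2 = regexprep(testStr, 'A', 'CC');    % same replacement through the regex engine

disp(isequal(out1, out2))                % both should give the same result
disp(numel(out1))                        % each 'A' grows by one character
```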


Answers (1)

Iain on 7 Jun 2013
Once you have identified your word, you can simply put it in an element of a cell array thus:
cell{index} = word;
unique(cell) will then give you all of the unique strings.
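A minimal sketch of that idea, assuming the text is already in a char array. The sample string and the names text_in, word_list, unique_words and same are hypothetical, and the cell array is called word_list rather than cell so that the built-in cell function is not shadowed. Here the words are split in one call instead of being identified in a loop.

```matlab
% Hypothetical sample text standing in for the file contents.
text_in = 'history tells us that history repeats';

% Split on runs of whitespace; each word becomes one cell element.
word_list = regexp(strtrim(text_in), '\s+', 'split');

% word_list{k} is the whole k-th word, with no padding.
same = strcmp(word_list{1}, word_list{5});   % compares 'history' with 'history'

% unique on a cell array of strings returns the distinct words, sorted.
unique_words = unique(word_list);

disp(same)
disp(unique_words)
```

With the file from the question, text_in could instead be read with fileread('History2.txt'), assuming the file holds plain text.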
