Track changes in a cell array with strings

I have a cell array with 5 columns and about 5000 rows with string elements. E.g.:
1997 Charles House Materials Chemicals
The years run from 1997 to 2013, and the same entries repeat from year to year. What I would like to get is a new cell array containing the rows in which the combination of the second and third columns changes. E.g. original cell array:
1997 Charles House Materials Chemicals %initial
1997 Rita Office Financial Bank %initial
1998 Rita Office Financial Bank %no change
1999 Charles House Materials Chemicals %no change
2000 Charles Office Materials Chemicals %change in the 2nd column
2001 Charles Office Materials Chemicals %no change
2003 Rita Star Financial Bank %change in the 2nd column
2005 Charles Castle Materials Chemicals %change in the 2nd column
2010 Rita Moon Financial Bank %change in the 2nd column
I would like the new array to contain the first/original rows and the rows in which a change is observed. E.g. output:
1997 Charles House Materials Chemicals
1997 Rita Office Financial Bank
2000 Charles Office Materials Chemicals
2003 Rita Star Financial Bank
2005 Charles Castle Materials Chemicals
2010 Rita Moon Financial Bank
My problem is mainly that I am dealing with strings. If someone could help me, I would appreciate it. Thanks a lot for your time.

Accepted Answer

Cedric on 24 May 2014
Edited: Cedric on 24 May 2014
Here is one way to proceed:
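% Build one key string per row from columns 2:end; ia indexes one row per distinct key.
[~,ia] = unique( arrayfun( @(r)[C{r,2:end}], 1:size(C,1), 'UniformOutput', false )) ;
% Sort the indices so the kept rows appear in their original order.
C_chg = C(sort(ia),:)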
which outputs
C_chg =
[1997] 'Charles' 'House' 'Materials' 'Chemicals'
[1997] 'Rita' 'Office' 'Financial' 'Bank'
[2000] 'Charles' 'Office' 'Materials' 'Chemicals'
[2003] 'Rita' 'Star' 'Financial' 'Bank'
[2005] 'Charles' 'Castle' 'Materials' 'Chemicals'
[2010] 'Rita' 'Moon' 'Financial' 'Bank'
when applied to your sample.
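For anyone who wants to reproduce this, here is a self-contained sketch of the same approach. The layout of `C` (numeric year in column 1, strings in columns 2 to 5) is an assumption inferred from the output shown above, and the names `rowKeys` and `keptRows` are introduced only for illustration.

```matlab
% Sample data from the question; storing the year as a number is an assumption.
C = { 1997, 'Charles', 'House',  'Materials', 'Chemicals' ;
      1997, 'Rita',    'Office', 'Financial', 'Bank'      ;
      1998, 'Rita',    'Office', 'Financial', 'Bank'      ;
      1999, 'Charles', 'House',  'Materials', 'Chemicals' ;
      2000, 'Charles', 'Office', 'Materials', 'Chemicals' ;
      2001, 'Charles', 'Office', 'Materials', 'Chemicals' ;
      2003, 'Rita',    'Star',   'Financial', 'Bank'      ;
      2005, 'Charles', 'Castle', 'Materials', 'Chemicals' ;
      2010, 'Rita',    'Moon',   'Financial', 'Bank'      } ;

% One key string per row, built by concatenating columns 2 to 5.
rowKeys = arrayfun( @(r) [C{r,2:end}], 1:size(C,1), 'UniformOutput', false ) ;

% ia holds one row index per distinct key (the first occurrence by default
% in current MATLAB releases).
[~, ia] = unique( rowKeys ) ;

% Sorting the indices puts the kept rows back in their original order.
keptRows = sort( ia ) ;
C_chg    = C(keptRows,:) ;
disp( C_chg )
```

Two things are worth keeping in mind. First, this keeps the first occurrence of each distinct combination, so if someone later returned to a combination already seen (say, Charles back at House), that row would not be reported as a change. Second, `unique(rowKeys, 'stable')` should return the indices in order of first appearance, which would make the `sort` call unnecessary.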
  2 Comments
Cedric on 24 May 2014
Edited: Cedric on 24 May 2014
Note that this solution assumes that two different rows never produce the same string once their entries are concatenated. To illustrate, the following two rows are treated as identical by my solution above, because both concatenate to the same string.
[1997] 'Charles' 'A' 'AAA' 'AA'
[2000] 'Charles' 'AAA' 'AA' 'A'
I don't believe that this case can happen, but if you want to avoid this issue, use
[~,ia] = unique( arrayfun( @(r)sprintf('%s ',C{r,2:end}), ...
1:size(C,1), 'Unif', 0 )) ;
instead of the first line.
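As a quick check of the point above, the sketch below builds both kinds of key for the two rows from this comment and counts how many distinct keys each variant yields. The variable names (`plainKeys`, `spacedKeys`, `nPlain`, `nSpaced`) are made up for the illustration, and the numeric year in column 1 is an assumption.

```matlab
% The two rows from the comment above; columns 2:end differ but concatenate identically.
C = { 1997, 'Charles', 'A',   'AAA', 'AA' ;
      2000, 'Charles', 'AAA', 'AA',  'A'  } ;

% Plain concatenation: both rows give 'CharlesAAAAAA'.
plainKeys = arrayfun( @(r) [C{r,2:end}], 1:size(C,1), 'UniformOutput', false ) ;

% sprintf appends a space after each field, so the field boundaries survive.
spacedKeys = arrayfun( @(r) sprintf('%s ', C{r,2:end}), 1:size(C,1), 'UniformOutput', false ) ;

nPlain  = numel( unique( plainKeys  ) ) ;   % the two rows collapse into one key
nSpaced = numel( unique( spacedKeys ) ) ;   % the two rows stay distinct
fprintf( 'plain: %d distinct, spaced: %d distinct\n', nPlain, nSpaced ) ;
```

A space works as a separator only if the fields themselves contain no spaces; if they might, a character that cannot occur in the data would be a safer choice in the format string.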
Maria on 27 May 2014
Thank you very much. Your code works perfectly!
