Problem with renaming duplicated variable names

Hi experts,
I want to rename a bunch of variables (ppn) by appending '_01' to each name, and I want duplicate variable names to get '_02' instead. The problem is that I end up with names like 'BG1028_02_01'. I hope someone can help me!
ppn = {'BG1026';'BG1027';'BG1028';'BG1028';'BG1029';'BG1030'}
for i = 1:length(ppn)-1
j = i + 1;
d = strcmp(ppn{i},ppn{j});
if d == 0;
ppn{i} = strcat(ppn{i},['_01']);
else d = 1;
ppn{i} = strcat(ppn{i},['_01']);
ppn{j} = strcat(ppn{j},['_02']);
continue;
end
end
This is what I get:
ppn =
'BG1026_01'
'BG1027_01'
'BG1028_01'
'BG1028_02_01'
'BG1029_01'
'BG1030'
  1 Comment
Kirby Fears on 16 Sep 2015
Edited: Kirby Fears on 16 Sep 2015
Do you want a general solution in case you have more than 2 duplicates? In the future, could you end up having 99 duplicates in a row?

Accepted Answer

Kirby Fears on 16 Sep 2015
Edited: Kirby Fears on 16 Sep 2015
As long as you expect no more than 99 duplicates, this should solve your problem. You can remove the sort() call if your names will arrive sorted in the first place.
% Example list of names
ppn = {'BG1026';'BG1026';'BG1027';'BG1028';'BG1028';'BG1028';'BG1028';'BG1029';'BG1030'};
% Ensure that list is alphabetically sorted
ppn = sort(ppn);
% Append '_01' to everything
ppn = cellfun(@(s)[s '_01'],ppn,'UniformOutput',false);
% Find every string that is equal to the previous string
idx = find(strcmp(ppn(1:end-1),ppn(2:end)))+1;
% Loop to rename variables
counter = 2;
while ~isempty(idx) && counter<100
    % Turn duplicate counter into 2-digit string
    counterstr = num2str(counter);
    if length(counterstr)==1
        counterstr = ['0',counterstr];
    end
    % Replace all duplicates with new counterstr
    for iter = 1:numel(idx)
        ppn{idx(iter)}(end-1:end) = counterstr;
    end
    % Update duplicate idx for next round of loop
    idx = idx(strcmp(ppn(idx),ppn(idx-1)));
    counter = counter+1;
end
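If the original order of the names has to be kept, the sort() step can be avoided. The following is only a minimal sketch of that idea, not part of the answer above: it numbers each occurrence of a name in order of appearance. The variable names (names, grp, seen, out) and the unsorted example list are made up for illustration.

```matlab
% Hypothetical, deliberately unsorted example input
names = {'BG1028';'BG1026';'BG1028';'BG1027';'BG1028'};
% grp(k) is the index of names{k} within the sorted list of unique names
[~,~,grp] = unique(names);
% seen(g) counts how often group g has occurred so far
seen = zeros(max(grp),1);
out = cell(size(names));
for k = 1:numel(names)
    seen(grp(k)) = seen(grp(k)) + 1;
    % Append the running count as a zero-padded two-digit suffix
    out{k} = sprintf('%s_%02d', names{k}, seen(grp(k)));
end
```

For this input, out should be {'BG1028_01';'BG1026_01';'BG1028_02';'BG1027_01';'BG1028_03'}. As with the answer above, the two-digit suffix only stays two characters wide for up to 99 occurrences of one name.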
  3 Comments
Kirby Fears on 16 Sep 2015
The code above assumes that any individual name will not appear more than 99 times. E.g. BG1028 should not appear more than 99 times.
However, any number of variables (528 in your case) can each have up to 99 duplicates.
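To illustrate where the limit of 99 comes from: the answer overwrites the last two characters of each name in place, so the counter string has to be exactly two characters long. A small sketch, using a made-up name and sprintf instead of the manual zero padding:

```matlab
name = 'BG1028_01';            % hypothetical name that already carries '_01'
s2 = sprintf('%02d', 7);       % zero-padded to two characters
s3 = sprintf('%02d', 100);     % three characters, padding cannot shorten it
name(end-1:end) = s2;          % two characters replace two characters
% name(end-1:end) = s3;        % would fail: three characters into two slots
```

After this runs, name is 'BG1028_07', while s3 has three characters and no longer fits the two-character slot.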
Marty Dutch on 16 Sep 2015
Okay, I understand it now. The number of duplicates is no more than 5. Thanks for your time and help!

More Answers (1)

dpb on 16 Sep 2015
ppn=strcat(ppn,'_01'); % add the '_01' at the get-go...
[~,ia,ib]=unique(ppn,'first'); % first position and group index of each unique name
n=accumarray(ib(:),1); % count occurrences of each unique name
for i=1:length(n) % and process the list and fix up if needed...
  if n(i)>1
    for j=2:n(i)
      k=ia(i)+j-1; % assumes duplicates are adjacent, i.e. ppn is sorted
      ppn(k)=strrep(ppn(k),'_01',num2str(j,'_%02d'));
    end
  end
end
BUT, DO NOT DO THIS!!! See the FAQ for why not and ways to avoid same...
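Assuming the warning above is about creating separately named workspace variables from these strings (the FAQ itself is not quoted here, so this is a guess), one common alternative is to keep the data in a single struct and use the names as dynamic field names. A minimal sketch with made-up names and values:

```matlab
% Hypothetical renamed labels and hypothetical data, one value per label
names  = {'BG1026_01';'BG1028_01';'BG1028_02'};
values = [10 20 30];
% Store everything in one struct instead of creating separate variables
data = struct();
for k = 1:numel(names)
    data.(names{k}) = values(k);   % dynamic field name
end
```

The values can then be read back with data.BG1028_02 or data.(names{3}), and fieldnames(data) returns the list of labels. This only works when the labels are valid MATLAB identifiers, which is the case for names like 'BG1028_02'.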
