How to retrieve images with the same name from multiple folders?

I have nine different folders, each containing hundreds of images. I suspect there are a few images with the same names that are present in two or more folders. How do I retrieve only those images that are present in these nine different folders?

Answers (1)

Walter Roberson on 12 Jul 2018
Untested.
projectdir = '/Users/siv/PhD/part11';
dinfo = dir(projectdir);
dinfo(~[dinfo.isdir]) = []; %remove non-folders (the struct field returned by dir is isdir)
dinfo( ismember({dinfo.name}, {'.', '..'}) ) = []; %delete . and ..
subdirnames = {dinfo.name};
fqsubdirnames = fullfile(projectdir, subdirnames);
numsubs = length(subdirnames);
subinfo = cell(numsubs, 1);
for K = 1 : numsubs
    thisinfo = dir(fqsubdirnames{K});
    thisinfo( [thisinfo.isdir] ) = []; %remove folders. This includes . and ..
    subinfo{K} = {thisinfo.name};
end
allnames = horzcat(subinfo{:});
numents = cellfun(@length, subinfo);
group_breakpoints = cumsum([1; numents(:)]) .' ;
[unames, ~, uidx] = unique(allnames);
counts = accumarray(uidx, 1);
dupidx = find(counts > 1);
if isempty(dupidx)
    fprintf('No duplicate names!\n');
else
    fprintf('Some duplicates found!\n\n');
    for K = 1 : length(dupidx)
        orig_idx = find(uidx == dupidx(K));
        [~, folderidx] = histc(orig_idx, group_breakpoints);
        fprintf('file "%s" found in the following folders:\n', unames{dupidx(K)} );
        fprintf(' %s\n', subdirnames{folderidx} );
    end
    fprintf('\n');
end
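To see how the indexing in the code above maps a duplicated name back to its folders, here is a small worked sketch on made-up data. The file names and the three toy "folders" are assumptions for illustration only; the calls are the same ones used in the answer.

```matlab
% Toy listings standing in for subinfo (hypothetical names, three folders).
subinfo = {{'a.png','b.png','c.png'}; {'b.png','d.png'}; {'a.png','b.png'}};

allnames = horzcat(subinfo{:});                   % all names in one row
numents = cellfun(@length, subinfo);              % files per folder: 3, 2, 2
group_breakpoints = cumsum([1; numents(:)]).';    % [1 4 6 8]: first index of each folder

[unames, ~, uidx] = unique(allnames);             % uidx: position of each entry in unames
counts = accumarray(uidx(:), 1);                  % occurrences of each unique name
dupidx = find(counts > 1);                        % names that occur more than once

% Positions of 'b.png' in allnames, then the folder each position falls in.
bpos = find(uidx == find(strcmp(unames, 'b.png')));
[~, bfolders] = histc(bpos, group_breakpoints);

fprintf('b.png is in folder number %d\n', bfolders);
```

Here 'b.png' sits at positions 2, 4 and 7 of allnames, which fall into the ranges starting at 1, 4 and 6, so it is reported for folders 1, 2 and 3.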
  5 Comments
Image Analyst on 13 Jul 2018
Images can have the same name without being duplicates, so you might want to also check the dates, a CRC, etc. To save an image, just construct the output folder name and the full file name, and then call imwrite():
Something like
outputFolder = '/Users/siv/PhD/part11/Duplicate Images'; % Wherever you want.
baseFileName = sprintf('duplicate_%s', originalBaseFileName);
fullFileName = fullfile(outputFolder, baseFileName);
imwrite(theImage, fullFileName);
Obviously make needed variable name changes as appropriate.
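On the point that same-named files need not be duplicates, a minimal sketch of one possible check is below. It is an assumption, not part of the original comment: it compares file sizes first and then the raw bytes, in place of a CRC. The two files are created in temporary folders purely so the sketch runs on its own; fileA and fileB are hypothetical names.

```matlab
% Hypothetical setup: two same-named files in two temporary folders.
dirA = tempname; mkdir(dirA);
dirB = tempname; mkdir(dirB);
fileA = fullfile(dirA, 'img001.png');
fileB = fullfile(dirB, 'img001.png');
fid = fopen(fileA, 'w'); fwrite(fid, uint8([1 2 3 4])); fclose(fid);
fid = fopen(fileB, 'w'); fwrite(fid, uint8([1 2 3 9])); fclose(fid);

% Cheap check first: files of different size cannot have the same content.
infoA = dir(fileA);
infoB = dir(fileB);
sameContent = (infoA.bytes == infoB.bytes);

% Only when the sizes match, read the raw bytes and compare them exactly.
if sameContent
    fid = fopen(fileA, 'r'); bytesA = fread(fid, Inf, '*uint8'); fclose(fid);
    fid = fopen(fileB, 'r'); bytesB = fread(fid, Inf, '*uint8'); fclose(fid);
    sameContent = isequal(bytesA, bytesB);
end

if sameContent
    fprintf('Same name and same content.\n');
else
    fprintf('Same name but different content.\n');
end
```

A byte comparison treats two files with identical pixels but different metadata as different. If only the pixels matter, comparing isequal(imread(fileA), imread(fileB)) on real image files may be closer to what is wanted.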
Sivaramakrishnan Rajaraman
I'm running your main code, which prints "unames{dupidx(K)}" and the subdirectory names "subdirnames{folderidx}". In that context, if I am to use the above code in the last for loop of your code, what should originalBaseFileName be?
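One possible reading, sketched under assumptions rather than taken from the thread: inside that loop, originalBaseFileName would be the duplicated name unames{dupidx(K)}, and each copy would come from one of the folders in folderidx. The sketch below uses stand-in values for those variables so it runs on its own, adds the folder name to the output name so same-named copies do not overwrite each other, and uses copyfile in place of imread/imwrite.

```matlab
% Stand-ins for variables built by the main code (all values are made up).
projectdir = tempname; mkdir(projectdir);
subdirnames = {'folderA', 'folderB'};
fqsubdirnames = fullfile(projectdir, subdirnames);
for K = 1 : numel(fqsubdirnames)
    mkdir(fqsubdirnames{K});
    fid = fopen(fullfile(fqsubdirnames{K}, 'img001.png'), 'w');
    fwrite(fid, uint8(K)); fclose(fid);
end
dupname = 'img001.png';   % plays the role of unames{dupidx(K)}
folderidx = [1; 2];       % folders in which dupname was found

outputFolder = fullfile(projectdir, 'Duplicate Images');
if ~exist(outputFolder, 'dir')
    mkdir(outputFolder);
end

% One copy per folder that holds the duplicated name.
for J = 1 : numel(folderidx)
    thisfolder = subdirnames{folderidx(J)};
    originalBaseFileName = dupname;
    sourceFile = fullfile(fqsubdirnames{folderidx(J)}, originalBaseFileName);
    baseFileName = sprintf('duplicate_%s_%s', thisfolder, originalBaseFileName);
    fullFileName = fullfile(outputFolder, baseFileName);
    copyfile(sourceFile, fullFileName);   % byte-for-byte copy, no re-encoding
end
```

To stay with imwrite as in the comment above, the copyfile line could be replaced by theImage = imread(sourceFile); followed by imwrite(theImage, fullFileName); on real image files.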
