fprintf to a .txt file is too slow; any way to accelerate it?

Hi,
I have data arrays of different sizes, such as
a=ones(100,2), b=ones(50,2);
I want to append them and export them to a .txt file, so what I did is:
  1. Construct a new matrix of size (max row count of a and b, sum of the column counts of a and b); here that is (100,4).
  2. For each row, find the entries equal to '1', build a new row in which every such entry is replaced by ' ,', and save it to the .txt file using fprintf. However, I found that this is too slow.
Below is my test code; I have also attached the test file A.
load A
tic
fid = fopen('test.txt','w');
for i=1:1:length(A(:,1))
    str={''};
    for j=1:1:length(A(1,:))
        if A(i,j)==1e5
            str=strcat(str,{' ,'});
        else
            str=strcat(str,sprintf('%0.5e',A(i,j)),',');
        end
    end
    fprintf(fid,'%s\n',str{1});
end
fclose(fid);
toc
tic
dlmwrite('test_dlm.txt',A)
toc
Here 1e5 is the identifier for entries that need to be written as ' ,'. The results show that the fprintf approach takes about 129 seconds, while dlmwrite takes only 12 seconds.
thanks!
Li
  1 Comment
dpb on 5 Mar 2017
If you profile the code, you'll find that all the time is spent in the strcat operations, not in fprintf.
Unclear to me what you're really wanting to do; if the point is to append the two datasets, why not just
a=ones(100,2), b=ones(50,2);
csvwrite('aplusb.csv',[a;b])
and be done with it, instead of creating a file that's 100x4 elements rather than 150x2? If the point is to put only those values in the accumulated file that are somehow unique to the two arrays, then do that processing first on the combined array and then write.
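To check where the time goes, the loop can be run under the MATLAB profiler. The following is only a sketch: the random matrix is a hypothetical stand-in for the attached file A, and the output file name test_profile.txt is made up for illustration.

```matlab
% Hypothetical stand-in for the attached test data (not the original A)
A = rand(200, 20);
A(A > 0.9) = 1e5;              % flag some entries with the 1e5 identifier

profile on
fid = fopen('test_profile.txt', 'w');
for i = 1:size(A, 1)
    str = {''};
    for j = 1:size(A, 2)
        if A(i, j) == 1e5
            str = strcat(str, {' ,'});
        else
            str = strcat(str, sprintf('%0.5e', A(i, j)), ',');
        end
    end
    fprintf(fid, '%s\n', str{1});
end
fclose(fid);
profile off

% List the five functions with the largest total time
stats = profile('info');
ft = stats.FunctionTable;
[~, order] = sort([ft.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    fprintf('%-40s %8.3f s\n', ft(k).FunctionName, ft(k).TotalTime);
end
```

Calling profile viewer instead of the final loop opens the same data in the interactive report.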


Accepted Answer

dpb on 5 Mar 2017
Edited: dpb on 5 Mar 2017
A=[a;b]; % append to one array for convenience
A(A==badValue)=nan; % replace the bum values with NaN
csvwrite('FixedUpCombined.csv',A) % write to csv file with missing value indicator
This is much simpler than creating the specific format needed to write an empty field; let NaN serve as the placeholder instead. It is also unequivocal as to what is bum data, whereas an empty field is ambiguous when read back: per the documentation, "csvread fills empty delimited fields with zero."
Note, however, if you're adamant about using empty delimited fields, the way to go about it is to find the locations in the row and use those locations to build the proper format string with the appropriate number of fields of given type, then write the record using that format string. repmat is exceedingly useful in these machinations since unfortunately the C-like formatting strings cannot accept a repeat count.
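A minimal sketch of that empty-field approach might look like the following. The sample matrix, the file name test_fmt.txt, and the variable names isBad, fields and fmt are made up for illustration; the '%0.5e' field format and the 1e5 identifier are taken from the question.

```matlab
% Hypothetical sample data; 1e5 marks entries to be written as empty fields
badValue = 1e5;
A = [1.5 badValue 3; badValue 2 4];

fid = fopen('test_fmt.txt', 'w');
for i = 1:size(A, 1)
    isBad = (A(i, :) == badValue);               % locations of flagged entries in this row
    fields = repmat({'%0.5e,'}, 1, size(A, 2));  % one numeric field per column
    fields(isBad) = {' ,'};                      % flagged columns become an empty field
    fmt = [fields{:} '\n'];                      % format string for this record
    fprintf(fid, fmt, A(i, ~isBad));             % only the good values consume a specifier
end
fclose(fid);
```

This keeps one fprintf call per row and avoids the repeated strcat calls, since the only string building is a single concatenation of the format pieces.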
