Cannot import large .dat files

Batuhan on 19 Feb 2014
Commented: Batuhan on 7 Mar 2014
Dear All,
I have some huge data files, each containing a matrix of about 15000000x2 values. When I import one of them into Matlab using importdata, only a fraction of it is imported. It seems to be a memory problem; could anyone help me solve this?
I am using a Dell Latitude with 8 GB of memory and a Core i5.
Cheers
  2 Comments
per isakson on 19 Feb 2014
>> 15000000*2*8/1e6
ans =
240
That is 0.24GB, which shouldn't be a problem. At least not with a 64bit Matlab.
"only a fraction of it is imported" makes me believe there is a problem with the file.
You must provide more detail: the error message, if any, and the first ten lines of the file.
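A minimal sketch of one way to collect the first ten lines for posting, without opening the whole file in an editor. The file written here is a hypothetical stand-in created only so the snippet runs on its own; with a real file, only the reading part is needed.

```matlab
% Hypothetical stand-in file with 20 rows of two numbers (assumption, not the poster's data).
fname = [tempname '.dat'];
fid = fopen(fname, 'w');
fprintf(fid, '%g %g\n', rand(2, 20));
fclose(fid);

% Read at most the first ten lines as raw text.
fid = fopen(fname, 'r');
head = cell(10, 1);
n = 0;
while n < 10
    tline = fgetl(fid);
    if ~ischar(tline)      % fgetl returns -1 at end of file
        break
    end
    n = n + 1;
    head{n} = tline;
end
fclose(fid);
head = head(1:n);

disp(char(head));          % show the lines exactly as stored
delete(fname);
```

Reading the lines as text rather than numbers keeps separators, stray characters and header lines visible.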
Batuhan on 4 Mar 2014
Nope, there are no errors in the file. At least I can open it with vim.


Accepted Answer

per isakson on 5 Mar 2014
Edited: per isakson on 5 Mar 2014
Comments:
  • "[...]can open it with vim." That is not a sufficient test! There could be a special string marking a missing numerical value on a line in the middle of the file. There could be a spurious non-printing character somewhere. Possibly vim understands more text encodings than Matlab's importdata does.
  • AFAIK: importdata analyzes (auto-magically) the beginning of the file to find out the format and delegates to an appropriate function. There might be a line that doesn't follow the format of the lines at the beginning.
  • If you know the format it's better to use textscan.
  • Wasn't there an error message?
  • You must provide more detail: the error message, if any, and the first ten lines of the file.
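If the format is known to be two numeric columns, a sketch along the following lines may help find where parsing stops. It relies on textscan returning what it has read so far when a field cannot be converted, so the file position afterwards points near the offending text. The test file and its malformed third line are invented for illustration.

```matlab
% Hypothetical test file: the third line has a non-numeric second field.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1.0 2.0\n3.0 4.0\n5.0 x\n7.0 8.0\n');
fclose(fid);

fid = fopen(fname, 'r');
C = textscan(fid, '%f%f');       % stops at the first field it cannot convert
stoppedEarly = ~feof(fid);       % true if textscan did not reach the end of the file
if stoppedEarly
    pos  = ftell(fid);           % byte offset where reading stopped
    rest = fgetl(fid);           % remainder of the line at that position
    fprintf('Stopped at byte %d, remaining text on that line: "%s"\n', pos, rest);
end
fclose(fid);
delete(fname);
```

On a file that parses cleanly, stoppedEarly should come out false and both columns in C should have the same length.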
  3 Comments
per isakson on 5 Mar 2014
Edited: per isakson on 5 Mar 2014
Strange! I don't know. Guessing:
  • Something happened when you copied the file to the laptop. Did you check the size of the file on the different computers?
  • AFAIK: there is no default value or preference, which could cause this.
  • With importdata one may set "Range:". Did you by mistake use an old value?
However, there are many Matlab functions that should be able to read that file, e.g. textscan and load. (The flexibility of textscan is not needed here.)
>> fid = fopen( 'cssm.txt', 'r' ); % see doc for different encodings
>> buf = textscan( fid, '%f%f', inf, 'CollectOutput', true );
>> fclose(fid);
>> M = load( 'cssm.txt', '-ascii' );
where cssm.txt contains the data from your comment (copy&paste).
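To check whether an import was complete, one option is to compare the number of rows returned with the number of lines in the file, and to note the file size in bytes for comparison across machines. The sketch below uses a hypothetical 30-row file and assumes every row ends with a newline.

```matlab
% Hypothetical 30x2 test file (assumption: one row per line, newline-terminated).
fname = [tempname '.txt'];
A = [(1:30)' (1:30)' * 0.5];
fid = fopen(fname, 'w');
fprintf(fid, '%g %g\n', A');
fclose(fid);

info = dir(fname);                      % info.bytes is the file size on disk
fprintf('File size: %d bytes\n', info.bytes);

str    = fileread(fname);
nLines = sum(str == sprintf('\n'));     % count newline characters

M     = load(fname, '-ascii');
nRows = size(M, 1);
fprintf('Lines in file: %d, rows imported: %d\n', nLines, nRows);

delete(fname);
```

A mismatch between nLines and nRows would suggest that the reader stopped partway through or that the file has header or blank lines.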
You might test your file for spurious characters: read it as characters and search for anything other than digits, dots, white space, plus and minus. An empty cac indicates that none was found.
>> str = fileread( 'cssm.txt' );
>> cac = regexp( str, '[^\d\.\s\+\-]', 'match' );
>> whos
  Name      Size       Bytes  Class     Attributes

  M        30x2          480  double
  ans       1x1            8  double
  buf       1x1          592  cell
  cac       0x0            0  cell
  fid       1x1            8  double
  str       1x807       1614  char
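When the search does find something, it helps to know on which line. A small sketch, using the same character class as the regexp above on an invented string with a decimal comma on its second line; note that this class would also flag exponent notation such as 1e5, so e and E may need to be added for such data.

```matlab
% Hypothetical file content: the second line uses a comma instead of a dot.
str = sprintf('1.0 2.0\n3.0 4,0\n5.0 6.0\n');

idx      = regexp(str, '[^\d\.\s\+\-]');   % start indices of unexpected characters
newlines = find(str == sprintf('\n'));     % positions of line breaks

lineNos = zeros(size(idx));
for k = 1:numel(idx)
    % Line number = number of line breaks before the character, plus one.
    lineNos(k) = sum(newlines < idx(k)) + 1;
    fprintf('line %d: character code %d\n', lineNos(k), double(str(idx(k))));
end
```

Printing the character code rather than the character itself makes non-printing characters visible.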
Batuhan on 7 Mar 2014
Dear Per,
Thank you for the advice. Indeed, the 'load' function works much better: it loads the whole file without loss.
I am using Matlab on Linux Mint 13; maybe the problem is related to that.
The bottom line is that the load function works better for this file. Thank you for your help.
Batuhan
