importing large files with text/numbers

John on 25 Jul 2014
Commented: dpb on 25 Jul 2014
Hi,
I am trying to import a text (not binary) data file into MATLAB. It has fields of text and numbers arranged in lines. Each line may have a different number of fields, but 23 is about the maximum. The fields are comma-separated and each line ends with CR/LF.
xlsread() works if the file is small, say 100 lines. Then I get a cell array with 100x23 entries.
This is what MATLAB uses:
[~, ~, untitled] = xlsread('JB100.csv','JB100');
untitled(cellfun(@(x) ~isempty(x) && isnumeric(x) && isnan(x),untitled)) = {''};
However, my files can be over 1 million lines long, and I seem to run out of memory at around 100,000 lines. [Also, even if memory were not an issue, I think there would be a limit of 1,048,576 lines, as in Excel.]
How do I get around this issue?
Thanks, John
  2 Comments
per isakson on 25 Jul 2014
Please attach a sample file that includes all types of lines.
What form do you want the result in?
dpb on 25 Jul 2014
[~, ~, untitled] = xlsread('JB100.csv','JB100');
untitled(cellfun(@(x) ~isempty(x) && isnumeric(x) && isnan(x),untitled)) = {''};
This approach requests only the third (raw) output, so the whole file is held in a cell array before anything is done with it. I'd suggest using the first two return variables instead, so that the numeric data are converted to internal representation during the read. This is bound to reduce memory requirements, particularly since extracting the numeric values from the raw data afterwards requires both copies to be in memory at the same time.
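To illustrate the memory argument, here is a minimal sketch that stores the same numbers once as a plain double matrix and once as a cell array with one value per cell, which is roughly how the raw output is laid out. The sizes are hypothetical and the variable names are made up for the example. With the first two outputs, the call would be `[num, txt] = xlsread('JB100.csv','JB100');`.

```matlab
% Compare the memory footprint of the same numbers stored two ways.
nRows = 1000;                     % hypothetical size, small enough to run anywhere
nCols = 23;                       % maximum field count mentioned in the question
asMatrix = rand(nRows, nCols);    % plain double matrix
asCell = num2cell(asMatrix);      % one cell per value, like a raw cell array

m = whos('asMatrix');
c = whos('asCell');
fprintf('matrix: %d bytes, cell array: %d bytes\n', m.bytes, c.bytes);
```

The matrix needs 8 bytes per value, while each cell carries its own header on top of that, so the cell array should come out several times larger. The exact ratio depends on the MATLAB release.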
It would be even better if you could dispense with the text data altogether.
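If the text fields really can be dropped, one possible approach (a sketch, not tested on the poster's data) is to bypass xlsread and read the file line by line with fgetl, keeping only the numeric values in a preallocated matrix. The sample file contents, the temporary file name, and the blockRows growth step are assumptions made up for this example; only the limit of 23 fields comes from the question.

```matlab
% Create a small sample file so the sketch runs on its own (made-up data).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'abc,1,2.5\r\n');
fprintf(fid, 'def,3\r\n');
fprintf(fid, 'ghi,4,,xyz,6\r\n');
fclose(fid);

maxFields = 23;       % upper bound on fields per line, from the question
blockRows = 100000;   % hypothetical growth step for preallocation
num = nan(blockRows, maxFields);   % NaN marks text or missing fields
nLines = 0;

fid = fopen(fname, 'r');
tline = fgetl(fid);
while ischar(tline)                % fgetl returns -1 at end of file
    nLines = nLines + 1;
    if nLines > size(num, 1)
        % Grow in large blocks rather than one row at a time
        num = [num; nan(blockRows, maxFields)]; %#ok<AGROW>
    end
    % Split on commas; regexp 'split' keeps empty fields in place
    fields = regexp(strtrim(tline), ',', 'split');
    k = min(numel(fields), maxFields);
    num(nLines, 1:k) = str2double(fields(1:k));   % text becomes NaN
    tline = fgetl(fid);
end
fclose(fid);
num = num(1:nLines, :);   % trim unused preallocated rows
delete(fname);            % remove the temporary sample file
```

Only one line of text is in memory at a time, so the footprint is dominated by the numeric matrix (about 8 bytes per field). A per-line loop like this will be slow for a million lines; it is meant to show the idea, not to be the fastest option.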

Answers (0)
