How to use textscan to read data with missing values?

I found that textscan cannot import a .txt file when the data itself contains missing values.
For example, I have a file test.txt, where * represents missing (empty) data and the delimiter is just whitespace:
1 2 3 4
5 6 A *
7 8 A *
9 * * 10
Can any experts help me? Thanks a lot!

Accepted Answer

Jan on 6 Jun 2011
Does the file contain '*' as a marker for empty values, or did you insert the stars here for display purposes only?
If the real file contains nothing where a value is missing:
Data = textscan(FID, '%f%f%f%f', 'Delimiter', ' ', 'EmptyValue', Inf);
If there are stars in the file, you could use an intermediate step:
Str = fileread(FileName);
Str = strrep(Str, '*', '');
Data = textscan(Str, '%f%f%f%f', 'Delimiter', ' ', 'EmptyValue', Inf);
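As a possible alternative to stripping the stars by hand, textscan's 'TreatAsEmpty' option can mark a placeholder as empty in numeric fields. The following is only a sketch: it assumes every column is numeric (the 'A' entries in the sample above would not parse with %f), and the temporary file and its contents are made up for illustration.

```matlab
% Write a small all-numeric sample file; '*' marks a missing value.
% The temporary file name and its contents are assumptions for illustration.
FileName = [tempname, '.txt'];
FID = fopen(FileName, 'w');
fprintf(FID, '1 2 3 4\n5 6 7 *\n7 8 9 *\n9 * * 10\n');
fclose(FID);

% Read four numeric columns. Fields consisting of '*' are treated as
% empty and filled with the value given by 'EmptyValue' (NaN here).
FID = fopen(FileName, 'r');
Data = textscan(FID, '%f%f%f%f', 'TreatAsEmpty', '*', 'EmptyValue', NaN);
fclose(FID);
delete(FileName);

% Concatenate the four column vectors into one numeric matrix.
M = [Data{:}];
```

Using NaN rather than Inf or 0 as the fill value keeps missing entries distinguishable from real data, and isnan(M) then locates them.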

More Answers (2)

the cyclist on 5 Jun 2011
There is a file on the File Exchange (FEX) called "readtext" that should handle this situation.

Zoe on 6 Jun 2011
Thanks, everyone!
The problem is solved. The delimiter is actually a tab. My solution is:
Data = textscan(FID, '%f%f%f%f', 'Delimiter', '\t', 'EmptyValue', 0);
It works now!
However, if the delimiter is plain whitespace (which is rare), I still don't think textscan can handle it. Logically it should, but when importing a .txt file it seems to confuse the missing value with the delimiter.
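A minimal sketch of the tab-delimited case, assuming that a missing value is represented by nothing at all between two tabs. The temporary file and its contents are hypothetical and serve only to make the example self-contained.

```matlab
% Write a tab-delimited sample file; consecutive tabs mean a missing value.
% File name and contents are assumptions for illustration.
FileName = [tempname, '.txt'];
FID = fopen(FileName, 'w');
fprintf(FID, '1\t2\t3\t4\n5\t\t7\t8\n9\t\t\t12\n');
fclose(FID);

% With an explicit tab delimiter, an empty field between two tabs is
% kept as a field and filled with the 'EmptyValue' (0 here).
FID = fopen(FileName, 'r');
Data = textscan(FID, '%f%f%f%f', 'Delimiter', '\t', 'EmptyValue', 0);
fclose(FID);
delete(FileName);

% Concatenate the four column vectors into one numeric matrix.
M = [Data{:}];
```

Note that filling with 0 makes missing entries indistinguishable from genuine zeros in the data; NaN may be a safer choice if zeros can occur.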
  4 Comments
Christopher Conatser on 27 Sep 2016
To extend this question further: I have a similar problem, but the (utterly malicious!) text export function of my instrument also puts an irregular number of spaces between columns, and that number varies with the number of significant figures. Jan, do you (or anyone else) have any suggestions for dealing with it?
Sample tables (already cleaned up considerably):
SAMPLE  BOTTLE TIME  SOURCE ERROR LIQUID
------- ------ ----  --     --    ------
1,10    1      15:38 F            320
2,10    1      15:42 F            306
3,10    1      15:48 F            310

SAMPLE  BOTTLE TIME  SOURCE ERROR LIQUID
------- ------ ----  --     --    ------
1,5     13     22:41 F            198
2,5     13     23:35 F      NM    *
3,5     13     00:40 F      NM    *
4,5     13     01:04 F            196
No matter what I try (tab delimiter, space delimiter, multi-space literals in the formatSpec, setting 'MultipleDelimsAsOne' to false, ...), every approach skips the "ERROR" column when it is empty.
formatSpec = ' %f,%*f %f %{HH:mm}D %s %s %f';
Thanks for your help!
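One possible way around this is to skip textscan's column logic and split each line on runs of whitespace, then decide from the token count whether the ERROR field is present. This is only a sketch, and it rests on an assumption not confirmed above: ERROR is the only column that can ever be blank, so a row with five tokens lacks it and a row with six has it. The variable names (rows, errorCode, liquid) are hypothetical.

```matlab
% Sample rows copied from the second table above (header lines omitted).
rows = {'1,5 13 22:41 F 198'
        '2,5 13 23:35 F NM *'
        '3,5 13 00:40 F NM *'
        '4,5 13 01:04 F 196'};

n = numel(rows);
errorCode = cell(n, 1);
liquid = nan(n, 1);
for k = 1:n
    % Split on runs of whitespace, so irregular spacing does not matter.
    tok = strsplit(strtrim(rows{k}));
    if numel(tok) == 5
        % Assumption: only ERROR can be blank, so a 5-token row lacks it.
        tok = [tok(1:4), {''}, tok(5)];
    end
    errorCode{k} = tok{5};
    % str2double returns NaN for text that is not a number, such as '*'.
    liquid(k) = str2double(tok{6});
end
```

For a real file, the lines could be obtained first with fileread followed by a split on newlines, and the header and dashed lines skipped before this loop. If more than one column can be blank, the token count is no longer enough and the fixed character positions of the columns would have to be used instead.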
