How to remove header in .txt file while retaining format of data
Hello everyone,
I have a .txt file with 12 lines of a header that is useless information to me. Is there a way to remove the header so that I am left with only data?
dlmread does not work because I have dates to read, csvread is producing an error, and textscan seems to provide the data in an altered format.
This is the code I am using to retrieve the data so far:
A = textscan(FID,'%s','headerLines',12)
This is what the data looks like when it is output:
'3298602123321'
'1'
'2011-09-20'
'01:00:00'
'10.8554401397705'
'3298602123321'
'1'
'2011-09-20'
'01:15:00'
'10.8555603027344'
This is what I need the data to look like:
3298602123321 1 2011-09-20 01:00:00 10.8554401397705
3298602123321 1 2011-09-20 01:15:00 10.8555603027344
Any help will be greatly appreciated, thanks for taking the time to consider it. Thanks in advance for any posts!
~Sarah (:
Answers (3)
Fangjun Jiang
on 30 Sep 2011
To read the data directly from the text file and get the format you want, you can do:
FID=fopen('test.txt','rt');
A = textscan(FID,'%f %d %s %s %f');
fclose(FID);
format long
Sensor_ID=A{1}
Point_ID=A{2}
SampleDate=A{3}
SampleTime=A{4}
SampleValue=A{5}
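For a file that still has its 12 header lines, the same format string can be combined with the 'HeaderLines' option from the question. The following is only a sketch: the temporary file name and the header text are placeholders made up for illustration, and the fprintf format at the end is one possible way to print the rows back in the file's layout.

```matlab
% Build a small stand-in file: 12 header lines followed by two data rows.
% (The file name and header text are placeholders, not the real file.)
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
for k = 1:12
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '3298602123321 1 2011-09-20 01:00:00 10.8554401397705\n');
fprintf(fid, '3298602123321 1 2011-09-20 01:15:00 10.8555603027344\n');
fclose(fid);

% Skip the 12 header lines and read five typed columns per row.
fid = fopen(fname, 'rt');
A = textscan(fid, '%f %d %s %s %f', 'HeaderLines', 12);
fclose(fid);
delete(fname);

Sensor_ID   = A{1};   % double; large enough to hold these 13-digit IDs exactly
Point_ID    = A{2};   % int32
SampleDate  = A{3};   % cell array of strings
SampleTime  = A{4};   % cell array of strings
SampleValue = A{5};   % double

% Print one record per line, in the same field order as the file.
for k = 1:numel(Sensor_ID)
    fprintf('%.0f %d %s %s %.15g\n', Sensor_ID(k), Point_ID(k), ...
        SampleDate{k}, SampleTime{k}, SampleValue(k));
end
```

The numeric columns are converted to numbers here, so the printed text depends on the fprintf format chosen rather than on the exact characters in the file.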
Or you can use reshape()
A={'3298602123321'
'1'
'2011-09-20'
'01:00:00'
'10.8554401397705'
'3298602123321'
'1'
'2011-09-20'
'01:15:00'
'10.8555603027344'}
B=reshape(A,5,[])';
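If the goal is one text line per record, the reshaped cell array can be joined row by row. This is a minimal sketch that assumes every record has exactly five fields and that strjoin is available (R2013a or later); the variable name rows is chosen here for illustration.

```matlab
% Flat column cell array, as returned by textscan(FID,'%s',...)
A = {'3298602123321'; '1'; '2011-09-20'; '01:00:00'; '10.8554401397705'; ...
     '3298602123321'; '1'; '2011-09-20'; '01:15:00'; '10.8555603027344'};

% reshape fills column by column, so build a 5-by-N array and transpose it
B = reshape(A, 5, [])';          % N-by-5 cell array, one record per row

% Join the five fields of each record with single spaces
rows = cell(size(B, 1), 1);
for k = 1:size(B, 1)
    rows{k} = strjoin(B(k, :), ' ');
end
disp(char(rows));
```

Because the fields stay as strings, no digits are lost, but any extra spacing from the original file is replaced by single spaces.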
4 Comments
Sarah
on 30 Sep 2011
Fangjun Jiang
on 30 Sep 2011
Yes. I didn't realize the IDs have such a long string of digits. The updated code should give you what you want. Use format long to see the numerical data.
Sarah
on 6 Oct 2011
Fangjun Jiang
on 6 Oct 2011
Edited: Walter Roberson
on 31 Mar 2016
What do you mean? Copy the four lines below to test.txt and run the code.
3298602123321 1 2011-09-20 01:00:00 10.8554401397705
3298602123321 1 2011-09-20 01:15:00 10.8555603027344
3298602123321 1 2011-09-20 01:30:00 10.8555603027344
3298602123321 1 2011-09-20 01:45:00 10.8560495376587
I got:
Sensor_ID =
1.0e+012 *
3.298602123321000
3.298602123321000
3.298602123321000
3.298602123321000
Point_ID =
1
1
1
1
SampleDate =
'2011-09-20'
'2011-09-20'
'2011-09-20'
'2011-09-20'
SampleTime =
'01:00:00'
'01:15:00'
'01:30:00'
'01:45:00'
SampleValue =
10.855440139770501
10.855560302734400
10.855560302734400
10.856049537658700
Walter Roberson
on 30 Sep 2011
Corrected:
A = textscan(FID,'%[^\n]','headerLines',12);
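In context, that call sits between fopen and fclose, and the resulting lines can be written straight back out to a header-free file. The sketch below assumes placeholder file names and header text made up for illustration; writing the output file is an optional extra step, not part of the answer above.

```matlab
% Build a small stand-in file: 12 header lines followed by two data rows.
% (The file names and header text are placeholders, not the real file.)
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
for k = 1:12
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '3298602123321 1 2011-09-20 01:00:00 10.8554401397705\n');
fprintf(fid, '3298602123321 1 2011-09-20 01:15:00 10.8555603027344\n');
fclose(fid);

% Skip the header and read each remaining line as one string.
fid = fopen(fname, 'rt');
C = textscan(fid, '%[^\n]', 'HeaderLines', 12);
fclose(fid);
dataLines = C{1};                 % cell array, one string per data line

% Optionally write the data lines to a new file without the header.
outName = [tempname '.txt'];
fid = fopen(outName, 'wt');
fprintf(fid, '%s\n', dataLines{:});
fclose(fid);

disp(char(dataLines));
delete(fname);
delete(outName);
```

Nothing is converted to numbers in this version, so the digits and the spacing between fields are kept as they were read.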
4 Comments
Sarah
on 30 Sep 2011
Walter Roberson
on 30 Sep 2011
Please see corrected version above.
Sarah
on 6 Oct 2011
Walter Roberson
on 6 Oct 2011
This contradicts what you wrote earlier,
"What I need the data to look like is the way that it is in the .txt file before accessing it. I only want the header gone and the data to remain unchanged."
The code I supplied skips the header and reads in everything else as strings *exactly* the same way, space for space, character for character, as it appears in the file.
If you want the data in the file split up into different variables, then you need to tell us what data type you want for each column. You also need to understand that, unless you want to split into strings only, the whitespace (blanks) might change, which would leave data that is *not* "the way it is in the text file before accessing it".
lpetley
on 31 Mar 2016
When using an ASCII file with several lines of headers, the best approach is to use the importdata() function. You can tell it how many lines of the file comprise the header, and it will load subsequent data in a numerical format.
1 Comment
Walter Roberson
on 31 Mar 2016
That would not meet the user's requirement for "the data to remain unchanged." Also, the data is not all in numeric format: there are two fields which contain date and time strings. importdata() would return a scalar struct that would have to be examined for its 'data' field and its 'textdata' field, and it would be necessary to figure out how importdata() handles text fields that occur in the middle of a line. It is not clear that that is "best".
It would make more sense to use readtable() from R2013b onward, especially from R2014b onward (when it gained datetime handling), as that does not change the order of fields and is well defined as to how the various field types are handled.
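A rough sketch of the readtable() route, assuming R2014b or later for the %{yyyy-MM-dd}D datetime specifier. The file name, header text, and variable names below are placeholders chosen for illustration, and the time column is kept as text here rather than converted.

```matlab
% Build a small stand-in file: 12 header lines followed by two data rows.
% (The file name and header text are placeholders, not the real file.)
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
for k = 1:12
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '3298602123321 1 2011-09-20 01:00:00 10.8554401397705\n');
fprintf(fid, '3298602123321 1 2011-09-20 01:15:00 10.8555603027344\n');
fclose(fid);

% Skip the header and read five space-delimited columns in file order.
T = readtable(fname, 'HeaderLines', 12, 'ReadVariableNames', false, ...
    'Delimiter', ' ', 'Format', '%f %f %{yyyy-MM-dd}D %s %f');
delete(fname);

% Column names are assumptions borrowed from the earlier answer.
T.Properties.VariableNames = {'Sensor_ID', 'Point_ID', 'SampleDate', ...
    'SampleTime', 'SampleValue'};
disp(T);
```

The columns keep their original order, with the date column held as datetime and the time column as a cell array of strings.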