textscan reads too many lines
I want to read an XML file that contains several data sections (an XML file from a spectrometer with more than one measurement point).
The data structure is roughly as follows:
<SpectrumData>
2. <X>50000</X>
3. <Y>54000</Y>
4. <Z>75000</Z>
5. </ActualInfo>
6. <I WL="235" R="0.101973" />
7. <I WL="236" R="0.098217" />
8. <I WL="237" R="0.093565" />
…
731. <I WL="968" R="0.158408" />
732. <I WL="969" R="0.158925" />
733. </SpectrumData>
734. <SpectrumData>
735. <X>100</X>
736. <Y>120</Y>
737. <Z>250</Z>
738. </ActualInfo>
739. <I WL="235" R="0.201973" />
740. <I WL="236" R="0.198217" />
741. <I WL="237" R="0.193565" />
…
1472. <I WL="968 " R="0.158408" />
1473. </SpectrumData>
This structure repeats two to six times (depending on the number of measured points).
First I used
tline = fgetl(fid);
SpectrumData = textscan(fid,'%*s %f %*s',3,'delimiter','>');
To read X, Y and Z. Then I use
C = textscan(fid,'%*s %f %*s %f %*s', 'delimiter','"');
To read all WL and R values.
The problem is that textscan reads 3 more lines after the last WL (and the last cells in C are NaN). How can I stop textscan at the correct line? Note that I can't use
C = textscan(fid,'%*s %f %*s %f %*s', 733,'delimiter','"');
because the number of WL entries can change from file to file.
1 Comment
dpb
on 27 Sep 2014
Unless there's a piece of data somewhere in the file that contains the number of lines so you can set the repeat count in textscan programmatically, you can't do anything specific to the count a priori.
There's probably a regexp pattern that could be written to find the terminator, but I'm not expert enough to work it out without a fair amount of effort, so I won't give it a go.
Otherwise, either
a) continue to read/parse the file line-by-line looking for the termination string, or
b) read the whole file into memory as array of cell strings then locate the sections by string comparison operations and operate on those in memory.
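Option (a) could look roughly like the sketch below. It is only an illustration under assumptions: the sample content is a shortened, made-up version of the structure in the question, written to a temporary file so the snippet runs on its own, and the variable names (sections, current) are hypothetical. Each line is tested for the closing </SpectrumData> tag, so the number of WL lines never has to be known in advance.

```matlab
% Hypothetical, shortened sample of the structure shown in the question,
% written to a temporary file so that the sketch is self-contained.
sample = {
    '<SpectrumData>'
    '<X>50000</X>'
    '<Y>54000</Y>'
    '<Z>75000</Z>'
    '</ActualInfo>'
    '<I WL="235" R="0.101973" />'
    '<I WL="236" R="0.098217" />'
    '</SpectrumData>'
    '<SpectrumData>'
    '<X>100</X>'
    '<Y>120</Y>'
    '<Z>250</Z>'
    '</ActualInfo>'
    '<I WL="235" R="0.201973" />'
    '</SpectrumData>'};
fname = [tempname '.xml'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Walk the file line by line; close a section when its terminator appears.
fid = fopen(fname, 'r');
sections = {};            % one N-by-2 [WL R] matrix per <SpectrumData> block
current = zeros(0, 2);    % rows collected for the block being read
tline = fgetl(fid);
while ischar(tline)       % fgetl returns -1 (not char) at end of file
    if ~isempty(strfind(tline, '</SpectrumData>'))
        sections{end+1} = current;   % terminator found: store the block
        current = zeros(0, 2);
    else
        % Pick out the two quoted attribute values on <I .../> lines.
        tok = regexp(tline, 'WL="([^"]*)"\s+R="([^"]*)"', 'tokens', 'once');
        if ~isempty(tok)
            current(end+1, :) = str2double(tok);
        end
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(fname);
```

After the loop, sections{k} holds the WL and R columns of the k-th measurement point, whatever its length. Lines that are neither a terminator nor an <I .../> entry are simply skipped here; the X, Y, Z values could be picked up in the same loop with a similar pattern.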
Accepted Answer
per isakson
on 27 Sep 2014
Edited: per isakson
on 27 Sep 2014
In this order
- See xmlread, Read XML document and return Document Object Model node
- Search File Exchange for read XML. Both xml_io_tools by Jaroslaw Tuszynski and xml2struct by Wouter Falkena are rated highly by many.
- Make your own function. IMO: best approach is to read with fileread and to parse with regexp
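As a rough illustration of the third item (fileread plus regexp), here is a minimal sketch. It assumes the layout shown in the question and at least one <I .../> entry per block; the sample data is made up and shortened, and the names blocks, xyz and spectra are hypothetical. It is not the only way to write the patterns.

```matlab
% Hypothetical, shortened sample of the structure shown in the question,
% written to a temporary file so that fileread has something to read.
sample = {
    '<SpectrumData>'
    '<X>50000</X>'
    '<Y>54000</Y>'
    '<Z>75000</Z>'
    '</ActualInfo>'
    '<I WL="235" R="0.101973" />'
    '<I WL="236" R="0.098217" />'
    '</SpectrumData>'
    '<SpectrumData>'
    '<X>100</X>'
    '<Y>120</Y>'
    '<Z>250</Z>'
    '</ActualInfo>'
    '<I WL="235" R="0.201973" />'
    '</SpectrumData>'};
fname = [tempname '.xml'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', sample{:});
fclose(fid);

% Read the whole file into one character vector.
txt = fileread(fname);
delete(fname);

% Split into sections; the lazy .*? stops at the first closing tag.
blocks = regexp(txt, '<SpectrumData>(.*?)</SpectrumData>', 'tokens');
nBlocks = numel(blocks);
xyz = zeros(nBlocks, 3);       % one [X Y Z] row per measurement point
spectra = cell(nBlocks, 1);    % one N-by-2 [WL R] matrix per point
for k = 1:nBlocks
    body = blocks{k}{1};
    % X, Y, Z appear once per block, in this order.
    t = regexp(body, '<X>(.*?)</X>\s*<Y>(.*?)</Y>\s*<Z>(.*?)</Z>', ...
        'tokens', 'once');
    xyz(k, :) = str2double(t);
    % Every <I WL="..." R="..." /> element yields one {WL, R} token pair.
    p = regexp(body, '<I\s+WL="([^"]*)"\s+R="([^"]*)"', 'tokens');
    spectra{k} = str2double(vertcat(p{:}));
end
```

Because each block is delimited by its own closing tag, the number of WL entries can differ from block to block and from file to file. str2double also tolerates stray spaces inside the quotes, such as WL="968 ".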