How can I determine the number of Headerlines for varying, non-rectangular text files so that I can parse it with textscan?

Question

Shawn on 30 Apr 2014

https://www.mathworks.com/matlabcentral/answers/127779-how-can-i-determine-the-number-of-headerlines-for-varying-non-rectangular-text-files-so-that-i-can

Edited: per isakson on 1 May 2014

I would like to use textscan to read in the tabular integer and floating-point data, keying off of the *NODE line. This line can appear anywhere in the file, and other non-comment strings and integer lines may precede it. How can I find the varying number of header lines for any given input file?

My code and an example input file follow. Thanks!

    fid4 = fopen('E:\scratch\ANSYS_macro\MATLAB dyna beams\sample.k');
    g = textscan(fid4,'%d %f %f %f','Delimiter','\n','headerlines',15);
    celldisp(g);
    fclose(fid4);

*KEYWORD
*TITLE
*DATABASE_FORMAT
$   1IFORM  2IBINARY
         0
$
$
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$                               NODE DEFINITIONS                               $
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
$
*NODE
$   1NID              2X              3Y              4Z     5TC     6RC
       1        0.141746         0.55315     -0.00592088
       2        0.141746        0.538028     -0.00592088
       3        0.126928         0.55315     -0.00669746
       4        0.126926        0.538027     -0.00669757
       5        0.112141         0.55315     -0.00747244
       6        0.112138        0.538025     -0.00747256
       7       0.0973459         0.55315      -0.0082478
       8       0.0973435        0.538024     -0.00824792
       9       0.0825538         0.55315     -0.00902302
      10       0.0825514        0.538022     -0.00902315
      11       0.0677682         0.55315      -0.0097979
$

Answer 1

per isakson on 30 Apr 2014

https://www.mathworks.com/matlabcentral/answers/127779-how-can-i-determine-the-number-of-headerlines-for-varying-non-rectangular-text-files-so-that-i-can#answer_135208

Edited: per isakson on 1 May 2014

I'm not sure whether your file contains one or more blocks of numerical data. Here is a function that handles both cases.

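    function    g = ccsm()
        % Read the whole file into one character vector.
        str = fileread( 'cssm.txt' );
        % Capture the text after each *NODE up to the next *KEYWORD or the end of the text.
        cac = regexp(str,'(?<=\*NODE\s+).+?(?=((\*KEYWORD)|($)))','match');
        g   = cell( 1, length( cac ) );
        for jj = 1 : length( cac )
            % Text following a $ is ignored as a comment.
            g{jj} = textscan(cac{jj},'%d%f%f%f', 'CommentStyle','$');
        end
    end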

returns a cell array g, where

    >> g{3}
    ans = 
        [11x1 int32]    [11x1 double]    [11x1 double]    [11x1 double]

and where cssm.txt contains three copies of the text you included in your question.
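Each g{jj} is itself a 1-by-4 cell array (one cell per column of the format string), so a little unpacking is usually needed before the values can be used. The following is only a sketch of that step: the two-node text is a shortened, hypothetical excerpt of the sample file, and the variable names rows, txt, blk, nid and xyz are chosen here for illustration.

```matlab
% Hypothetical two-node excerpt of one *NODE block (not the full sample file).
rows = { ...
    '$   1NID              2X              3Y              4Z', ...
    '       1        0.141746         0.55315     -0.00592088', ...
    '       2        0.141746        0.538028     -0.00592088', ...
    '$'};
txt = strjoin(rows, char(10));      % join the lines with newline characters

% Same call as in the function above, applied to a character vector.
blk = textscan(txt, '%d%f%f%f', 'CommentStyle', '$');

nid = blk{1};          % int32 column vector of node IDs
xyz = [blk{2:4}];      % N-by-3 double matrix holding X, Y and Z
```

Here nid and xyz have one row per node, which is often more convenient than the nested cell array for plotting or indexing.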

Comments:

The entire text file must fit in memory to use this approach.
It is not possible to read and parse the file in one step with textscan.
It is safer to rely on a definition of the file format than to guess from one sample.
regexp is powerful and fast, but ... . The expression I used assumes that each block of numerical data is enclosed by "*NODE" and "*KEYWORD", or by "*NODE" and the end of the file.
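If the file is too large for fileread, one possible alternative is to count the header lines in a first pass with fgetl and then hand that count to textscan. The sketch below is not part of the approach above; it assumes that only the first *NODE block is wanted and that the keyword sits alone on its line. The temporary file and the names fname, nHeader and found are hypothetical and exist only to make the example self-contained.

```matlab
% Write a small hypothetical test file so the sketch can run on its own.
fname = [tempname '.k'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '*KEYWORD', '*TITLE', '$', '*NODE', ...
    '$   1NID   2X   3Y   4Z', ...
    '   1   0.141746   0.55315    -0.00592088', ...
    '   2   0.141746   0.538028   -0.00592088', ...
    '$');
fclose(fid);

% First pass: count the lines up to and including the *NODE line.
fid = fopen(fname, 'r');
nHeader = 0;
found = false;
while true
    tline = fgetl(fid);
    if ~ischar(tline), break; end          % fgetl returns -1 at end of file
    nHeader = nHeader + 1;
    if strcmp(strtrim(tline), '*NODE')     % exact match on the keyword line
        found = true;
        break;
    end
end

% Second pass: skip the counted lines; $ lines are treated as comments.
if found
    frewind(fid);
    g = textscan(fid, '%d %f %f %f', ...
        'HeaderLines', nHeader, 'CommentStyle', '$');
end
fclose(fid);
delete(fname);                             % remove the temporary file
```

Since the file position is already just past the *NODE line when the loop exits, calling textscan directly without frewind and 'HeaderLines' would also work; the count is kept here because it is what the question asks for.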

Cedric on 1 May 2014

Edited: Cedric on 1 May 2014

Ah, a regexp challenge, I take it! ;-)

I'd propose the following:

blocks = regexp( content, '\*NODE(.*?\n){2}([\s\d\-\.]+)', 'tokens' ) ;

if the block doesn't always end with a $ character, and

blocks = regexp( content, '\*NODE(.*?\n){2}([^$]+)', 'tokens' ) ;

if it does. Then blocks{1} (the only cell if there is just one block) is a cell array whose first cell contains the header and whose second cell contains the data.
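The data token is still text, so one more step is needed to get numbers. As a rough sketch, assuming every data row has exactly four numeric fields, sscanf can convert it; the shortened content string and the names rows and data below are hypothetical.

```matlab
% Hypothetical shortened file content with a single *NODE block.
rows = { '*KEYWORD', '*NODE', ...
    '$   1NID   2X   3Y   4Z', ...
    '   1   0.141746   0.55315    -0.00592088', ...
    '   2   0.141746   0.538028   -0.00592088', ...
    '$' };
content = strjoin(rows, char(10));

% Second pattern from above: the block is assumed to end with a $ character.
blocks = regexp(content, '\*NODE(.*?\n){2}([^$]+)', 'tokens');

% sscanf fills a 4-by-N matrix column by column; transpose to get N-by-4.
data = sscanf(blocks{1}{2}, '%f', [4, Inf]).';
```

Note that sscanf returns everything as double, so the node IDs in the first column are no longer int32 as they are with the textscan format '%d%f%f%f'.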
