Replace values in matrix (text to value)

Muneer on 21 Oct 2013
Commented: Cedric on 22 Oct 2013
Hi,
I have a matrix with 44 columns with the following headings:
id trip time objID status1 status2 status3 ... status40
I used textscan to read the data into one block. The problem is that the "status" columns have "NULL" as their first few values, and the number of NULL rows differs per column, i.e. status1 has NULL for its first 5 rows, status2 for its first 3 rows, status3 for its first 10 rows, etc.
Textscan has the "headerlines" argument, which skips a specified number of rows at the beginning of the file, but is there anything I can do to remove a varying number of rows? To ensure all the columns have the same number of values, I was going to replace all of the NULL values with 0. Any idea how to do that?
Thanks for the help.
  4 Comments
Cedric on 21 Oct 2013
Edited: Cedric on 21 Oct 2013
The data file. Or a sample with 20-50 lines if it's too large.
Muneer on 21 Oct 2013
Here's an image of the data to give you a better idea. Unfortunately, I can't give you a file because of security issues, but hopefully this helps.
I appreciate the help.


Accepted Answer

Cedric on 21 Oct 2013
Edited: Cedric on 21 Oct 2013
Here is a small example using the following file content:
vehicle trip_time A B C
C23432 1234556 NULL NULL NULL
C23432 1234557 1 NULL NULL
C23432 1234558 2 NULL 100
C23432 1234559 3 10 200
C23432 1234560 4 20 300
The question that remains is what you want/need NULL entries to be converted to.
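% Read vehicle as text, trip_time as a number, and A/B/C as strings (they may hold NULL).
fid = fopen( 'myFile.txt', 'r' ) ;
content = textscan( fid, '%s %f %s %s %s', 'headerlines', 1 ) ;
fclose( fid ) ;
% Convert the string columns to numbers; str2double returns NaN for 'NULL'.
for c = 3 : numel( content )
    content{c} = str2double( content{c} ) ;
end
vehicle = content{1} ;       % cell array of strings
data = [content{2:end}] ;    % numeric array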
With that you get
>> vehicle
vehicle =
'C23432'
'C23432'
'C23432'
'C23432'
'C23432'
which is a cell array of strings, and a numeric array for the rest of the data, with NaN entries for NULL values (which seems appropriate):
>> data
data =
1234556 NaN NaN NaN
1234557 1 NaN NaN
1234558 2 NaN 100
1234559 3 10 200
1234560 4 20 300
I'd personally work with that, but if you wanted zeros instead of NaNs, you could proceed as follows:
>> data(isnan(data)) = 0
data =
1234556 0 0 0
1234557 1 0 0
1234558 2 0 100
1234559 3 10 200
1234560 4 20 300
Zeros are handier for e.g. plotting, but you lose the information about NULL entries, because you can no longer tell which entries were initially NULL and which were true zeros.
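If you want zeros but still need to know where the NULLs were, one option is to save a logical mask before overwriting. This is only a sketch: the matrix is typed in by hand so that it runs on its own, and the names wasNull, dataZero and nNullPerCol are made up for the illustration.

```matlab
% Same numeric block as in the example above (NaN where the file had NULL).
data = [1234556 NaN NaN NaN
        1234557   1 NaN NaN
        1234558   2 NaN 100
        1234559   3  10 200
        1234560   4  20 300] ;

wasNull  = isnan( data ) ;          % true where the entry was NULL
dataZero = data ;
dataZero(wasNull) = 0 ;             % zero-filled copy, e.g. for plotting

nNullPerCol = sum( wasNull, 1 ) ;   % number of NULL entries per column
```

The mask has the same size as data, so wasNull(:,k) tells you which rows of column k were NULL even after the zeros are in place.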
Let me know if it's not exactly what you wanted to achieve, and we can refine the answer.
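As a possible alternative to the str2double loop, textscan has a 'TreatAsEmpty' option that makes it read a given string in numeric fields as an empty value, which ends up as NaN by default. The sketch below assumes the same sample content as above and writes it to a temporary file so that it is self-contained; the format string would have to be adapted to the real 44-column file.

```matlab
% Write the sample content to a temporary file (for illustration only).
fname = [tempname '.txt'] ;
fid = fopen( fname, 'w' ) ;
fprintf( fid, 'vehicle trip_time A B C\n' ) ;
fprintf( fid, 'C23432 1234556 NULL NULL NULL\n' ) ;
fprintf( fid, 'C23432 1234557 1 NULL NULL\n' ) ;
fprintf( fid, 'C23432 1234558 2 NULL 100\n' ) ;
fprintf( fid, 'C23432 1234559 3 10 200\n' ) ;
fprintf( fid, 'C23432 1234560 4 20 300\n' ) ;
fclose( fid ) ;

% Read all value columns as numbers; NULL is treated as empty, i.e. NaN.
fid = fopen( fname, 'r' ) ;
content = textscan( fid, '%s %f %f %f %f', 'HeaderLines', 1, ...
                    'TreatAsEmpty', 'NULL' ) ;
fclose( fid ) ;
delete( fname ) ;

vehicle = content{1} ;       % cell array of strings
data = [content{2:end}] ;    % numeric array with NaN for NULL
```

This should give the same vehicle and data as the loop version, without the intermediate string columns.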
  10 Comments
Muneer on 22 Oct 2013
Maybe the for loop is doing the headerlines operation multiple times and deleting the top row each time?
Thanks for your help
Cedric on 22 Oct 2013
I just replied to your email.

