Split information into two columns

12 views (last 30 days)
Marta Ramos
Marta Ramos on 28 Jun 2018
Edited: Jan on 28 Jun 2018
Hello, I have an excel file and it has 2 columns and a lot of lines. The first column has numbers and the second column has words like this "House_red" or "House_x_red" or "House". I used this to extract the second column
[~,houses] = xlsread('houses.xlsx', 'B:B');
And right now I have all my information separated numbers from strings, but in the strings, I want to have a column with House and the other with Red or x_red, where house will be in the first column and red or x_red in the second column. Is there a way to split a string into two columns knowing that one line can be "House_Red" and another line is "House_X_Red"?
  2 Comments
Jan
Jan on 28 Jun 2018
I don't get what you want. Can you post a small example of the inputs and the wanted output?
Marta Ramos
Marta Ramos on 28 Jun 2018

yes, I am sorry.

Input:
House_Red
House_Yellow
House
House_X_Red
House_X_Purple
 Ouput: -> 2 columns
1stColumn       2ndColumn
House            Red
House            Yellow
House            undefined
House            X_Red
House            X_Purple

I hope i made it understandable. Thank you

Sign in to comment.

Accepted Answer

Star Strider
Star Strider on 28 Jun 2018
Try this:
strs = {"House_red"; "House_x_red"; "House"; "Apartment"}; % Data
chrs = cellfun(@char, strs, 'Uni',0); % Convert ‘string’ To ‘char’
L1 = cellfun(@isempty, strfind(chrs, '_')); % Find Elements Without Underscores ‘_’
chrs(L1) = cellfun(@(x)sprintf('%s_', x), chrs(L1), 'Uni',0); % Add Terminating Underscores To Them
sep = regexp(chrs, '_', 'split', 'once'); % Split On First Underscore
H = cellfun(@(x)x(:,1), sep);
C = cellfun(@(x)x(:,2), sep);
HC = [H, C]
HC =
4×2 cell array
{'House' } {'red' }
{'House' } {'x_red' }
{'House' } {0×0 char}
{'Apartment'} {0×0 char}
You indicated that your original data was a string array, so I included a step that converts it to a char array. The code resolves the problem of "House" not having an underscore (_) by adding one before splitting the char array ‘chrs’ with by the first underscore. This is the easiest way I can devise to create an appropriate output for ‘sep’, so that each row vector is a (1x2) cell array. After that, the rest is straightforward.
  3 Comments
Star Strider
Star Strider on 28 Jun 2018
As always, my pleasure!
Your Questions have been educational for me.
Jan
Jan on 28 Jun 2018
Edited: Jan on 28 Jun 2018
Some simplifications:
% chrs = cellfun(@char, strs, 'Uni',0)
chrs = cellstr(strs)
% L1 = cellfun(@isempty, strfind(chrs, '_'))
L1 = ~contains(chrs, '_')
% chrs(L1) = cellfun(@(x)sprintf('%s_', x), chrs(L1), 'Uni',0)
chrs(L1) = strcat(chrs(L1), '_')

Sign in to comment.

More Answers (1)

Jan
Jan on 28 Jun 2018
Edited: Jan on 28 Jun 2018
Str = {'House_Red'; ...
'House_Yellow'; ...
'House'; ...
'House_X_Red'; ...
'House_X_Purple'};
[C1, C2] = strtok(Str, '_'); % Split strings at 1st _
C2 = cellfun(@(x) x(2:end), C2, 'UniformOutput', 0); % Remove leading _
C2(cellfun('isempty', C2)) = {'undefined'}; % If wanted
Result = [C1, C2] % Join columns

Categories

Find more on Characters and Strings in Help Center and File Exchange

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!