This example shows how to convert a variable in a table from a cell array of character vectors to a categorical array. The same workflow applies for table variables that are string arrays.
Load sample data gathered from 100 patients.
load patients
whos
  Name                            Size    Bytes  Class      Attributes

  Age                           100x1       800  double
  Diastolic                     100x1       800  double
  Gender                        100x1     11412  cell
  Height                        100x1       800  double
  LastName                      100x1     11616  cell
  Location                      100x1     14208  cell
  SelfAssessedHealthStatus      100x1     11540  cell
  Smoker                        100x1       100  logical
  Systolic                      100x1       800  double
  Weight                        100x1       800  double
Store the patient data from Age, Gender, Height, Weight, SelfAssessedHealthStatus, and Location in a table. Use the unique identifiers in the variable LastName as row names.
T = table(Age,Gender,Height,Weight,...
    SelfAssessedHealthStatus,Location,...
    'RowNames',LastName);
The cell arrays of character vectors, Gender and Location, contain discrete sets of unique values.
Convert Gender and Location to categorical arrays.
T.Gender = categorical(T.Gender);
T.Location = categorical(T.Location);
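The same conversion can be sketched for a table variable that is a string array. The table S and its variables Site and Count below are hypothetical toy data, not part of the patients data set; the sketch only illustrates that categorical accepts a string array and that categories lists the resulting category names.

```matlab
% Hypothetical toy data (assumption, not from the patients data set)
Site = ["North"; "South"; "North"; "East"];
Count = [12; 7; 9; 4];
S = table(Site,Count);

% Convert the string-array variable to a categorical array
S.Site = categorical(S.Site);

% List the category names; by default they are the sorted unique values
siteCats = categories(S.Site);
```

Calling categories on a converted variable is a quick way to confirm that the conversion produced the set of values you expected.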
The variable, SelfAssessedHealthStatus, contains four unique values: Excellent, Fair, Good, and Poor.
Convert SelfAssessedHealthStatus to an ordinal categorical array, such that the categories have the mathematical ordering Poor < Fair < Good < Excellent.
T.SelfAssessedHealthStatus = categorical(T.SelfAssessedHealthStatus,...
    {'Poor','Fair','Good','Excellent'},'Ordinal',true);
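One way to confirm that the conversion produced an ordinal array with the intended category order is sketched below. It works on the workspace variable directly rather than on the table, and the names status and statusCats are illustrative assumptions.

```matlab
load patients   % sample data used throughout this example

% Same conversion as above, applied to the workspace variable
status = categorical(SelfAssessedHealthStatus,...
    {'Poor','Fair','Good','Excellent'},'Ordinal',true);

% isordinal returns true for an ordinal categorical array
tf = isordinal(status);

% categories returns the names in their defined order, lowest first
statusCats = categories(status);
```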
Use summary to view the data type, description, units, and descriptive statistics for each table variable.
format compact
summary(T)
Variables:
    Age: 100x1 double
        Values:
            Min       25
            Median    39
            Max       50
    Gender: 100x1 categorical
        Values:
            Female    53
            Male      47
    Height: 100x1 double
        Values:
            Min       60
            Median    67
            Max       72
    Weight: 100x1 double
        Values:
            Min       111
            Median    142.5
            Max       202
    SelfAssessedHealthStatus: 100x1 ordinal categorical
        Values:
            Poor         11
            Fair         15
            Good         40
            Excellent    34
    Location: 100x1 categorical
        Values:
            County General Hospital      39
            St. Mary's Medical Center    24
            VA Hospital                  37
The table variables Gender, SelfAssessedHealthStatus, and Location are categorical arrays. For each of them, the summary reports the number of elements in each category. For example, the summary indicates that 53 of the 100 patients are female and 47 are male.
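If you need the category counts as data rather than as displayed text, a minimal sketch using categories and countcats follows. It operates on the workspace variable Gender, and the names g, names, and counts are illustrative assumptions.

```matlab
load patients   % sample data used throughout this example

g = categorical(Gender);

names = categories(g);    % category names, in category order
counts = countcats(g);    % number of elements in each category
```

The elements of counts line up with the rows of names, so the pair can be used for further computation or plotting.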
Create a subtable, T1, containing the age, height, and weight of all female patients who were observed at County General Hospital. You can easily create a logical vector based on the values in the categorical arrays Gender and Location.
rows = T.Location=='County General Hospital' & T.Gender=='Female';
rows is a 100-by-1 logical vector with logical true (1) for the table rows where the gender is female and the location is County General Hospital.
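Before indexing with such a logical vector, it can be useful to check how many rows it selects. The sketch below rebuilds the same condition from the workspace variables; the names numMatches and fracMatches are illustrative assumptions.

```matlab
load patients   % sample data used throughout this example

Gender = categorical(Gender);
Location = categorical(Location);

% Same condition as above, built from the workspace variables
rows = Location=='County General Hospital' & Gender=='Female';

numMatches = nnz(rows);      % number of true elements
fracMatches = mean(rows);    % fraction of patients selected
```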
Define the subset of variables.
vars = {'Age','Height','Weight'};
Use parentheses to create the subtable, T1.
T1 = T(rows,vars)
T1=19×3 table
                  Age    Height    Weight
                  ___    ______    ______
    Brown         49       64       119
    Taylor        31       66       132
    Anderson      45       68       128
    Lee           44       66       146
    Walker        28       65       123
    Young         25       63       114
    Campbell      37       65       135
    Evans         39       62       121
    Morris        43       64       135
    Rivera        29       63       130
    Richardson    30       67       141
    Cox           28       66       111
    Torres        45       70       137
    Peterson      32       60       136
    Ramirez       48       64       137
    Bennett       35       64       131
      ⋮
T1 is a 19-by-3 table.
Since ordinal categorical arrays have a mathematical ordering for their categories, you can perform element-wise comparisons of them with relational operations, such as greater than and less than.
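As a minimal sketch of such comparisons, the following uses a small hypothetical array h (toy data, not the patients data set) with the same category ordering as above. The variable names are illustrative assumptions.

```matlab
% Hypothetical toy data with the ordering Poor < Fair < Good < Excellent
h = categorical({'Good';'Poor';'Excellent';'Fair'},...
    {'Poor','Fair','Good','Excellent'},'Ordinal',true);

% Element-wise comparison against a category name
isGoodOrBetter = h >= 'Good';

% Ordinal arrays also support min and max
best = max(h);
worst = min(h);
```

Comparisons like these raise an error for categorical arrays that are not ordinal, because their categories have no defined order.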
Create a subtable, T2, of the gender, age, height, and weight of all patients who assessed their health status as poor or fair.
First, define the subset of rows to include in table T2.
rows = T.SelfAssessedHealthStatus<='Fair';
Then, define the subset of variables to include in table T2.
vars = {'Gender','Age','Height','Weight'};
Use parentheses to create the subtable T2.
T2 = T(rows,vars)
T2=26×4 table
                 Gender    Age    Height    Weight
                 ______    ___    ______    ______
    Johnson      Male      43       69       163
    Jones        Female    40       67       133
    Thomas       Female    42       66       137
    Jackson      Male      25       71       174
    Garcia       Female    27       69       131
    Rodriguez    Female    39       64       117
    Lewis        Female    41       62       137
    Lee          Female    44       66       146
    Hall         Male      25       70       189
    Hernandez    Male      36       68       166
    Lopez        Female    40       66       137
    Gonzalez     Female    35       66       118
    Mitchell     Male      39       71       164
    Campbell     Female    37       65       135
    Parker       Male      30       68       182
    Stewart      Male      49       68       170
      ⋮
T2 is a 26-by-4 table.