Note
The dataset data type is not recommended. To
work with heterogeneous data, use the MATLAB®
table data type instead. See MATLAB
table documentation for more information.
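As a rough illustration of the recommendation in the note, the sketch below converts a small dataset array to a table with dataset2table and repeats the same kinds of indexing on the result. The data, variable names, and observation names are hypothetical and chosen only for illustration.

```matlab
% Hypothetical toy data; the names Height, Weight, and Obs1..Obs3 are illustrative only.
ds = dataset({[65; 71; 68],'Height'}, {[130; 180; 155],'Weight'}, ...
    'ObsNames', {'Obs1','Obs2','Obs3'});

% Convert to a table; observation names become row names.
T = dataset2table(ds);

firstTwo = T(1:2,:);      % parentheses return a table
heights  = T.Height;      % dot notation returns the underlying variable
obs1     = T('Obs1',:);   % row names index rows like observation names do
```

The indexing expressions are the same as for dataset arrays, which should make moving existing code to tables mostly mechanical.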
There are many ways to index into dataset arrays. For example,
for a dataset array, ds, you can:
Use () to create a new dataset array from a subset of ds.
For example, ds1 = ds(1:5,:) creates a new dataset array, ds1,
consisting of the first five rows of ds. Metadata,
including variable and observation names, transfers to the new dataset
array.
Use variable names with dot notation to index individual
variables in a dataset array. For example, ds.Height indexes
the variable named Height.
Use observation names to index individual observations
in a dataset array. For example, ds('Obs1',:) gives
data for the observation named Obs1.
Use observation or variable numbers. For example, ds(:,[1,3,5]) gives
the data in the first, third, and fifth variables (columns) of ds.
Use logical indexing to search for observations in ds that
satisfy a logical condition. For example, ds(ds.Gender=='Male',:) gives
the observations in ds where the variable named Gender,
a nominal array, has the value Male.
Use ismissing to find missing data
in the dataset array.
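The ismissing approach can be sketched as follows. This is a minimal example on made-up data, not part of the hospital example; the variable and observation names are assumptions for illustration.

```matlab
% Hypothetical toy data with one missing value (NaN) in each variable.
ds = dataset({[65; NaN; 70],'Height'}, {[130; 150; NaN],'Weight'}, ...
    'ObsNames', {'Obs1','Obs2','Obs3'});

% ismissing returns a logical array the same size as ds,
% with true wherever an element is missing.
tf = ismissing(ds);

% Keep only the observations (rows) that have no missing values.
complete = ds(~any(tf,2),:);
```

Here only Obs1 has no missing values, so complete contains that single observation.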
This example shows several indexing and searching methods for dataset arrays.
Load the sample data.
load hospital;
size(hospital)
ans = 1×2
   100     7
The dataset array has 100 observations and 7 variables.
Index a variable by name. Return the minimum age in the dataset array.
min(hospital.Age)
ans = 25
Delete the variable Trials.
hospital.Trials = []; size(hospital)
ans = 1×2
100 6
Index an observation by name. Display measurements on the first five variables for the observation named PUE-347.
hospital('PUE-347',1:5)
ans =
               LastName     Sex       Age    Weight    Smoker
    PUE-347    {'YOUNG'}    Female    25     114       false
Index variables by number. Create a new dataset array containing the first four variables of hospital.
dsNew = hospital(:,1:4); dsNew.Properties.VarNames(:)
ans = 4x1 cell
{'LastName'}
{'Sex' }
{'Age' }
{'Weight' }
Index observations by number. Delete the last 10 observations.
hospital(end-9:end,:) = []; size(hospital)
ans = 1×2
90 6
Search for observations by logical condition. Create a new dataset array containing only females who smoke.
dsFS = hospital(hospital.Sex=='Female' & hospital.Smoker==true,:); dsFS(:,{'LastName','Sex','Smoker'})
ans =
               LastName         Sex       Smoker
    LPD-746    {'MILLER'   }    Female    true
    XBR-291    {'GARCIA'   }    Female    true
    AAX-056    {'LEE'      }    Female    true
    DTT-578    {'WALKER'   }    Female    true
    AFK-336    {'WRIGHT'   }    Female    true
    RBA-579    {'SANCHEZ'  }    Female    true
    HAK-381    {'MORRIS'   }    Female    true
    NSK-403    {'RAMIREZ'  }    Female    true
    ILS-109    {'WATSON'   }    Female    true
    JDR-456    {'SANDERS'  }    Female    true
    HWZ-321    {'PATTERSON'}    Female    true
    GGU-691    {'HUGHES'   }    Female    true
    WUS-105    {'FLORES'   }    Female    true
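The same logical condition can also be stored as a mask and reused, for example to count the matching observations or to index a single variable. The following is a sketch only; the names isFemaleSmoker, nFS, names, and meanAge are illustrative and not part of the original example.

```matlab
load hospital;
hospital(end-9:end,:) = [];   % same deletion as in the step above

% Build the logical mask once. Smoker is already logical, so ==true is optional.
isFemaleSmoker = hospital.Sex=='Female' & hospital.Smoker;

nFS     = sum(isFemaleSmoker);                  % number of matching observations
names   = hospital.LastName(isFemaleSmoker);    % last names of those observations
meanAge = mean(hospital.Age(isFemaleSmoker));   % mean age within the subset
```

With the data as modified above, nFS should equal the number of rows in dsFS (13 in the listing).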