
Solved: statistics - MATLAB: Taking a sample with the same number of values from each class




I have a full dataset of, let's say, 50000 observations, which are assigned to 16 classes. I now want to draw a sample of, let's say, 70% of the full data, but I want MATLAB to take the same number of samples from each class (where possible, of course, because some classes have fewer observations than needed).

Is there a MATLAB function that can do this, or do I have to program a new one for myself? I'm just trying to save time here.

I found cvpartition, but as far as I know this can be used only to take a sample with the same distribution over the classes as the original dataset and not a uniformly distributed sample.

Thank you for your help!

matlab statistics distribution sampling
edited Mar 27 '13 at 10:22, asked Mar 27 '13 at 10:16 by mischa.mole 122 6

For the small groups you may want to sample each value more than once. At least this will get you an equal amount of observations per group. – Dennis Jaheruddin Mar 27 '13 at 10:58

possible duplicate of Load all the images from a directory – Iswanto San Mar 28 '13 at 0:07


1 Answer


It shouldn't be too hard. Let's say that the observations are in a vector observations. Then you can do

fraction = 0.7;

classes = unique(observations);
nObs = length(observations);
nClasses = length(classes);
nSamples = round(nObs * fraction / nClasses);

for ii = 1:nClasses
    idx = observations == classes(ii);   % logical mask selecting class ii
    samples((ii-1)*nSamples+1:ii*nSamples) = randsample(observations(idx), nSamples);
end

Now samples is a vector of length nClasses * nSamples that contains your sampled observations, with an equal number from each class.

At the moment it will fail if one of the classes doesn't contain at least nSamples observations. The simplest fix is to pass true as the third argument to randsample, i.e. randsample(observations(idx), nSamples, true), which tells it to sample with replacement, so the same observation can be picked more than once.
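If repeated observations are not wanted, another option is to cap the draw at the size of each class and to sample indices rather than values, so that the result can be used to index the full dataset. The following is only a minimal sketch under stated assumptions: labels is a hypothetical vector holding one class label per observation, the example data is made up, and randperm from base MATLAB is swapped in for randsample so that no toolbox is required.

```matlab
% Hypothetical example data: one class label per observation.
rng(0);                                   % fixed seed so the sketch is repeatable
labels   = [ones(1, 50), 2*ones(1, 30), 3*ones(1, 5)];   % class 3 is small
fraction = 0.7;

classes  = unique(labels);
nClasses = numel(classes);
nSamples = round(numel(labels) * fraction / nClasses);   % target per class

sampleIdx = cell(1, nClasses);
for ii = 1:nClasses
    members = find(labels == classes(ii));       % indices belonging to class ii
    nTake   = min(nSamples, numel(members));     % cap at the class size
    pick    = randperm(numel(members), nTake);   % draw without replacement
    sampleIdx{ii} = members(pick);
end
sampleIdx     = [sampleIdx{:}];    % indices into the original data
sampledLabels = labels(sampleIdx); % class label of each sampled observation
```

With a data matrix that has one row per observation (say, a hypothetical data), data(sampleIdx, :) would then select the sampled rows. Classes smaller than nSamples contribute all of their observations, so the total can fall short of the requested fraction.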

answered Mar 27 '13 at 10:52 by Chris Taylor 33.9k 8 81 136

Thank you, saves me some thinking time :-) I just thought maybe there is a builtin Matlab function that can do that....BR, Mischa – mischa.mole Mar 27 '13 at 10:58








