Solved: aggregation - MATLAB: Aggregate and compare measurements with different sampling intervals
I have measured a variable x in equidistant long intervals (every 10 min) and a variable y in non-equidistant short intervals (somewhere between every 30 s and 90 s). Timestamps (datenum) for both x and y are available, but they are never equal, so intersect doesn't work. How can I aggregate y (e.g. mean(y(...)) in the interval x(i+1) - x(i)) so I can compare the two (e.g. plot them against each other or plot them with the same time vector)?
/edit 1: Confused x and y in my last but one sentence.
/edit 2: I feel like I didn't give you enough information in the original question, sorry for that. Many of you suggest interpolation. x is an average wind speed over a period of 10 minutes, not a distinct measurement. So if I say time = 07:10 and x = 3 m/s, that means mean(x) = 3 m/s for the period from 07:00 to 07:10. This is why I think it's probably not the best idea to interpolate it. y is one of many (very noisy) other variables and I want to find out the influence of (mean) x on y. So I would either like to assign many values of y to one measurement of x (in that 10 minute period), or assign a mean(y) to that one measurement of x. I assume that the solutions are quite similar, code wise.
asked Mar 13 '13 at 16:25, edited Mar 13 '13 at 17:02, by Fred S
How about interpolating one of the results to another's time? (if the signal is not too noisy) – Dedek Mraz Mar 13 '13 at 16:30
I thought about that. Sadly, they are quite noisy by nature and I fear that interpolation would be somewhat critical in the context of my research (for reasons not specified here). – Fred S Mar 13 '13 at 16:34
Don't you mean aggregating y (as it's the longer array)? – Eitan T Mar 13 '13 at 16:49
I empathize with you on the real-world signals. Been there. The only option for actual comparison (not plotting) is interpolation. Linear is the simplest, but Matlab has many. I mostly used spline. Depending on your signal properties, you could first filter (lowpass, sliding average) the noisy data and then interpolate. You could also show us the data plot. – Dedek Mraz Mar 13 '13 at 16:55
EitanT: Yes, I edited my submission. Sorry for the confusion. Dedek Mraz: Thank you for your suggestions. Please see edit 2. – Fred S Mar 13 '13 at 16:57
Reading the edited (2x) question:
You are trying to estimate the value of x at some point that you don't have the measurement for. You have measurements before and after. The only thing you can do is to interpolate. What method you choose is somewhat harder to decide.
Your options are:
- piecewise constant interpolation (what I think you are suggesting)
- linear interpolation
/edit: If you just want to get an average value of y between two x measurements, I suggest the following:
new_y = zeros(size(x));
new_y(1) = mean(y(ty <= tx(1)));   % samples up to the first timestamp of x
for ii = 2:length(x)
    % samples in the interval (tx(ii-1), tx(ii)]
    new_y(ii) = mean(y(ty > tx(ii-1) & ty <= tx(ii)));
end
Maybe an even better solution would be using hist:
n = hist(ty,tx)
n contains the number of values of ty that are closest to each value in tx. Since both are monotonic, n tells you how to group values in y. Then you can use mat2cell to put y into a cell array where each cell corresponds to one measurement of x. The second parameter n now specifies how many values to put in each cell.
new_y = mat2cell(y,n)
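One possible way to work with the cell array returned by mat2cell is to reduce each cell to its mean with cellfun. The sketch below uses made-up data (time in minutes instead of datenum, and y equal to its own timestamp) purely so the grouping is easy to check by hand; none of these values come from the original data set.

```matlab
% Synthetic data, assumed for illustration only.
tx = [10; 20; 30];           % timestamps of x (minutes)
ty = (0.5:1:29.5)';          % timestamps of y, one sample per minute
y  = ty;                     % y equals its timestamp, so means are easy to verify

n      = hist(ty, tx);               % counts of ty nearest to each centre in tx
new_y  = mat2cell(y, n(:));          % one cell per measurement of x
mean_y = cellfun(@mean, new_y);      % per-cell mean, one value per entry of tx
```

Note that hist treats tx as bin centres, so each cell collects the samples nearest to a timestamp (here up to 5 minutes on either side), not the samples in the 10 minutes before it. A cell with no samples yields NaN.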
answered Mar 13 '13 at 17:14, edited Mar 13 '13 at 18:25, by Dedek Mraz
I'm not trying to estimate / interpolate x. Actually, I don't want to touch x at all. Sorry for not being clear enough, English isn't my first language. What I want to do is aggregate (by averaging) y and assign them to one value / timestamp of x, n being the number of measurements for y I have in the interval x was averaged over (i.e., in a 10 minute interval before the timestamp of x). And no, sadly I don't have the original measurements. I think they aren't even saved at all. – Fred S Mar 13 '13 at 17:19
I'm sorry for forcing interpolation on you but I think this is the only option if you want to do it "right". And I believe the influence of y can best be seen this way. I'll update the answer. – Dedek Mraz Mar 13 '13 at 18:01
I guess we disagree on the "right way" then. ;) I'd rather omit than fabricate information, but I can of course see why one would prefer interpolation. Anyway, thank you very much for your suggestions. I will certainly try them out and compare them to EitanT's solution and come back with some results in case anyone is interested. – Fred S Mar 13 '13 at 20:25
Follow up, as promised: Both of your solutions after the edit seem to do exactly what I want and are somewhat more flexible with respect to variable time steps than EitanT's suggestion. I have yet to find out how to work with the output of your hist()-based solution, but I'll get around that. So, thanks a lot! – Fred S Mar 17 '13 at 14:57
To aggregate values, use
accumarray(fix(ty(:) / T) + 1, y, [], @mean)
where y is the sampled signal, ty is the timestamp array and T is the time interval of the aggregated values (for example, T = 10 / (24 * 60) ≈ 0.0069 for a 10-minute interval).
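A minimal sketch of this fixed-interval approach on made-up data follows. The timestamps are assumed to start near zero (in days) rather than being raw datenum values, and y is chosen so that the expected means are obvious.

```matlab
% Synthetic data, assumed for illustration only.
T  = 10 / (24 * 60);                 % 10 minutes expressed in days
ty = ((0.5:1:29.5)') / (24 * 60);    % sample times in days, one per minute
y  = (0.5:1:29.5)';                  % sample values (equal to the time in minutes)

idx    = fix(ty(:) / T) + 1;         % 1-based index of the 10-minute bin
mean_y = accumarray(idx, y(:), [], @mean);   % mean of y per bin
```

With raw datenum timestamps (values around 7e5 days), fix(ty / T) becomes very large and accumarray allocates an output up to that index, so it is probably worth subtracting a reference time such as the start of the first interval before dividing by T.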
answered Mar 13 '13 at 16:47, edited Mar 13 '13 at 17:27, by Eitan T
Can't really get that to work, but I think this might be what I'm looking for. Error using accumarray: Second input VAL must be a vector with one element for each row in SUBS, or a scalar. You can find sample code here: pastebin.com/KP9nhEib . Thanks so far. – Fred S Mar 13 '13 at 17:11
@FredS ty has to be a column vector, so just use ty(:). I've amended my answer. – Eitan T Mar 13 '13 at 17:28
Ah, thanks. While this does what I want for my sample code (although it creates one entry too many, but I think I'll figure that out), I have a problem with the / T part because I have a few data logger defects where sampling intervals for x are 20 minutes (didn't actually see that until now - very long data set, sorry). Any idea what to do in this case? I have uploaded a short part of my data here: dl.dropbox.com/u/9437411/sample.mat – Fred S Mar 13 '13 at 20:40
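For irregular intervals in tx (such as the 20-minute gaps mentioned in the comment above), one option is to take the bin edges from tx itself instead of dividing by a fixed T. The sketch below is not part of the original answer: it uses histc to find the bin of each sample and assumes that the first interval starts T before tx(1). Time is in minutes and all data are made up.

```matlab
% Synthetic data, assumed for illustration only.
T  = 10;                         % nominal averaging period of x (minutes)
tx = [10; 20; 40];               % timestamps of x, with one 20-minute gap
ty = (0.5:1:39.5)';              % timestamps of y, one sample per minute
y  = ty;                         % y equals its timestamp, so means are easy to verify

edges    = [tx(1) - T; tx(:)];   % assumed start of the first interval, then tx
[~, bin] = histc(ty(:), edges);  % bin(k) = i when edges(i) <= ty(k) < edges(i+1)
valid    = bin >= 1 & bin <= numel(tx);   % drop samples outside all intervals

% Mean of y per interval; intervals without samples are filled with NaN.
mean_y = accumarray(bin(valid), y(valid), [numel(tx) 1], @mean, NaN);
```

histc uses left-closed intervals, which only differs from "up to and including the timestamp of x" when a timestamp of y coincides exactly with one of x; the question states that this never happens. Newer MATLAB releases recommend discretize or histcounts in place of histc.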
You can interpolate data from x to non-equidistant timestamps (or vice versa) (see interp1 function) and compare results.
plot(Time_x, x, Time_y, y)
answered Mar 13 '13 at 16:38 by Fedyanint Tim
Here's a simple example of using the 1-d interpolation.
% Make two example functions on different x bases.
x1 = 0:.023:10;
x2 = 0:1:10;
y1 = x1.^2/10;
y2 = 10 - x2.^1.3;
% Convert both to a common x base (x1 in this case).
y2i = interp1(x2, y2, x1);
plot(x1, y1, x1, y2i)
answered Mar 13 '13 at 16:43 by Stuart
Use linear interpolation!
It's easy and fun to do yourself. The idea is: since you know the timestamps for x, the values for x, and the values for y (but the timestamps for y don't match those of x), you can use linear interpolation (or a higher order if you need) to interpolate/"update" values for y as if they occurred at the timestamps of x. After that you can plot both x and the interpolated y values against the same x timestamp vector.
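Applied to the variables in the question, that idea might look like the sketch below. The data are made up (time in minutes, y linear in time) so the interpolated values can be checked by hand.

```matlab
% Synthetic data, assumed for illustration only.
tx = [10; 20; 30];               % timestamps of x (minutes)
ty = (0.5:1:29.5)';              % timestamps of y, one sample per minute
y  = 2 * ty;                     % y is linear in time

% Linear interpolation of y at the timestamps of x.
% interp1 returns NaN for query points outside the range of ty.
y_at_tx = interp1(ty, y, tx);
```

This samples y at the instant of each timestamp of x. As the question points out, x is a 10-minute average, so an instantaneous (and noisy) value of y is not necessarily comparable to it.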
answered Mar 13 '13 at 16:45 by FredrikRedin