Yadif goes through every plane of every frame. It assumes everything is interlaced. Because of this assumption, yadif starts out by focusing on only half the lines and rewriting the alternate half of the lines to match them.
For clarity, this guide deals with top field first content.
If yadif's going top field first, it will keep the first line in every frame, and then go through the second line pixel by pixel to make it match the first, keep the third line, go through the 4th pixel by pixel, etc. As it goes through, it discards the intensity of each pixel of each of those even lines, and makes a bunch of guesses about what it should be.
The guessing process starts with a crude approximation of what the frame would look like if the odd field was a full-height progressive frame. It knows that the lines directly above and below the current pixel on an even line are "good" (from the first field) and that the ones 2 up and 2 down are "bad" (from the second field) and should be mostly ignored.
Yadif also realizes that, since interlacing alternates, the order is opposite in the fields before and after. After all, the previous moment in time is the even field in the previous frame, and the next moment in time is the even field in the current frame. In those, the pixels 2 up and 2 down are "good" and the ones directly above and below are "bad." So for the pixel 2 above in the current frame and the pixel 2 below, it uses the average of that pixel in the fields before and after. Meanwhile, it preserves the pixels directly above and below. The current pixel is treated like the ones 2 above and below: it's ignored and replaced with the average of the pixel in the fields before and after.
This process gives yadif an approximation of what the frame should look like if it were progressive and took place at the time of the first field. Yadif labels these guessed pixels, from the one 2 above to the one 2 below: b, c, d, e, and f.
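The five guessed pixels can be sketched in a few lines of Python. This is only an illustration of the description above, not yadif's actual source: the function name `build_column`, the list-of-rows frame layout, and the parameter names `prev_field` and `next_field` are all assumptions made for the example.

```python
def build_column(cur, prev_field, next_field, x, y):
    """Return (b, c, d, e, f) for the pixel at column x on line y.

    cur, prev_field and next_field are hypothetical 2-D lists (rows of
    intensities). prev_field and next_field stand for the pictures that
    hold the fields just before and just after the one being kept.
    Assumes 2 <= y < len(cur) - 2.
    """
    def temporal_avg(row):
        # Average of the same pixel in the fields before and after.
        return (prev_field[row][x] + next_field[row][x]) // 2

    b = temporal_avg(y - 2)  # 2 above: "bad" line, replaced by the average
    c = cur[y - 1][x]        # directly above: "good" line, kept as-is
    d = temporal_avg(y)      # current pixel: replaced by the average
    e = cur[y + 1][x]        # directly below: "good" line, kept as-is
    f = temporal_avg(y + 2)  # 2 below: "bad" line, replaced by the average
    return b, c, d, e, f
```

Integer division is used here on the assumption that intensities are whole numbers, such as 8-bit samples.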
Initial Spatial Prediction and Score
Then, it makes an initial guess as to what the current pixel should be. It does a simple 2-tap interpolation: the average of the guessed pixels directly above and below, c and e. This is the initial spatial prediction for the pixel.
Yadif also generates a spatial score, which measures vertical change. It is the absolute difference between the pixels to the left of c and e, plus the absolute difference between c and e, plus the absolute difference between the pixels to the right of c and e. Then, for some reason I cannot fathom, 1 is subtracted from that sum.
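As a rough sketch of those two steps, the prediction and the score can be computed from the two kept lines alone. The function name `initial_spatial` and its arguments are made up for this example, and the differences are assumed to be absolute differences.

```python
def initial_spatial(above, below, x):
    """Initial spatial prediction and score for column x.

    above and below are the kept lines directly above and below the
    pixel being rebuilt (hypothetical lists of intensities).
    Assumes 1 <= x < len(above) - 1.
    """
    c, e = above[x], below[x]
    # 2-tap vertical interpolation.
    pred = (c + e) // 2
    # Vertical change at the left neighbor, the center, and the right
    # neighbor, minus the unexplained 1.
    score = (abs(above[x - 1] - below[x - 1])
             + abs(c - e)
             + abs(above[x + 1] - below[x + 1])
             - 1)
    return pred, score
```

For example, with `above = [10, 20, 30]`, `below = [14, 26, 38]` and `x = 1`, the prediction is 23 and the score is 4 + 6 + 8 - 1 = 17.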
Now it's time for the in-depth spatial prediction of YADIF_CHECK, which adjusts the spatial prediction after searching the current frame a bit. It first measures how much the picture changes along the diagonal running from the upper left to the bottom right in the current frame.
If this diagonal score is lower than the spatial score (which starts off measuring vertical change), it replaces the spatial score, and the spatial prediction is changed from a pure vertical interpolation of the pixels above and below to a diagonal interpolation of the pixel above to the left and the pixel below to the right. A lower score means the pixels along that direction match each other better, so an edge is more likely to run that way. Only when this first check wins does yadif repeat it along a shallower upper-left-to-bottom-right diagonal, widening its inspection by one more pixel to the left and right.
If this scores lower than the previous upper-left-to-bottom-right pass, it replaces it, and the spatial prediction is changed to the average of the pixel 2 to the left above and 2 to the right below.
Regardless of whether either of the previous passes changed anything, the check is run again, now in the other direction, along the diagonal running from the upper right to the lower left.
If this is lower than the spatial score (which at this point could represent vertical change or change along an upper-left-to-bottom-right diagonal), it replaces the score, and the spatial prediction is changed to the average of the pixel above to the right and the pixel below to the left. Again, only when this check wins does yadif repeat it with a wider range.
If this score is lower than the last upper-right-to-bottom-left pass, it replaces the score, and the spatial prediction is updated to the average of the pixel 2 to the right above and 2 to the left below.
Here's what they all look like at once: [figure: check-all.png]
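The four checks can also be sketched in Python. This is an illustration rather than yadif's own code, and it rests on one assumption worth stating loudly: a candidate direction wins when its score is lower than the running score, meaning the pixels along that direction differ less. The names `diagonal_checks`, `check`, `direction` and `reach` are invented for the example.

```python
def diagonal_checks(above, below, x, pred, score):
    """Refine (pred, score) by testing four diagonal directions.

    above and below are the kept lines around the pixel (hypothetical
    lists). Assumes 3 <= x < len(above) - 3 so every index is in range.
    """
    def check(j):
        # j = -1 pairs the pixel above-left with the one below-right;
        # j = +1 pairs above-right with below-left; |j| = 2 reaches wider.
        s = (abs(above[x - 1 + j] - below[x - 1 - j])
             + abs(above[x + j] - below[x - j])
             + abs(above[x + 1 + j] - below[x + 1 - j]))
        p = (above[x + j] + below[x - j]) // 2
        return s, p

    for direction in (-1, 1):      # both directions are always tried
        for reach in (1, 2):
            s, p = check(direction * reach)
            if s >= score:
                break              # wider check runs only if the narrow one won
            score, pred = s, p
    return pred, score
```

With a hard edge that sits at column 3 on the line above and column 5 on the line below, the first upper-left-to-bottom-right check scores 0 and takes over the prediction, while the wider check and both checks in the other direction score worse and change nothing.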
With the spatial prediction in hand, it's on to the temporal differences. Diff0 is how the current pixel changes from the previous to the next fields. Diff1 is the average of how the pixels above and below change from the past to the present. Diff2 is the average of how the pixels above and below change from the present to the future.
The final difference, diff, is set to whichever is largest: diff1, diff2, or half of diff0.
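Here is a small sketch of that temporal step, following the wording above literally. The function and parameter names are hypothetical: `prev_px` and `next_px` are the current pixel in the previous and next fields, and `prev_c`, `prev_e`, `next_c`, `next_e` are the pixels above and below as they appear in the past and in the future.

```python
def temporal_diff(prev_px, next_px, c, e, prev_c, prev_e, next_c, next_e):
    """Return diff, the largest of diff1, diff2 and half of diff0."""
    # How the current pixel changes from the previous to the next field.
    diff0 = abs(prev_px - next_px)
    # Average change of the pixels above and below, past to present.
    diff1 = (abs(prev_c - c) + abs(prev_e - e)) // 2
    # Average change of the pixels above and below, present to future.
    diff2 = (abs(next_c - c) + abs(next_e - e)) // 2
    return max(diff0 // 2, diff1, diff2)
```

For instance, `temporal_diff(40, 60, 50, 70, 44, 70, 50, 80)` gives diff0 = 20, diff1 = 3 and diff2 = 5, so the result is 10.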
And that's where diff is left, if using yadif in mode 2 or 3. But in modes 0 and 1, more is done to the interpolated frame...
The Special Sauce for Yadif's Slower Modes 0 and 1
A bunch of comparisons have to be made on the interpolated frame to pick the right value for the current pixel. Remember, the pixels directly above and below (c and e) come from the current field, while the current pixel (d) and the two outer lines (b and f) are averages of the previous and next fields. Should more weight be attached to the changes over space, or the changes over time?
First, a maximum, max, has to be found. Which is the biggest change? Going from the current pixel to the one below (d minus e), from the current pixel to the one above (d minus c), or the smaller of the two differences between an outer line and its inner neighbor (b minus c on top, f minus e on the bottom)? Then, a minimum, min, has to be found. Which is the smallest change? Either of the same two steps from the current pixel, or the larger of those two outer-line differences? These are signed differences, so they can be negative.
With that info, a true final diff value can be found. It is the largest of the three: the existing diff, the minimum change just calculated, or the negative of the maximum change just calculated. And that's all modes 0 and 1 add on top of modes 2 and 3.
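As a sketch, the extra work in modes 0 and 1 boils down to three lines. The function name `refine_diff` is made up, and the exact pairings used here (d against c and e, b against c, f against e, all as signed differences) are my reading of the comparisons rather than something spelled out above.

```python
def refine_diff(diff, b, c, d, e, f):
    """Widen diff using the five guessed pixels b, c, d, e, f."""
    # Biggest of: current-to-below, current-to-above, and the smaller
    # of the two outer-line differences.
    max_change = max(d - e, d - c, min(b - c, f - e))
    # Smallest of: current-to-below, current-to-above, and the larger
    # of the two outer-line differences.
    min_change = min(d - e, d - c, max(b - c, f - e))
    # diff only ever grows: it is replaced when all three changes agree
    # in sign and are large.
    return max(diff, min_change, -max_change)
```

With `diff = 3` and the column `(b, c, d, e, f) = (50, 20, 40, 20, 50)`, every change is positive, min comes out as 20, and diff is raised to 20. With the column `(10, 20, 30, 20, 10)` the changes disagree in sign and diff stays at 3.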
The spatial prediction is now tweaked one last time. If the current prediction is bigger than the interpolated value of d plus the diff, then the prediction becomes d plus the diff. If the current prediction is smaller than the interpolated value of d minus the diff, then the prediction becomes d minus the diff.
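That last tweak is just a clamp of the spatial prediction to the range from d minus diff to d plus diff. A minimal sketch, with a hypothetical function name:

```python
def clamp_prediction(pred, d, diff):
    """Keep the spatial prediction within diff of the temporal guess d."""
    if pred > d + diff:
        return d + diff
    if pred < d - diff:
        return d - diff
    return pred
```

So `clamp_prediction(100, 50, 10)` returns 60, `clamp_prediction(30, 50, 10)` returns 40, and a prediction already inside the range, such as `clamp_prediction(55, 50, 10)`, comes back unchanged as 55.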
That's it. On to the next pixel, ad nauseam.