Dfttest is already multithreaded (threads=0 by default, which means automatic detection), so I don't think MT can make it any faster.
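If you want to set it explicitly, it is just the threads parameter (assuming the usual dfttest build, where 0 means auto-detect):
Code:
# threads=0 is the default and auto-detects the core count
dfttest(sigma=1.5, tbsize=3, threads=0)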
The videos are some old XviD encodes without much detail and with quite a bit of ringing and blocking. I wanted some stronger filtering, not temporal-only filters like ttempsmooth that don't make visible changes (but yeah, it is very good).
I never got very good results with fft3dfilter/fft3dgpu. I tried it once and all I could do was create new ringing and blocking. Any suggestions on an fft3d configuration from which I can tweak the strength? I will try the other filters in the meantime.
Dfttest so slow.
- mirkosp
- The Absolute Mudman
- Joined: Mon Apr 24, 2006 6:24 am
- Status: (」・ワ・)」(⊃・ワ・)⊃
- Location: Gallarate (VA), Italy
Re: Dfttest so slow.
Caroliano wrote: Any suggestions on fft3d configuration from where I can tweak the strength?
Two stage fft3d is godly.
Code:
strength = X
# first pass: small blocks, luma only (plane=0), purely spatial (bt=1)
fft3dfilter(bw=6, bh=6, ow=3, oh=3, plane=0, bt=1, sigma=strength)
# second pass: huge overlapped blocks with a graduated sigma ladder, still spatial and luma only
fft3dfilter(bw=216, bh=216, ow=108, oh=108, plane=0, bt=1, sigma=strength/8, sigma2=strength/4, sigma3=strength/2, sigma4=strength)
As a side note, I think Mister Hatt has a standalone fft3dgpu application which runs faster than the one in avisynth... if he's willing to share, you might be interested in it.
- Joined: Sat Aug 05, 2006 5:47 pm
Re: Dfttest so slow.
Results: my previous filtering was only Dfttest(sigma=1.5, tbsize=3), which runs at 6.8fps (ab)using all four cores. The two-stage fft3d is really godly, in some scenes better than dfttest(). But the one you posted runs at 6.2fps using only a single core, and if I use fft3dgpu instead it goes to 7.2fps on my weak HD3300 (onboard on the 790GX chipset). So even if it would be significantly faster than dfttest on other hardware, for me it doesn't quite cut it. But I will be using it in my 1fps scripts.
Nevertheless, that gave me the idea of using a spatial-only dfttest: Dfttest(sigma=1.5, tbsize=1) runs at 20~22fps, much better. If I add ttempsmooth(maxr=3) after it, the encode becomes bottlenecked on one of the cores and the speed goes down to 13fps.
Last, HQDN3D() is fast, but it doesn't like DCT blocks, so I used Deblock().HQDN3D() and it ran at 60fps. Now, that is fast. The result of course is not as good as the slower ones, but I guess it is okay for the speed.
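Spelled out as scripts, the two faster chains are simply:
Code:
# spatial-only dfttest followed by a temporal pass (~13fps here)
dfttest(sigma=1.5, tbsize=1)
ttempsmooth(maxr=3)
Code:
# or: deblock first so HQDN3D isn't fighting the DCT block edges (~60fps)
Deblock()
HQDN3D()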
Oh, I made a simple .avsi for the two-stage fft3d, like I did for MDegrainX(), just to make it easier to use without lots of copy-paste. Here it is:
Spoiler :
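A wrapper along these lines does the job (the function name and the default strength below are just illustrative placeholders, not necessarily what the actual .avsi uses):
Code:
# hypothetical two-stage fft3d wrapper; usage: TwoStageFFT3D(strength=2.0)
function TwoStageFFT3D(clip c, float "strength") {
    strength = default(strength, 2.0)   # placeholder default
    c = c.fft3dfilter(bw=6, bh=6, ow=3, oh=3, plane=0, bt=1, sigma=strength)
    c = c.fft3dfilter(bw=216, bh=216, ow=108, oh=108, plane=0, bt=1, sigma=strength/8, sigma2=strength/4, sigma3=strength/2, sigma4=strength)
    return c
}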
- Joined: Tue Dec 25, 2007 8:26 am
- Status: better than you
Re: Dfttest so slow.
mirkosp wrote: As a side note, I think Mister Hatt has a standalone fft3dgpu application which runs faster than the one in avisynth... if he's willing to share, you might be interested in it.
It's 64-bit Linux only and requires ICC; I guess I could share it if I can find where I committed it.
Don't use HQDN3D ever, it is insanely destructive. I think I once gave mirko a rundown on how filters like fft3d actually work, and I'll write one here so that you understand how to use the sigmas. Maybe someone will want to split and sticky, idk.
The haloing you got was due to too strong a sharpen pass; I don't even use the sharpen feature at all.
FFT3D filter is a frequency-domain filter. What that means is that a fast Fourier transform of the video (to simplify it a bit and make things faster) is applied and a resulting frequency-domain graph is generated. This is done with FFTW, a pretty sweet adaptive Fourier transform library; however, FFT3D uses quite an old version of it, without any of the really nice threading stuff for SPEED. FFT3D then applies a mask of blocks to your video, where each block overlaps a bit, and then each individual block is run through the FFT to create the frequency-domain graph.
Each block is then spectrum filtered according to your sigmas. Each sigma value corresponds to a specific 'size' of noise in your video. Sigma is for large blocks, sigma2 for smaller ones, sigma3 for things like dirt, and sigma4 for mosquito noise. You can also tweak the block sizes and planes and other things used. A good way to increase speed is to only filter the luma.
Once the graph is done, some inverse algorithms kick in and apply it to your actual image. That should cover enough for you to figure out how to use the thing. I would advise starting with low levels of sigma, 0.8 is common for me to use on most anime if it is on the heavier side of noisy. sigma=0.6,sigma2=0.4 is pretty much GOLD for contrasharpening stuff. Note that the last sigma value you define is what all subsequent values will be, so sigma=0.6 will set sigma2/3/4 as the same, while sigma=0.6,sigma2=0.4 will set sigma3/4 as 0.4 too. bt=4 is a good setting to have as well, and for full processing, plane=4.
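Put together, those suggestions translate to something like this (the values are just the ones mentioned above):
Code:
# bt=4: 3D (temporal) mode; plane=4: process luma and both chroma planes
fft3dfilter(sigma=0.6, sigma2=0.4, bt=4, plane=4)
# luma-only variant (plane=0) for speed, with the higher sigma for noisier anime
# fft3dfilter(sigma=0.8, bt=4, plane=0)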
- Joined: Sat Aug 05, 2006 5:47 pm
Re: Dfttest so slow.
I tried bt=3 in the two-stage fft3d above and it only added artifacts and didn't filter well. That was basically my past experience with fft3dfilter that I talked about above, since I had never tried to use it as a spatial-only filter.
On my source there isn't much detail left, so the high sigmas and HQDN3D() make a good cleanup. On cleaner and more detailed sources, I always use MDegrain and/or another temporal-only denoiser, sometimes coupled with a very weak dfttest().
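For reference, the MDegrain part is the usual MVTools pattern, roughly like this (thSAD and the weak dfttest sigma are placeholder values, tune to taste):
Code:
super = MSuper(pel=2)                  # clip prepared for motion analysis
bv = MAnalyse(super, isb=true)         # backward motion vectors
fv = MAnalyse(super, isb=false)        # forward motion vectors
MDegrain1(super, bv, fv, thSAD=400)    # motion-compensated temporal denoise
# dfttest(sigma=0.5, tbsize=1)         # sometimes a very weak spatial pass on top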
Some new results:
I forgot my IGP was underclocked; now I have overclocked it a bit and got 12fps with the two-stage fft3dgpu. Better, but still not enough.
Visually comparing the encoded results, the one filtered with dfttest(sigma=1.5, tbsize=1) looked best, followed by the HQDN3D one, with fft3d in last place. But the difference was small, and I used a low bitrate; the result would probably be different if I used CRF 16.