Dfttest is already multithreaded (threads=0 by default, which means automatic detection), so I don't think MT can make it any faster.
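If you want to set it explicitly, it is just the threads parameter (assuming the usual dfttest build, where 0 means auto-detect):
Code:
# threads=0 is the default and auto-detects the core count
dfttest(sigma=1.5, tbsize=3, threads=0)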
The videos are some old XviD encodes without much detail and with quite a bit of ringing and blocking. I wanted some stronger filtering, not temporal-only filters like ttempsmooth that don't make visible changes (but yeah, it is very good).
I never got very good results with fft3dfilter/fft3dgpu. I tried it once and all I could do was create new ringing and blocking. Any suggestions on an fft3d configuration from which I can tweak the strength? I will try the other filters in the meantime.
Dfttest so slow.
- mirkosp
- The Absolute Mudman
- Joined: Mon Apr 24, 2006 6:24 am
- Status: (」・ワ・)」(⊃・ワ・)⊃
- Location: Gallarate (VA), Italy
Re: Dfttest so slow.
Caroliano wrote: Any suggestions on fft3d configuration from where I can tweak the strength?
Two stage fft3d is godly.
Code:
strength = X
# first pass: small blocks, luma only (plane=0), purely spatial (bt=1)
fft3dfilter(bw=6, bh=6, ow=3, oh=3, plane=0, bt=1, sigma=strength)
# second pass: huge overlapped blocks with a graduated sigma ladder, still spatial and luma only
fft3dfilter(bw=216, bh=216, ow=108, oh=108, plane=0, bt=1, sigma=strength/8, sigma2=strength/4, sigma3=strength/2, sigma4=strength)
As a side note, I think Mister Hatt has a standalone fft3dgpu application which runs faster than the one in avisynth... if he's willing to share, you might be interested in it.
- Joined: Sat Aug 05, 2006 5:47 pm
Re: Dfttest so slow.
Results: my previous filtering was only Dfttest(sigma=1.5, tbsize=3), which runs at 6.8fps (ab)using all four cores. The two-stage fft3d is really godly, in some scenes better than dfttest(). But the one you posted runs at 6.2fps using only a single core, and if I use fft3dgpu instead it goes to 7.2fps on my weak HD3300 (onboard on the 790GX chipset). So even if it would be significantly faster than dfttest on other hardware, for me it doesn't quite cut it. But I will be using it in my 1fps scripts.
Nevertheless, that gave me the idea of using a spatial-only dfttest: Dfttest(sigma=1.5, tbsize=1) runs at 20~22fps, much better. If I add ttempsmooth(maxr=3) after it, the encode becomes bottlenecked on one of the cores and the speed goes down to 13fps.
Last, HQDN3D() is fast, but it doesn't like DCT blocks, so I used Deblock().HQDN3D() and it ran at 60fps. Now, that is fast. The result of course is not as good as the slower ones, but I guess it is okay for the speed.
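Spelled out as scripts, the two faster chains are simply:
Code:
# spatial-only dfttest followed by a temporal pass (~13fps here)
dfttest(sigma=1.5, tbsize=1)
ttempsmooth(maxr=3)
Code:
# or: deblock first so HQDN3D isn't fighting the DCT block edges (~60fps)
Deblock()
HQDN3D()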
Oh, I made a simple .avsi for the two-stage fft3d, like I did for MDegrainX(), just to make it easier to use without lots of copy-paste. Here it is:
Spoiler :
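A wrapper along these lines does the job (the function name and the default strength below are just illustrative placeholders, not necessarily what the actual .avsi uses):
Code:
# hypothetical two-stage fft3d wrapper; usage: TwoStageFFT3D(strength=2.0)
function TwoStageFFT3D(clip c, float "strength") {
    strength = default(strength, 2.0)   # placeholder default
    c = c.fft3dfilter(bw=6, bh=6, ow=3, oh=3, plane=0, bt=1, sigma=strength)
    c = c.fft3dfilter(bw=216, bh=216, ow=108, oh=108, plane=0, bt=1, sigma=strength/8, sigma2=strength/4, sigma3=strength/2, sigma4=strength)
    return c
}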
- Joined: Tue Dec 25, 2007 8:26 am
- Status: better than you
Re: Dfttest so slow.
mirkosp wrote: As a side note, I think Mister Hatt has a standalone fft3dgpu application which runs faster than the one in avisynth... if he's willing to share, you might be interested in it.
It's 64-bit Linux only and requires ICC; I guess I could share it if I can find where I committed it.
Don't use HQDN3D ever, it is insanely destructive. I think I once gave mirko a rundown on how filters like fft3d actually work, and I'll write one here so that you understand how to use the sigmas. Maybe someone will want to split and sticky, idk.
The haloing you got was due to too strong a sharpen pass; I don't even use the sharpen feature at all.
FFT3D filter is a frequency-domain filter. What that means is that a fast Fourier transform of the video (to simplify it a bit and make things faster) is applied and a resulting frequency-domain graph is generated. This is done with FFTW, a pretty sweet adaptive Fourier transform library; however, FFT3D uses quite an old version of it, without any of the really nice threading stuff for SPEED. FFT3D then applies a mask of blocks to your video, where each block overlaps a bit, and then each individual block is run through the FFT to create the frequency-domain graph.
Each block is then spectrum filtered according to your sigmas. Each sigma value corresponds to a specific 'size' of noise in your video. Sigma is for large blocks, sigma2 for smaller ones, sigma3 for things like dirt, and sigma4 for mosquito noise. You can also tweak the block sizes and planes and other things used. A good way to increase speed is to only filter the luma.
Once the graph is done, some inverse algorithms kick in and apply it to your actual image. That should cover enough for you to figure out how to use the thing. I would advise starting with low levels of sigma, 0.8 is common for me to use on most anime if it is on the heavier side of noisy. sigma=0.6,sigma2=0.4 is pretty much GOLD for contrasharpening stuff. Note that the last sigma value you define is what all subsequent values will be, so sigma=0.6 will set sigma2/3/4 as the same, while sigma=0.6,sigma2=0.4 will set sigma3/4 as 0.4 too. bt=4 is a good setting to have as well, and for full processing, plane=4.
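Put together, those suggestions translate to something like this (the values are just the ones mentioned above):
Code:
# bt=4: 3D (temporal) mode; plane=4: process luma and both chroma planes
fft3dfilter(sigma=0.6, sigma2=0.4, bt=4, plane=4)
# luma-only variant (plane=0) for speed, with the higher sigma for noisier anime
# fft3dfilter(sigma=0.8, bt=4, plane=0)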
- Joined: Sat Aug 05, 2006 5:47 pm
Re: Dfttest so slow.
I tried bt=3 in the two-stage fft3d above and it only added artifacts and didn't filter well. That was basically my past experience with fft3dfilter that I talked about above, since I had never tried to use it as a spatial-only filter.
On my source there isn't much detail left, so the high sigmas and HQDN3D() make a good cleanup. On cleaner and more detailed sources, I always use MDegrain and/or another temporal-only denoiser, sometimes coupled with a very weak dfttest().
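For reference, the MDegrain part is the usual MVTools pattern, roughly like this (thSAD and the weak dfttest sigma are placeholder values, tune to taste):
Code:
super = MSuper(pel=2)                  # clip prepared for motion analysis
bv = MAnalyse(super, isb=true)         # backward motion vectors
fv = MAnalyse(super, isb=false)        # forward motion vectors
MDegrain1(super, bv, fv, thSAD=400)    # motion-compensated temporal denoise
# dfttest(sigma=0.5, tbsize=1)         # sometimes a very weak spatial pass on top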
Some new results:
I forgot my IGP was underclocked; now I have overclocked it a bit and got 12fps with the two-stage fft3dgpu. Better, but still not enough.
Visually comparing the encoded results, the one filtered with dfttest(sigma=1.5, tbsize=1) looked best, followed by the HQDN3D one, with fft3d in last place. But the difference was small, and I used a low bitrate; the result would probably be different if I used CRF 16.