August 22, 2011

  • Tabidachi no Uta PV by Hekiru Shiina

    I had been looking for this music video for quite a long time and had nearly despaired of finding it, as there was no trace of it even on Perfect Dark, not to mention eMule and BT. There was only a low-res version of it on Youtube. Fortunately, Bardik filled my request on Jpopsuki.eu, and, funnily enough, he got this PV from JPS 1.0. Still, despite the higher resolution, this TV rip of the PV has quite a lot of issues common to MPEG-2 video: lots of blocking and dot-crawl artifacts, especially in dark areas, which had rainbow tints all over them, low contrast (TV instead of PC levels) and low sharpness. But now that I have a high-resolution version of the video, I can try to recover the quality no matter how bad it is. (Worth mentioning that someone tried to fill my request on JPS earlier with something like an upscale of the 320x240 Youtube version. Remember: with a low-quality high-res source you can recover quality, but with a high-quality low-res or blurred source you won't recover resolution (for live video) or details (for anime).)

    So, what do we have in the unprocessed source PV?

     

    As you can see, a LOT of blockiness and a greenish tint in the dark areas, which aren't even dark enough. The picture also lacks clarity and sharpness.

     

    To fix that, I wrote this Avisynth script (shortened significantly; the actual one was much longer):

    mpeg2source("Hekiru Shiina - Tabidachi no Uta (PV).d2v",cpu2="ooooxx") # opening the source file with the deringing postprocessing only, as the next line handles deblocking better

    Deblock_QED(30,30) # pre-made script from Avisynth website. Seems to be good for general use.

    crop(0,(480-396)/2+16,0,-(480-396)/2-14).bicubicresize(832,482).crop(0,2,0,0) # Upscaling to 832x480. I cropped horizontal borders as they contain only noise and generally look bad.

    a=coloryuv(levels="tv->pc")

    overlay(a, opacity=0.5) # fixing wrong TV/PC levels. As I already mentioned in the previous entry, the main difference between the TV scale and the PC scale is that TV has a cropped lightness range, so that instead of the full (black) 0-255 (white) we get (dark grey) 16-235 (light grey). But scaling TV to PC somehow always looks overcontrasted to me, so I apply it at only 50%.

    super= MSuper(pel=2, sharp=1)
    backward_vec1= MAnalyse(super, isb= true, delta= 1, overlap= 4,search= 5)
    forward_vec1= MAnalyse(super, isb= false, delta= 1, overlap= 4,search= 5)
    MDegrain1(super, backward_vec1,forward_vec1,thSAD=300) # Motion-compensated temporal denoiser. Though it is usually not recommended to use temporal smoothing on video with as much motion as this one, it allowed me to clean up the edges of the remaining blocks. I also used MDegrain3 with 3 pairs of vectors (see the sketch below).
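    For reference, the MDegrain3 setup would look roughly like this, reusing super and the first pair of vectors from above (a sketch from memory; don't take the values as the exact ones I used):

    backward_vec2= MAnalyse(super, isb= true, delta= 2, overlap= 4,search= 5)
    forward_vec2= MAnalyse(super, isb= false, delta= 2, overlap= 4,search= 5)
    backward_vec3= MAnalyse(super, isb= true, delta= 3, overlap= 4,search= 5)
    forward_vec3= MAnalyse(super, isb= false, delta= 3, overlap= 4,search= 5)
    MDegrain3(super, backward_vec1,forward_vec1, backward_vec2,forward_vec2, backward_vec3,forward_vec3, thSAD=300) # same denoiser, but averaging over frames at distance 1, 2 and 3 in both directions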

    gradfun2db(1.6) # fixing color blockiness and rainbow tints.

    sa=tweak(sat=1.15).unfilter(35,45)
    sb=tweak(sat=0.7).blindpp(cpu2="xxxxoo",quant=8)
    m=tweak(cont=1.25,bright=-15).mt_binarize(40).mt_inflate
    overlay(sb,sa,mask=m) # this is how I fix colors bleeding beyond objects' edges: I increase saturation in light areas and decrease it in dark ones, also strongly deblocking the dark areas since they don't have details anyway. Now we have a much nicer-looking image. This is also how human eyes see things: we don't see colors in dark areas, since the color receptors only work when there is enough light.

    sharpen(0.35)

    blockbuster(method="noise") # again, making the encoder pay more attention to dark areas so that it won't recreate the issues we've just fixed.

    tweak(hue=7) # shifting the hue towards red, since the image looks a bit too greenish and unnatural to me.

    Second part of the script: here I improve sharpness.

    sm=(WarpSharp) # let's say this is the warpsharped version of the video. In reality I had to create a physical copy of the video processed with VirtualDub's WarpSharp filter, but there seems to be an Avisynth port of it as well (see the sketch below). Here is what WS looks like and why it is usually only used for anime: link.
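    If that Avisynth port is installed, the physical-copy step could probably be replaced with something like this (untested on my side; aWarpSharp2 and the depth value are just an assumption):

    sm=aWarpSharp2(depth=16) # higher depth = stronger warping of edges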

    sh=last

    m=mt_edge(sh.converttoyv12,mode="min/max",thy1=55,thy2=70).mt_inflate

    m2=mt_edge(sh.converttoyv12,mode="min/max",thy1=70,thy2=95).mt_inflate

    a=sh.overlay(sm,mask=m2).sharpen(0.2)

    b=sm.overlay(sh,mode="darken") # no idea why I used darken instead of lighten...

    x=overlay(a, b.sharpen(0.2),opacity=0.5) # err... I don't really remember what this is all about, but the idea seems to have been to fix the lack of sharpness without oversharpening and creating halos. I used the low-sharpness version of the image for edges (which are already sharp) and the warpsharped one for details inside the edges (the actual objects). Then I combined them using "lighten" mode (WS usually cripples the image by shifting dark edges, so I used only the light parts of the image).

    Last part: working with the lightness balance and some effects.

    It isn't necessary, but I wanted to implement some kind of HDR and other cool things.

    At first glance, an HDR effect (I mean the glow near the edges of dark objects on a light background) could be implemented relatively easily by overlaying a blurred copy of the image over the dark areas in "lighten" mode (see the sketch below), but that creates issues such as a lack of contrast on small dark objects (like lines and buttons). Also, the edge between light and dark is not 100% exact and clear.
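    Just to illustrate, that naive version would look roughly like this (a sketch only, not what I actually used; the threshold and blur values are made up):

    dark=mt_binarize(55).mt_invert.mt_inflate # mask of the dark areas, same idea as the msk line below
    glow=blur(1.5).blur(1.5).blur(1.5) # crude wide blur made by stacking small-radius blurs
    overlay(glow, mode="lighten", mask=dark) # the blurred copy lightens dark pixels next to bright areas, creating the glow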

    msk=mt_binarize(55).mt_invert.mt_inflate # selecting dark areas

    m=msk.mt_deflate(x5).mt_inpand(x10).blur(1.58)(x20).yv12convolution("1 2 3 4 5 6 5 4 3 2 1", "1 2 3 4 5 6 5 4 3 2 1").mt_deflate(x10) # (xN) means the call is repeated N times. Since there is no simple Gaussian blur filter with a large radius, you have to chain a lot of small-radius ones.

    overlay(a.tweak(bright=-4,sat=0.65),mask=m) # I tried to further decrease brightness and saturation inside dark objects to make them look washed out by the light source behind them.

    z=last

    msk0=mt_binarize(60).mt_deflate(x3).mt_expand(x5).yv12convolution("1 2 3 4 5 6 5 4 3 2 1", "1 2 3 4 5 6 5 4 3 2 1")

    overlay(Levels(0,1.1, 255, 0, 255,coring=false),mask=msk0,opacity=0.83) # increasing the brightness of light areas, and a bit inside dark areas near the edges. One should be careful not to move beyond the 0-255 range. Also, instead of blurring the image I blurred the mask. It's easier and causes fewer problems for small details, but it also makes the HDR look less cool.

    Here is what I got as the result:

    (left is "before" and right is "after") 

     

     

      And this is how the video looked unprocessed: 

     

    Avisynth rules.

    Summary

    Issues still present in the picture:

    1. Lack of detail.

    A common issue when processing noisy, blocky images: the filtering removes details together with the noise. It might have been fixed by using more precise masks, but that is still pretty difficult with such a low-quality source.

    2. Wrong framerate and messy frames.

    Again, a common issue for TV broadcasts, since the actual video seems to have been shot at 23.976 frames per second while TV uses 29.97 fps. That wouldn't be a problem by itself, as it just means one duplicate frame is inserted after every four original ones (see the sketch below for how it would normally be undone), but silly TV operators add the logo and text not only in the area between the letterbox borders and the picture, but also out of sync with the source video. On top of that, low-quality MPEG compression adds ghosting artifacts that make the duplicated frames differ from their originals. All of that makes recovering the framerate very difficult.
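    On a clean telecined source the framerate would normally be restored with the textbook TIVTC pattern shown here; it's only a reference, since the out-of-sync logo and the ghosted duplicates ruin it for this rip:

    TFM() # rebuild the original progressive frames from the interlaced fields
    TDecimate() # drop the one duplicate frame in each cycle of five: 29.97 fps -> 23.976 fps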

    3. Wrong (?) brightness and contrast.

    Might be my mistake. It's always trouble for me, since I use a monitor profile with increased contrast and decreased brightness because my monitor seems too bright. Unfortunately, the profile has no effect on video, so when I watch it in a player I see a different image than when I compare frames in Photoshop or watch it on Youtube. I think I overcontrasted the PV. It also seems to lack color in dark areas due to my method of fixing color overflow.

    4. Low vertical resolution 

    Usual for TV and DVD. While I could have used the technique from the previous entry to recreate it, that requires a sharp enough source; here it would only have smoothed the image vertically. It also needs proper deinterlacing, and it seems this particular rip had that totally messed up.

     Download link: http://www.mediafire.com/?657q7id34z40ddj

    Bonus part: how I created an eye-recognition script, or why masktools are awesome.

    As I already said, the HDR effect reduces the visibility of small dark objects on a light surface, and the most noticeable objects of this kind are eyes. So I need to fix that afterwards. The only way I see is to somehow select those areas and improve their contrast and sharpness. But how do I create a mask that will detect eye-like objects?

    Some might say that recognition algorithms, like the ones Facebook uses for face detection or the smile detection in your phone camera, all need some kind of image base to compare what they see against what is considered a face or a smile. That is not my case: I simply decided to define what distinguishes eyes from the other objects on screen.

    First of all, the brightness range: an eye is a dark object on a light surface, but not as dark as, for example, the black boxes in the background. Then, it has a specific shape and edges: two close edges of eyelashes connected by the pupil. This makes it different from small, solid, button-like dark dots and from the long but thin edges around objects. And lastly, it has its own color range: brown. So what we need is three masks, one per feature, combined.

    Here's the reference image I used while building mask:

     

    To make it easier to demonstrate how the masks work, I tried to visualize the brightness levels. Let's say we extracted a row of pixels from the image below (shown as the green A-B line).

     

     Then we can treat it as a function graph, something like Y(X) = Lightness(X): the higher the graph gets, the brighter the corresponding pixel in the image is.

    With enough imagination you can see here (from left to right): hair, skin, the eyelashes with the pupil between them, then skin again. So, how do we select only the eye?

    Here is the script:

    mask1=mt_logic(mt_binarize(75).mt_invert, mt_binarize(59),mode="and") # This selects brightness range between 59 and 75.

    and the corresponding mask:

     

    mask2=mt_edge(mode="hprewitt").mt_inflate.mt_inpand(x10) # This selects edges of the picture. To eliminate useless line-like edges, I shrink the mask until they disappear while thicker objects stay intact.

    Edges here are rapid changes in the direction of the graph. We select only the most rapid ones, and only areas with lots of such changes close together.

    Mask:

     

    So far we control the selection in two dimensions, but there's a third one: color.

    mask3=maskhs(coring=false,starthue=114,endhue=130,minsat=23,maxsat=31) # And this selects by color=brown.

     

     

    The result: combining all the masks and selecting only what is present on every one of them (a sketch of the combination is below).
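    The combination itself is just mt_logic in "and" mode, roughly like this (reconstructed from memory; the contrast and sharpen values are only an illustration):

    eyes=mt_logic(mt_logic(mask1,mask2,mode="and"),mask3,mode="and") # a pixel stays selected only if all three masks agree
    overlay(tweak(cont=1.2).sharpen(0.3),mask=eyes) # then boost contrast and sharpness only inside the selected areas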

     

    As you can see, it still selects things other than eyes, but keep in mind that while watching, most of the attention is paid to the performer's face, so you just won't notice if something is slightly wrong in a far corner of the screen.

    More examples of the script at work:

     

    Sometimes it even works on close-up shots:

     And with increased brightness:

     

    But it certainly won't work for different eye colors.

    Still, I think it is amazing that you can perform such a task, image analysis, with a relatively simple scripting language like Avisynth. Image analysis is usually considered quite difficult for machines.
