Double release post

November 27, 2011

Have you ever thought it is kind of unfair that albums are released without instrumental versions of every song, while singles usually have them? I do understand that a 700 MB CD cannot hold both an average album (~500 MB) and the corresponding instrumentals, but why not release 2 CDs in one package, or sell them separately, so that the price of the album itself is not affected? So far the only album whose instrumentals have been released is Mizuki Nana's Ultimate Diamond, and even that was a rather "special" release, not freely available to buy (more details here). I'm not counting Minorin's 2 Unification albums, since they are rather "unrelated" to the existing albums and their music is heavily rearranged. One might think that an album's instrumentals are never actually recorded or stored anywhere after the album's release, unlike those of singles, but that's not true: sometimes you can hear these instrumental versions played at live concerts as background music (i.e. not performed live by the musicians on stage), usually during the credits.

The very fact that instrumentals are played at live concerts, the release of the Unifications, and the awesomeness of Kikuta Daisuke's music in Minorin's early songs (that is, up to and including the Parade album, when everything was still about violins) motivated me to investigate whether there was something I could do to fix this unfairness. And sure enough, I found a way.

While watching the Parade BDrip, which has 6-channel audio, I thought that since the channels are separated by position (Left, Right, Center, etc.), they must also have different roles and carry different audio information. So you can divide the channels into groups such as "Music", "Voice", "Hall noise and echo" and "Bass", where Music is Left and Right, Voice is Center, Hall noise combines all the "Back" and "Side" channels, if there are any, and Bass is the LFE/subwoofer channel.

Multichannel audio, as seen in Audacity
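
As a minimal sketch of this grouping (in Python, purely for illustration): channels 1-3 follow the roles described above, while the positions of the LFE and back channels (4, 5 and 6) are an assumption based on the common 5.1 order FL, FR, FC, LFE, BL, BR. The names CHANNEL_ROLES and channels_for are made up for this example.

```python
# Assumed 5.1 channel order: FL, FR, FC, LFE, BL, BR (1-based numbering).
# Channels 1-3 match the roles described in the post; 4-6 are an assumption.
CHANNEL_ROLES = {
    1: "Music",       # front left
    2: "Music",       # front right
    3: "Voice",       # center
    4: "Bass",        # LFE / subwoofer
    5: "Hall noise",  # back left
    6: "Hall noise",  # back right
}

def channels_for(role):
    """Return the 1-based channel numbers that carry the given role."""
    return [ch for ch, r in sorted(CHANNEL_ROLES.items()) if r == role]

# One interleaved 6-channel frame holds one sample per channel.
frame = [10, 20, 30, 40, 50, 60]
music = [frame[ch - 1] for ch in channels_for("Music")]
print(channels_for("Music"), music)  # [1, 2] [10, 20]
```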

Since we have separate channels containing (mostly) only music, we can just extract them and get instrumental versions of all the songs of the concert! Isn't it simple and cool? The concept is simple; the realization, however, is not. Here you will see what I had to do to actually provide you with the instrumentals for Minorin's albums.

Note that this is the first time I have worked with audio, so I don't claim that all my terms are correct.

First of all, I needed audio-processing software to work with the tracks extracted from the BD. I had a choice among the aforementioned Audacity, which turned out to be good only for previewing all tracks at once, while editing was quite difficult; Sony's Sound Forge, the editor I'm most used to, but which unfortunately handles neither multichannel audio nor multitrack editing; and Adobe Audition, which has all the features I need but is the one I know least. While Sound Forge's lack of multichannel support could easily be worked around with Avisynth (yes, it's Avisynth again; and you thought it was only good for video?) by opening at most 2 channels at a time, the lack of multitrack editing made it a secondary tool. You will see later why I needed the multi- capabilities even though the instrumentals are readily available as 2-channel audio.

No matter what the audio editor is, it can't just open the .mkv file of a BD rip, which contains both video and audio, especially when the audio is AAC. First I need to convert the audio to a more compatible format, preferably one that doesn't need further decoding: WAV. To do that I wrote a simple Avisynth script, as follows:

# Open only the audio of the rip and keep channels 1 and 2 (front left and right)
DirectShowSource("parade-1.mkv", video=false).GetChannels(1, 2)

DirectShowSource opens any media file supported by the installed DirectShow decoders, and GetChannels returns audio channels #1 and #2, that is, the music channels, so that we can open them in Sound Forge. Even though this script returns audio as its result, none of the tools I use can open .avs files, so I need to either create an interface that presents them as what looks like actual .wav files, or convert the whole audio to one or several large .wav files. While the first idea of on-the-fly conversion sounds cool and saves space, it turned out that the software that makes it possible (Pismo file mounter, which makes AVS files look like a set of AVI and WAV files) has some synchronization issues and the resulting audio is laggy, which is unacceptable.

Pismo file mounter

Therefore the only way was to extract the audio. I had to open the AVS in VirtualDub, and since it doesn't support media files without a video stream, we need to fake one by changing the script this way:

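# Audio only: front left and right channels
a = DirectShowSource("parade-1.mkv", video=false).GetChannels(1, 2)
# Dummy video at 1 frame per second, so the length in frames equals the length in seconds
v = BlankClip(fps=1, length=X)
# Attach the audio to the dummy video so that VirtualDub accepts the script
AudioDub(v, a)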

Here X is the length of the audio in seconds.
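
For comparison, the same channel selection can be sketched outside Avisynth. This is not the route taken above: it is a Python illustration that assumes the 6-channel audio has already been decoded to a 16-bit PCM WAV file, and it loads the whole file into memory, which is only reasonable for short excerpts. The function name extract_channels is made up for this example.

```python
import io
import struct
import wave

def extract_channels(src, dst, keep=(1, 2)):
    """Copy the selected 1-based channels of a 16-bit PCM WAV into a new WAV."""
    with wave.open(src, "rb") as reader:
        n_channels = reader.getnchannels()
        if reader.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = reader.getframerate()
        raw = reader.readframes(reader.getnframes())

    # Samples are interleaved: ch1, ch2, ..., chN, ch1, ch2, ...
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    picked = []
    for start in range(0, len(samples), n_channels):
        frame = samples[start:start + n_channels]
        picked.extend(frame[ch - 1] for ch in keep)

    with wave.open(dst, "wb") as writer:
        writer.setnchannels(len(keep))
        writer.setsampwidth(2)
        writer.setframerate(rate)
        writer.writeframes(struct.pack("<%dh" % len(picked), *picked))

# Tiny in-memory demonstration: two 6-channel frames, keep channels 1 and 2.
demo_src = io.BytesIO()
with wave.open(demo_src, "wb") as w:
    w.setnchannels(6)
    w.setsampwidth(2)
    w.setframerate(48000)
    w.writeframes(struct.pack("<12h", 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12))
demo_src.seek(0)
demo_dst = io.BytesIO()
extract_channels(demo_src, demo_dst)
```

Both arguments can be file names instead of in-memory buffers; passing keep=(3,) would pull out the center channel on its own.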

Now that we've extracted the audio channels, creating instrumental versions of the songs might seem obvious: just split the audio per song and mission complete. Yet things are far from done. In fact, just as the voice can still be heard (though very quietly) in the music channels, the music is also distributed to the center channel, and by discarding it we lose a significant part of the audio image. Therefore we need to mix the center channel (#3) with the music channels while still maintaining off-vocal status. Here comes the most tiresome part of the work: you open the (mono) center channel in either Sound Forge or Audition and listen through the whole file, setting markers where the vocals start and stop, then muting or deleting those parts. Time spent: 2x...3x the length of the whole concert.
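
The marker-and-mute step boils down to zeroing the samples between each pair of markers. A minimal Python sketch, assuming the markers have been written down as (start, end) pairs in seconds; the function name mute_regions and the tiny sample values are made up for illustration:

```python
def mute_regions(samples, rate, regions):
    """Return a copy of a mono sample list with each (start, end) region, in seconds, silenced."""
    out = list(samples)
    for start_s, end_s in regions:
        first = max(0, int(start_s * rate))
        last = min(len(out), int(end_s * rate))
        for i in range(first, last):
            out[i] = 0
    return out

# Toy example: 8 samples at 4 samples per second, vocals between 0.5 s and 1.5 s.
center = [5, 5, 5, 5, 5, 5, 5, 5]
print(mute_regions(center, rate=4, regions=[(0.5, 1.5)]))  # [5, 5, 0, 0, 0, 0, 5, 5]
```

The listening itself cannot be automated this way; the code only applies markers that a human has already placed.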

After cleaning the vocals out of the center channel, it's time to mix it with the music channels, and this is where we need Audition's multitrack capabilities. Yet here comes another problem, this time with left-right balance: while songs from singles on CD usually have all the instruments more or less equally distributed between the left and right channels, preserving a centered balance, at live concerts you're likely to have the guitar dominating on the right, the piano on the left and so on, depending on their actual position on stage. This problem is especially difficult to solve, because when you add the center channel (which mostly carries music from solo parts, of the guitar for example) to an already imbalanced stereo instrumental track, the existing imbalance greatly increases.

Distribution of instruments on stage

Additionally, unlike editing images in Photoshop, where to mix 2 images you can just overlay one on top of the other and set the opacity to 50%, audio waves "stack up" rather than mix. That is, the overall loudness increases as you add tracks and is very likely to go beyond acceptable limits, producing clipping and distortion.
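
The "stacking" can be shown with a few numbers. In this Python sketch, mixing is plain sample-by-sample addition of two 16-bit tracks, and anything outside the 16-bit range is hard-clipped; the sample values and the function name mix are invented for illustration.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def mix(a, b, gain=1.0):
    """Sum two equal-length 16-bit tracks sample by sample, scale by gain, then hard-clip."""
    mixed, clipped = [], 0
    for x, y in zip(a, b):
        s = int(round((x + y) * gain))
        if s > INT16_MAX or s < INT16_MIN:
            clipped += 1
            s = max(INT16_MIN, min(INT16_MAX, s))
        mixed.append(s)
    return mixed, clipped

music = [20000, -25000, 10000]
solo = [15000, -12000, 5000]
print(mix(music, solo))            # ([32767, -32768, 15000], 2)
print(mix(music, solo, gain=0.5))  # ([17500, -18500, 7500], 0)
```

Halving the sum is the audio counterpart of 50% opacity: it avoids clipping, but it also makes the whole mix quieter, which is why a fixed gain alone does not settle the matter.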

In fact, there's barely anything you can do about these issues other than manually changing the volume and balance for each and every part of the whole audio file. So here we go, relistening to everything again.

Compilation-ready project

Here you can see the result. The green line is the volume envelope and the blue line is the balance curve. Since most of the solo channel (the cleaned center channel) consists of guitar sounds and the guitar is located on the right half of the stage, I had to shift the balance to the left every time there's sound on the solo track. As for the volume, I basically lower it in the music channel and gradually raise and lower it in the solo channel, providing a smooth attack and release. I tried to maximize quality by using 32-bit mixing mode even though the source and destination are 16-bit. No idea if there's any audible difference. Still, there was a noticeable shift of the stereo field at the beginning and the end of every peak on the solo track, which I later found could be masked by expanding the stereo image at the edges (where it shifts to the right) and narrowing it in the middle (where I shifted it to the left). The resulting audio still looks clipped, but most of the time the clipping occurs on drums, so it doesn't really sound distorted while still preserving enough loudness.

Possible clipping area
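
The volume envelope and balance curve described above can be approximated in a few lines of Python. This is only a sketch under assumptions: the ramps are linear, and the balance law (attenuating the channel opposite to the shift) is a guess at what an editor's balance curve does, not a description of Audition's internals. The names envelope and apply_balance are invented here.

```python
def envelope(n, start, end, ramp, low=0.0, high=1.0):
    """Per-sample gain: low outside [start, end), high inside, with linear ramps of ramp samples."""
    gains = []
    for i in range(n):
        if i < start - ramp or i >= end + ramp:
            g = low
        elif i < start:   # attack: rise towards high
            g = low + (high - low) * (i - (start - ramp)) / ramp
        elif i < end:     # sustained part
            g = high
        else:             # release: fall back to low
            g = high - (high - low) * (i - end) / ramp
        gains.append(g)
    return gains

def apply_balance(left, right, balance):
    """Shift a stereo sample pair: -1.0 is fully left, 0.0 is unchanged, +1.0 is fully right."""
    left_gain = 1.0 - max(0.0, balance)
    right_gain = 1.0 + min(0.0, balance)
    return left * left_gain, right * right_gain

print(envelope(10, start=4, end=6, ramp=2))
# [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 0.5, 0.0, 0.0]
print(apply_balance(100, 100, -0.5))  # (100.0, 50.0)
```

Multiplying the solo track by such an envelope, and the music track by one with low and high swapped, gives the "lower one, raise the other" behaviour; doing the arithmetic in floating point before converting back to 16-bit is the same idea as mixing at a higher bit depth.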

Finally the most tiresome part of the work is done, yet the remaining part is not as simple as it might seem. All we need to do is split the compiled stereo track into separate songs, but remember that this is not your average CD album with noticeable gaps between songs and easily distinguishable beginnings and ends of every track; this is a live concert. Songs can start "suddenly", before the previous song's instrumental stops, so they overlap and cannot be split cleanly. Moreover, hall noise (especially ovations) sometimes gets so loud that it becomes quite audible even in the music channels.

Last Arden and Peace of Mind intersection

This is where Audition shows its true power. It can "memorize" a part of a sound wave and then either remove everything similar to it from another wave or remove everything that doesn't resemble the memorized wave. Usually this is used for noise removal: you select a part of the wave where only noise is heard and, using that as a reference, remove similar noise wherever it occurs in the file. But this feature is capable of far more sophisticated processing. Here, however, we have a different task: to separate two songs from one another. While I can't explain exactly how it works, it is similar to noise removal in image processing, where you also need to teach the program what counts as "noise", and it is based on frequency analysis by FFT. So, to better visualize what we're going to do, I'm switching to Frequency Display.

Frequency image of intersecting waves

As you can see, we have 2 kinds of "elements" here: horizontal and vertical. The first song ends with violins, so the horizontal lines represent their frequencies. The second begins with piano, which is depicted as vertical comb-like lines. We need to separate them to divide the songs without any loss. I think you've already got the idea. Yes, first we take the part before the intersection (where only violins are present) as an example and remove everything similar to it from the beginning of the second song. Then we do the reverse: take the rightmost part of the wave in the image as the reference for removing piano sounds from the end of the first song. To be sure, we can also switch to "Leave only example-similar sounds" mode and process the intersection again, this time with the corresponding examples from the same song. Eventually you will get this result:

Songs being properly divided
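
How Audition implements this internally is not something this post can confirm. As a rough illustration of the general idea, here is a generic spectral subtraction sketch in Python: the magnitude spectrum of a reference sound is subtracted from the spectrum of the overlapping part, keeping the original phase. Everything in it is an assumption made for the example: pure sine tones stand in for the violins and the piano, a single short frame is processed, and a naive DFT replaces the FFT to keep the code self-contained.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)), enough for a short frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spec):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]

def subtract_profile(frame, profile):
    """Remove the magnitude spectrum of profile from frame, keeping the phase of frame."""
    out = []
    for f, p in zip(dft(frame), dft(profile)):
        mag = max(0.0, abs(f) - abs(p))  # never let a magnitude go negative
        out.append(cmath.rect(mag, cmath.phase(f)))
    return idft(out)

def tone(freq_bin, n=64):
    """A sine wave that fits exactly freq_bin cycles into n samples."""
    return [math.sin(2 * math.pi * freq_bin * t / n) for t in range(n)]

violins = tone(3)    # stand-in for the end of the first song
piano = tone(10)     # stand-in for the start of the second song
overlap = [v + p for v, p in zip(violins, piano)]

# Using the violins as the reference leaves (approximately) only the piano.
cleaned = subtract_profile(overlap, violins)
print(max(abs(c - p) for c, p in zip(cleaned, piano)) < 1e-6)  # True
```

Real instruments share many frequencies, so the separation is never this clean in practice; that is why processing the intersection a second time in the "leave only" mode can help.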

It might be a good idea to apply an equalizer filter afterwards to remove remaining unwanted frequencies, but only if the frequencies of the two songs differ enough. Unfortunately there is no smooth transition between the unaffected and processed areas, so you should be careful when selecting the audio to process, or implement such a transition manually.

The same applies to hall noise. While it doesn't have any particular frequency, but rather takes the form of random noise, it is still possible to remove it using the Noise Removal tool.

This is where we can say that the work is done. You might still want to apply some cosmetics to the songs, maybe increasing the volume of acoustic-like songs that are too quiet, since, again, on CD releases the loudness of a song seems to vary far less from its average value than in live performances.
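
One simple way to raise the volume of a quiet track is peak normalization, sketched below in Python. It is an assumption that this matches the "cosmetics" meant above; it only scales the loudest sample to a chosen level and does nothing to even out loudness within a song. The function name normalize_peak and the default target are made up for the example.

```python
def normalize_peak(samples, target_peak=0.9 * 32767):
    """Scale a track so that its loudest sample reaches target_peak."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    gain = target_peak / peak
    return [int(round(s * gain)) for s in samples]

quiet_song = [100, -200, 50]
print(normalize_peak(quiet_song, target_peak=1000))  # [500, -1000, 250]
```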

While the overall result, and the excitement from the very fact that manually created instrumental albums (now) exist, is quite cool and very satisfying, there's always a fly in the ointment. Even though it doesn't affect what's been done, it is still quite disappointing. The fact is that it is impossible to create instrumental versions for Minorin's other albums, or even to make Contact complete. The reason is that the audio on all BDs of her concerts after 2009 seems to be wrong and incompatible with the method I used. Basically, while still being 6-channel, the audio in both Summer Camps and Sing All Love has the vocals mixed into the music channels, making it impossible to use them as a source of instrumentals. I can't think of any reason for Lantis to FAIL that much, even though it's been a while since Lantis became the epitome of Fail. Well, to be honest, I checked only BD rips of the aforementioned concerts, so, theoretically speaking, this could be the ripper's mistake. But then I downloaded the Summer Camp 1 ISO and the FAIL is still there, which gives reason to think that it is present in all the other releases as well. Alas.

Lantis being itself

Oh well, I never really liked the stuff they did after Parade. But the fact that I will never have Contact completed saddens me a lot.

Anyway, let's move on to the best part of this blog entry - the Sharing part.
The Parade Instrumental collection contains all 14 songs of the original album + 2 additional compositions: Intro and Interlude. These are really good.

Download link: http://www.mediafire.com/?827sca4crlc0zss 149 MB

The Contact one, however, has only half of the songs available as instrumentals. Though I could never get enough of its violins and pianos anyway. List of the songs:
- 02 Shijin no Tabi
- 03 Futari no Reflections
- 04 Junpaku Sanctuary
- 06 Cynthia
- 08 Too late Not late
- 11 Kimi ga kureta ano Hi
Download link: http://www.mediafire.com/?kr52b9hbn8cf49d 58 MB

There's also a bonus part with some additional songs as well as FLAC versions of all tracks, but to obtain it you'll need to make your way to the very heart of the Chinese Minorin fan club, http://chiharaminori.cn. Fear not, though: knowing English is pretty much enough to get along well with the inhabitants.

As I promised in the title, this is a double sharing post, and double not only in the number of albums created, but also in that I will share something other than audio. That'll be pics. A lot of awesome pics you won't see every day. Look at the first image in the post; do you like it? Do you like such 3D->2D conversions / reverse cosplay / seiyuu-tans as much as I do? Then have 2 (double release!) packs of these pictures, one for Minorin and one for Nana-tan! After all, if there's something better than seiyuu photos, it's their 2D representation.

Minorin, 82 pics, 23 MB: http://www.mediafire.com/?lapakh6uitdfa99

Nanami, 175 pics, 70 MB: http://www.mediafire.com/?49srg7mpw1sk4sb
Ah yes. Guess who doesn't fail? King Records.


Comments (5)

According to the response I got on the Hydrogen Audio forums, it is impossible to extract the music from BDs with the wrong soundtrack; moreover, they can't really be called "wrong", as the distribution I consider "right" is not really common, despite being logical.

Alas.
- 2/25/2012 9:15 PM
- ChitOKun
wow, thank you so much ChitOKun, your dedication to perfection really impressive
can't say much but, ありがとうございました
- 3/13/2012 2:04 PM
- Prim`s
@Prim`s - Thank you very much for feedback! I consider this being one of my top works so far.
- 4/28/2012 2:00 PM
- ChitOKun
Update: Now available for download as torrent @ http://www.nyaa.eu/?page=torrentinfo&tid=321933

Note that since the world has recently been going IPv6, I included an IPv6 tracker in this torrent along with the others. This might bring more seeds.
- 6/10/2012 2:23 PM
- ChitOKun
@Michael Broxterman@facebook -

Not to disappoint you, but I think my case was more a matter of luck than the rule. As I mentioned, out of all the BDs from various labels, only this one had proper audio suitable for this method.

You might be luckier, though; do give it a try.

There are more options for making off-vocals, like "vocal remove" in Adobe Audition, that don't require multichannel audio, but they will never achieve the level of quality that this one does.
- 6/10/2012 5:02 PM
- ChitOKun

ChitOKun's Xanga Site
