
Powerful New Vocal Remover AI - Instructions

I'll add to this that I find if you're willing to get really zoomed in with it, this issue can be mitigated. For example, with low-frequency vocal bleed, I have found that the offending sound is usually not right ON the bass guitar, but just a few Hz above or below it; at 290 rather than 300, say, you'll see a little deeper colour jagging off the flush note at 300. Removing those jagged edges tends to clean up the sound completely. Might there be some bass trapped in there too? Possibly, but it is not what I would call audible compared to the significant change in the vocal portion.
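To make that idea concrete, here's a minimal NumPy sketch of the same kind of surgical notch. The function name and the toy spectrogram are invented for illustration; in practice you'd be painting this kind of attenuation over the STFT magnitude of a real track in a spectral editor:

```python
import numpy as np

def notch_bleed(spec, freqs, bleed_hz, protect_hz, width_hz=5.0, gain=0.1):
    """Attenuate magnitude bins within width_hz of bleed_hz,
    leaving the bin nearest protect_hz untouched."""
    out = spec.copy()
    target = np.abs(freqs - bleed_hz) <= width_hz   # bins around the bleed
    keep = np.argmin(np.abs(freqs - protect_hz))    # the bass note's own bin
    target[keep] = False                            # never touch the bass bin
    out[target, :] *= gain                          # pull the bleed down
    return out

# toy magnitude spectrogram: 2049 frequency bins (0..8000 Hz), 100 frames
freqs = np.linspace(0, 8000, 2049)
spec = np.ones((2049, 100))
cleaned = notch_bleed(spec, freqs, bleed_hz=290, protect_hz=300)
```

Protecting the bin nearest the bass note is what keeps the fundamental intact while the bleed a few Hz away gets pulled down, which is the "don't cut right ON the bass" point above.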

On the other end, if a hard consonant sound is left in the mix at a high frequency, particularly when there is no cymbal in that area, there is little to no sound to conflict with it at 15 kHz. Now, yes, the 'air' is important; without it you get this muted dip, but in those cases the vocal artifact will sit as its own blob of colour surrounded by the faint non-colour air. Just digging in at a really high zoom level and bringing that back so it's all level generally makes it all sound smooth while retaining the air, as opposed to just hacking it out in one big clump.

The same kind of philosophy can be applied to middle ranges. If there's a sharp artifact sound, bringing the heavier blob of colour down to blend with the colours around it is generally effective. There may be other instrumentation in that blob, yes, but if the volumes are equal around that space so the shape looks like it should, then I find it does sound as it should.
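That "bring the blob down to its surroundings" move from the last two paragraphs can be sketched roughly like this. The function and the toy magnitudes are invented for illustration; a spectral editor does something in this spirit when you attenuate a region rather than erase it:

```python
import numpy as np

def level_blob(mag, f0, f1, t0, t1):
    """Pull an over-loud blob of spectrogram magnitude down toward the
    median level of the same bands around it, rather than cutting to zero."""
    out = mag.copy()
    # estimate the surrounding 'colour' from the same bands before/after the blob
    surround = np.median(np.concatenate([mag[f0:f1, :t0].ravel(),
                                         mag[f0:f1, t1:].ravel()]))
    # cap the blob at that level; quieter bins inside it are left alone
    out[f0:f1, t0:t1] = np.minimum(out[f0:f1, t0:t1], surround)
    return out

# toy example: uniform 'air' at 1.0 with a loud artifact blob at 5.0
mag = np.ones((100, 50))
mag[60:70, 20:25] = 5.0
smoothed = level_blob(mag, 60, 70, 20, 25)
```

Capping at the surrounding level rather than zeroing is what keeps the air intact, instead of leaving the muted dip you get from hacking the whole clump out.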

I mean, I don't want to take the organic nature of sound out of it and be totally mechanical, but the spectrum is just a digital representation of what was played. Every frequency is a note, and the different volumes of those frequencies blended together are what give those notes the sound of a guitar, or a trumpet, or a voice. There's no way to ever be 100% accurate; if you repaint a painting it'll never be 100% the same as the original, but you can try to be as close as possible to it. And while that may not be enough for circumstances where only total purity will do, there is a matter of weighing at what point the difference becomes so undetectable that people just won't care in the context of what you're using it for.
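The "every frequency is a note" point can be pinned down with the standard equal-temperament relation, where MIDI note number n = 69 + 12·log2(f/440). A small sketch, assuming standard A440 tuning:

```python
import math

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(f):
    """Map a frequency in Hz to its nearest equal-tempered note name."""
    n = round(69 + 12 * math.log2(f / 440.0))  # nearest MIDI note number
    return NAMES[n % 12] + str(n // 12 - 1)

freq_to_note(440.0)   # "A4"
freq_to_note(261.63)  # "C4" (middle C)
```

Incidentally, both 290 Hz and 300 Hz from the bass-bleed example map to the same note here (D4), which is exactly why that kind of bleed hides so close to the bass on the spectrum.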

I totally struggle with this. I mean, I get waaayyy too fine with my work a lot of the time, when my friend just skips the 10 hours of fiddling, posts the track, and everyone still loves it just as much.
 
@Anjok
Can someone please explain an example of when the stacked pt2 model would be ideally used for a conversion?
I've just been using NP and it seems to do the best all around job so far. I don't have any high pitched Steve Perry or Rob Halford songs to convert so have not tested the HP one yet.
 
@anjok
Can someone please explain an example of when the stacked pt2 model would be ideally used for a conversion?
I've just been using NP and it seems to do the best all around job so far. I don't have any high pitched Steve Perry or Rob Halford songs to convert so have not tested the HP one yet.

Might need to make the "a" an "A" to properly tag him with it.
 
The problem with spectral editing is that taking something out usually takes something important in the same spectrum with it. In my opinion, it's always best to do minimal surgery for the highest quality results.
Still waiting for the vocal-making side to catch up. Long way to go. Pretty much all of the UVR vocals are washed out and unusable.

After I release my next models, I will be moving on to the vocal extractor for a while. I think I have enough acapellas to train a pretty decent model for it. The vocal tracks will be clean, just like the instrumentals are on UVR (it'll be the instrumentals that sound washed out on the vocal extractor).
 
Can someone please explain an example of when the stacked pt2 model would be ideally used for a conversion?

I would stick to Multi-Genre Model 2. It's such an improvement over NP/HP that those would be more of an 'alternative' look at anything which doesn't convert well.

Personally, I always use Stacked Model 2 to further convert every track I convert with Multi-Genre Model 2. I find that running Stacked Model 2 three times tends to be enough, though four is also fine. It's a track-by-track thing: on most songs it works well enough to justify using, though there are a few where it is not effective. You really have to listen to the acapellas it creates to hear what it pulled out of the instrumental, and decide case by case what you want to do with it.
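As a sketch of that multi-pass workflow: the `apply_stacked` callable below is a stand-in for however you invoke the stacked model in your own setup, not a real UVR API, and the dummy separator exists only to make the loop runnable.

```python
def refine(instrumental, apply_stacked, passes=3):
    """Run the stacked model over its own output several times,
    keeping each pass's extracted acapella for auditioning."""
    removed = []
    for _ in range(passes):
        instrumental, acapella = apply_stacked(instrumental)
        removed.append(acapella)  # listen to these before committing
    return instrumental, removed

# dummy separator for illustration: strips half the 'signal' each pass
halve = lambda x: (x / 2, x / 2)
final, pulled = refine(8.0, halve, passes=3)
# final == 1.0, pulled == [4.0, 2.0, 1.0]
```

Keeping `removed` around matches the advice above: you audition what each pass pulled out and decide per track whether it was vocal residue worth removing or instrumental detail worth keeping.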
 
I would stick to Multi-Genre Model 2. It's such an improvement over NP/HP that those would be more of an 'alternative' look at anything which doesn't convert well.
Personally, I always use Stacked Model 2 to further convert every track I convert with Multi-Genre Model 2. I find that running Stacked Model 2 three times tends to be enough, though four is also fine. It's a track-by-track thing: on most songs it works well enough to justify using, though there are a few where it is not effective. You really have to listen to the acapellas it creates to hear what it pulled out of the instrumental, and decide case by case what you want to do with it.

Wait, what is Multi-Genre Model 2?
I downloaded the updated models on 9/1; is that the same thing?
 
Ah, thank you! Got lost in the threads. Maybe pin it to the first post. I'm using a MacBook Pro with a Windows 7 64-bit virtual machine, and each song takes about 15-20 min.

Yeah. The Google Colab method is pretty sweet. You can use it in your browser, even on a MacBook Pro.
 
Ah, thank you! Got lost in the threads. Maybe pin it to the first post. I'm using a MacBook Pro with a Windows 7 64-bit virtual machine, and each song takes about 15-20 min.

As @NewAgeRipper mentioned, you can use Google Colab; it takes no more than a minute to process a song. It just requires some extra steps, like uploading and downloading your songs, but if you need faster conversion speeds this is the way.
 
***UPDATE***

Lite-records exclusive! This is a "sequel," so to speak, to the main Multi-Genre model released a few months ago. I haven't released this to my GitHub yet because I'm going to try for even better results, and I need to come out with a new accompanying stacked model.

This is by far the best model I've ever trained up until this point. Please let me know what you all think!

Multi-Genre Model 2 - click here

Do I stick this in the models folder in my vocalremover folder in my docs? Or do I already have this now with the new files I downloaded from the main thread?
 
Yeah. The Google Colab method is pretty sweet. You can use it in your browser, even on a MacBook Pro.

Nah, I'm not a fan of Google; I don't trust them with any of my files in the cloud, frankly. Watch the "Social Dilemma" documentary and you'll understand.
Can't wait for Anjok to release the last model though so he can tackle the vocal extraction dilemma :D
 
Do I stick this in the models folder in my vocalremover folder in my docs? Or do I already have this now with the new files I downloaded from the main thread?

You'll need to stick it into the models folder.

I'm in the process of updating the GUI and releasing new models, so it'll all be bundled together soon.
 
You'll need to stick it into the models folder.

I'm in the process of updating the GUI and releasing new models, so it'll all be bundled together soon.

I did, and ran a test track. Works well. Hopefully something can be trained with Evanescence. Their song "Snow White Queen" will not devocal well for some reason.