We present Subtractive Training, a simple and novel method for synthesizing individual musical instrument stems given other instruments as context. This method pairs a dataset of complete music mixes with 1) a variant of the dataset lacking a specific stem, and 2) LLM-generated instructions describing how the missing stem should be reintroduced. We then fine-tune a pretrained text-to-audio diffusion model to generate the missing instrument stem, guided by both the existing stems and the text instruction. Our results demonstrate Subtractive Training's efficacy in creating authentic drum stems that seamlessly blend with the existing tracks. We also show that we can use the text instruction to control the generation of the inserted stem in terms of rhythm, dynamics, and genre, allowing us to modify the style of a single instrument in a full song while keeping the remaining instruments the same. Lastly, we extend this technique to MIDI formats, successfully generating compatible bass, drum, and guitar parts for incomplete arrangements.
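The data pairing described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes time-aligned stems stored as equal-length NumPy arrays, and the function name `make_subtractive_example` and the dictionary keys are hypothetical. Because the text does not pin down the exact tensors fed to the diffusion model, the sketch returns both the isolated stem and the full mix.

```python
import numpy as np

def make_subtractive_example(stems, target, instruction):
    """Build one (context, target, instruction) example from aligned stems.

    stems: dict mapping a stem name to a 1-D float array (hypothetical layout).
    target: name of the stem to subtract, e.g. "drums".
    instruction: text describing how the stem should be reintroduced.
    """
    if target not in stems:
        raise KeyError(f"no stem named {target!r}")
    # Context = mix of every stem except the one being subtracted.
    context = np.zeros_like(stems[target])
    for name, audio in stems.items():
        if name != target:
            context = context + audio
    return {
        "context": context,                       # mix lacking the target stem
        "target_stem": stems[target],             # the stem to be generated
        "full_mix": context + stems[target],      # the complete mix
        "instruction": instruction,
    }
```

In practice the instruction would come from an LLM, as described above; here it is passed in as a plain string.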
In these examples, we take a full-mix song, remove its drums, and add new drums with our drum-insertion model. The drum insertion is guided by a text prompt.
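Example | Original Song | Remove Drums | Add Drums Using Our Diffusion Model | Text Prompt
1 | [audio] | [audio] | [audio] | Add rock drums
2 | [audio] | [audio] | [audio] | Add indie drums with punchy beats
3 | [audio] | [audio] | [audio] | Add reggae beats
4 | [audio] | [audio] | [audio] | Add soft acoustic drums to enhance emotion
5 | [audio] | [audio] | [audio] | Generate percussion with a lively Latin flair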
We use our model to edit the drum stem of an existing song with a text prompt, substantially changing its style. For instance, we can take a reggae song, remove its drums, and replace them with "jazz drums". This shows our method's ability to perform stem-wise editing and to combine genres in interesting ways.
Example | Original Song | Remove Drums | Add Drums Using Our Diffusion Model | Text Prompt
1 | [audio] | [audio] | [audio] | Add jazzy drums
2 | [audio] | [audio] | [audio] | Add reggae beats
3 | [audio] | [audio] | [audio] | Add aggressive rock drums with cymbal crashes
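The editing workflow in these examples (remove the drums, then insert new ones from a prompt) can be written as a small wrapper. This is a sketch only: `remove_drums` and `generate_drums` are hypothetical callables standing in for a source-separation step and the drum-insertion model, and the sketch assumes the model returns just the new drum stem, which is then summed with the drumless mix.

```python
import numpy as np

def replace_drums(mix, remove_drums, generate_drums, prompt):
    """Swap the drum stem of `mix` for one generated from `prompt`.

    remove_drums(mix) -> drumless audio            (hypothetical separator)
    generate_drums(drumless, prompt) -> drum stem  (hypothetical model call)
    """
    drumless = remove_drums(mix)                 # keep every other instrument
    new_drums = generate_drums(drumless, prompt) # conditioned on context + text
    if new_drums.shape != drumless.shape:
        raise ValueError("generated stem must match the context length")
    return drumless + new_drums                  # re-mix the edited song
```

Keeping the two steps as injected callables makes it easy to swap in a different separator or model without touching the re-mixing logic.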
To show the generality of our subtractive training paradigm, we also apply it to MIDI generation. We start with a MIDI file containing one or two instruments and generate another instrument on top of it.
Input Instrument(s) | Input MIDI | Added Instrument | Output MIDI
Bass and Drums | [MIDI] | Guitar | [MIDI]
Guitar | [MIDI] | Bass | [MIDI]
Drums and Guitar | [MIDI] | Bass | [MIDI]
Guitar and Bass | [MIDI] | Drums | [MIDI]
Drums and Bass | [MIDI] | Guitar | [MIDI]
Drums | [MIDI] | Guitar | [MIDI]
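The MIDI setting above pairs an incomplete arrangement with one instrument to add. One way such pairs could be derived from a multi-track file is sketched below. The note representation (a list of `(start_beat, pitch, duration_beats)` tuples per instrument) and the function name `midi_subtractive_pairs` are assumptions made for illustration, and enumerating every non-empty subset of the remaining instruments as context is likewise an assumption rather than a described procedure.

```python
from itertools import combinations

def midi_subtractive_pairs(tracks):
    """Enumerate (context_tracks, target_name, target_notes) triples.

    tracks: dict mapping an instrument name to its list of notes, each note
    a (start_beat, pitch, duration_beats) tuple (hypothetical layout).
    """
    names = sorted(tracks)
    pairs = []
    for target in names:
        others = [n for n in names if n != target]
        # Use every non-empty subset of the other instruments as context.
        for k in range(1, len(others) + 1):
            for ctx in combinations(others, k):
                context = {n: tracks[n] for n in ctx}
                pairs.append((context, target, tracks[target]))
    return pairs
```

For a three-instrument arrangement (bass, drums, guitar) this yields contexts such as bass and drums with guitar as the target, or drums alone with guitar as the target, matching the kinds of rows shown in the table.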