Q&A: Query-Based Representation Learning for Symbolic Music re-Arrangement

Authors: Jingwei Zhao, Gus Xia, and Ye Wang


Q&A (Query and re-Arrange) is a query-based system for multi-track symbolic music rearrangement, which aims to reconceptualize a musical piece while preserving its essential content. Such rearrangement scenarios include orchestration, piano cover generation, re-instrumentation, and voice separation, all of which are common tasks in music practice. Inspired by content-style disentanglement and transfer, Q&A is trained in a self-supervised manner to tackle all of these tasks within one unified framework. In our case, we define style as the textural functions of individual tracks and content as the overall harmonic and melodic structure. We show non-cherry-picked generation samples on this page to demonstrate the rearrangement performance of Q&A. For more technical details about our work, refer to our paper, GitHub repo, and Colab tutorial. This page is developed with the html-midi-player library and is best viewed in Google Chrome.

Q&A is published in the IJCAI 2023 Special Track on AI, the Arts and Creativity.

Orchestration

Given piano clip x and multi-track clip y, Q&A can orchestrate x using the style of y. The following tabs show orchestration results for 5 piano clips, each paired with 3 heuristically sampled donors of multi-track style. Refer to Eq. (8) in our paper to see how the heuristic sampling works.
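To give a rough feel for the content-style recombination behind this task, here is a deliberately simplified toy sketch. It is NOT the actual model: Q&A learns track functions with neural encoders, whereas this sketch stands in for each donor track's "function" with its mean pitch register and assigns each content note to the closest-register track. All function and variable names here are our own illustrative inventions.

```python
# Toy illustration of content-style recombination (NOT the Q&A model).
# "Style" of a donor track is reduced to its mean pitch register;
# "content" is a flat list of (pitch, onset) notes from the piano clip.
from statistics import mean

def track_register(track_notes):
    """Summarize a track's 'function' by its mean MIDI pitch
    (a crude stand-in for a learned track-function representation)."""
    return mean(pitch for pitch, _onset in track_notes)

def rearrange(content_notes, style_tracks):
    """Assign each content note to the donor track whose register
    is closest to the note's pitch."""
    registers = {name: track_register(notes)
                 for name, notes in style_tracks.items()}
    result = {name: [] for name in style_tracks}
    for pitch, onset in content_notes:
        best = min(registers, key=lambda name: abs(registers[name] - pitch))
        result[best].append((pitch, onset))
    return result

# Content x: a piano clip flattened into (MIDI pitch, onset-beat) pairs.
piano_x = [(60, 0), (64, 0), (67, 0), (48, 0), (72, 1)]

# Style donor y: a multi-track clip whose tracks imply registers.
multitrack_y = {
    "violin": [(76, 0), (79, 1)],
    "viola":  [(60, 0), (64, 1)],
    "cello":  [(43, 0), (48, 1)],
}

result = rearrange(piano_x, multitrack_y)
# e.g. the bass note (48, 0) lands in the cello track, while the
# high note (72, 1) lands in the violin track.
```

The real system replaces the register heuristic with learned track-function queries, which is what lets it capture texture and voicing rather than just pitch range.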


Piano Cover Generation

Given multi-track clip x and piano clip y, Q&A can generate a piano cover of x using the style of y. The following tabs show piano cover generation results for 5 multi-track clips, each paired with 3 heuristically sampled donors of piano style.


Re-Instrumentation

Given multi-track clips x and y, Q&A can rearrange x using the instruments, textures, and voicing (i.e., the style) of y. The following tabs show re-instrumentation results for 5 multi-track clips, each paired with 3 heuristically sampled style donors.


Voice Separation

Voice separation is a special case of orchestration, where the goal is simply to separate a solo (mixture) into individual voice tracks without any creative factor. Q&A performs voice separation by inferring individual track functions from the mixture itself, rather than referring to external style donors. The following tabs show voice separation results for 5 Bach chorales and 5 string quartets, each with a preset number of 4 voices.
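To make the task's input and output concrete, here is a naive baseline sketch, not the Q&A method: it splits a chordal mixture into a preset number of voices purely by pitch order at each onset (highest note to voice 1, and so on). Q&A instead infers learned track functions from the mixture; the names below are our own illustrative choices.

```python
# Naive pitch-order voice splitting (a baseline, NOT the Q&A method).
from collections import defaultdict

def split_voices(mixture, n_voices=4):
    """mixture: list of (onset, MIDI pitch) events.
    Returns n_voices lists, voice 0 highest (soprano) down to
    voice n_voices-1 lowest (bass)."""
    by_onset = defaultdict(list)
    for onset, pitch in mixture:
        by_onset[onset].append(pitch)
    voices = [[] for _ in range(n_voices)]
    for onset in sorted(by_onset):
        chord = sorted(by_onset[onset], reverse=True)  # high to low
        for i, pitch in enumerate(chord[:n_voices]):
            voices[i].append((onset, pitch))
    return voices

# A two-chord, chorale-like mixture (C major then G major).
mixture = [(0, 72), (0, 64), (0, 55), (0, 48),
           (1, 71), (1, 62), (1, 55), (1, 43)]
soprano, alto, tenor, bass = split_voices(mixture)
```

This baseline breaks as soon as voices cross in register; inferring per-track functions from the mixture, as Q&A does, is what allows musically coherent separation.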

Bach chorales: BWV 36.4-2, BWV 122.6, BWV 281, BWV 324, BWV 403
String quartets: Haydn Op. 64 No. 5 I, Mozart K. 465 I, Mozart K. 465 II, Beethoven Op. 135 III, Beethoven Op. 18 IV


Summary

To sum up, Q&A is a novel query-based framework for multi-track music rearrangement. Its main novelty lies in applying a style transfer methodology to the general rearrangement problem. By defining and utilizing track functions, we effectively capture the texture and voicing structure of multi-track music as composition style. Thanks to the self-supervised query system, the number of tracks and instruments used to rearrange a piece is virtually unconstrained. Q&A thus serves as a unified solution for orchestration, piano cover generation, re-instrumentation, and voice separation. Experiments show that it can both creatively rearrange a piece and faithfully preserve its essential structures.

For more technical details, refer to our paper, GitHub repo, and Colab tutorial.