On the Magenta team, we often face conflicting desires: as researchers we want to push forward the boundaries of what is possible with machine learning, but as tool-makers, we want our models to be understandable and controllable by artists and musicians. These desires have led us to focus much of our recent efforts on what are known as latent space models. The technical goal of this class of models is to represent the variation in a high-dimensional dataset using a lower-dimensional code, making it easier to explore and manipulate intuitive characteristics of the data. As a creative tool, the goal is to provide an intuitive palette with which a creator can explore and manipulate the elements of an artistic work.
Examples of latent space models we have developed include SketchRNN for sketches, NSynth for musical timbre, and now MusicVAE: a hierarchical recurrent variational autoencoder for learning latent spaces for musical scores. In an effort to make it as easy as possible to build usable tools with MusicVAE, we are also releasing a JavaScript library and pre-trained models for doing inference in the browser.
Continue reading to learn more about this technology, or check out these additional resources:
Read the technical details of the model architecture in our arXiv paper.
Play with MusicVAE’s 2-bar models in your browser with Melody Mixer, Beat Blender, and Latent Loops.
Learn how to use the JavaScript implementation in your own project with this tutorial.
Sample and interpolate with all of our models in a Colab Notebook.
View the TensorFlow and JavaScript implementations in our GitHub repository.
Hear more examples in the paper’s online supplement and this YouTube playlist.
Musical sequences are fundamentally high dimensional. For example, consider the space of all possible monophonic piano melodies. At any given time, exactly one of the 88 keys can be pressed down or released, or the player may rest. We can represent this as 90 types of events (88 key presses, 1 release, 1 rest). If we ignore tempo and quantize time down to 16th notes, two measures (bars) of music in 4/4 time will have 90^32 possible sequences. If we extend this to 16 bars, it will be 90^256 possible sequences, which is many times greater than the number of atoms in the Universe!
Exploring melodies by enumerating all possible variations is not feasible, and would result in lots of unmusical sequences that essentially sound random. For example, here we show “pianorolls” of samples randomly chosen from the 90^32 possible 2-bar sequences. The vertical axis represents the notes on the piano and the horizontal axis represents time in 16th-note steps. We also include the synthesized audio for one of the samples.
Aside from excluding unrealistic examples, latent spaces are able to represent the variation of real data in a lower-dimensional space. This means that they can also reconstruct real examples with high accuracy. Furthermore, when compressing the space of the dataset, latent space models tend to organize it based on fundamental qualities, clustering similar examples close together and laying out the variation along vectors defined by these qualities.
The desirable properties of a latent space can be summarized as follows:
Expression: Any real example can be mapped to some point in the latent space and reconstructed from it.
Realism: Any point in this space represents some realistic example, including ones not in the training set.
Smoothness: Examples from nearby points in latent space have similar qualities to one another.
These properties are similar to an artist’s palette on which she can explore and blend color options for a painting, and much like a palette, these properties can enhance creativity.
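As a rough check of the counting argument for monophonic piano melodies (90 event types per 16th-note step), here is a minimal Python sketch. It is not part of the MusicVAE code: the function and constant names are our own for illustration, the figure of roughly 10^80 atoms in the observable Universe is a commonly cited estimate that we assume here, and the pianoroll rendering assumes that a pressed key keeps sounding until the next press or release, with a rest leaving the current state unchanged.

```python
import random

NUM_KEYS = 88                 # piano keys: event ids 0..87 are key presses
RELEASE = NUM_KEYS            # event id 88: release the sounding key
REST = NUM_KEYS + 1           # event id 89: no new event at this step
NUM_EVENTS = NUM_KEYS + 2     # 90 event types in total
STEPS_PER_BAR = 16            # 4/4 time quantized to 16th notes

# Assumption: a commonly cited order-of-magnitude estimate, not from the post.
ATOMS_IN_UNIVERSE_ESTIMATE = 10 ** 80


def count_sequences(num_bars):
    """Number of distinct monophonic event sequences spanning num_bars bars."""
    return NUM_EVENTS ** (num_bars * STEPS_PER_BAR)


def random_sequence(num_bars, rng):
    """Draw one sequence uniformly at random from the full event space."""
    return [rng.randrange(NUM_EVENTS) for _ in range(num_bars * STEPS_PER_BAR)]


def to_pianoroll(events):
    """Render events as a NUM_KEYS x len(events) grid of 0/1 values.

    Rows are keys and columns are 16th-note steps. Assumed semantics:
    a press starts a note, a release ends it, a rest changes nothing.
    """
    roll = [[0] * len(events) for _ in range(NUM_KEYS)]
    sounding = None
    for step, event in enumerate(events):
        if event < NUM_KEYS:
            sounding = event
        elif event == RELEASE:
            sounding = None
        # REST: keep whatever is (or is not) currently sounding.
        if sounding is not None:
            roll[sounding][step] = 1
    return roll


if __name__ == "__main__":
    for bars in (2, 16):
        total = count_sequences(bars)
        print(f"{bars} bars: {len(str(total))} decimal digits,",
              "exceeds atom estimate:", total > ATOMS_IN_UNIVERSE_ESTIMATE)

    # A uniformly random 2-bar sample; most of these sound unmusical.
    events = random_sequence(2, random.Random(0))
    roll = to_pianoroll(events)
    active_steps = sum(max(row[s] for row in roll) for s in range(len(events)))
    print("steps with a sounding note:", active_steps, "of", len(events))
```

Sampling uniformly like this gives every one of the 90^32 two-bar sequences the same probability, which is why the results tend to sound random. A latent space model instead concentrates on the much smaller region occupied by realistic melodies.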