Use these tips to get more precise, consistent results when generating AI music. By clearly specifying musical details, like tempo, genre, mood, era, and instrumentation, you help the model better understand your creative intent and produce tracks that match your vision.
The Core Prompt Formula
For the highest quality output, include these four pillars in your text prompt: [Genre/Era] + [Instrumentation] + [Mood/Tempo] + [Vocal Profile]
- Genre & Era: Be specific (e.g., "1990s skate punk" or "2000s K-pop with a Motown edge").
- Instrumentation: List specific textures (e.g., "distorted bassline," "80s synth," or "acoustic piano").
- Mood & Tempo: Define the energy (e.g., "high energy, fast drums" or "melancholy, slow ballad").
- Vocal Profile: Specify gender, range, and timbre (e.g., "breathy female soprano" or "gritty male baritone").
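As a minimal sketch of the formula above, the four pillars can be assembled into a single prompt string in Python. The `build_prompt` helper and its parameter names are hypothetical and exist only for illustration. The function builds text and does not call any music-generation service.

```python
def build_prompt(
    genre_era: str,
    instrumentation: list[str],
    mood_tempo: str,
    vocal_profile: str,
) -> str:
    """Join the four pillars into one comma-separated prompt (hypothetical helper)."""
    parts = [genre_era, ", ".join(instrumentation), mood_tempo, vocal_profile]
    # Skip any pillar left empty so the prompt has no stray separators.
    return ", ".join(part.strip() for part in parts if part.strip())


prompt = build_prompt(
    genre_era="1990s skate punk",
    instrumentation=["distorted bassline", "fast drums"],
    mood_tempo="high energy",
    vocal_profile="gritty male baritone",
)
print(prompt)
```

Keeping each pillar as a separate argument makes it easy to change one element, such as the vocal profile, while holding the rest of the prompt constant.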
Suggest a style direction
Referencing a style direction can guide arrangement, instrumentation, and mood. Focus on describing the musical qualities you want rather than copying a specific melody or artist.
Example:
- “In the style of an upbeat indie pop anthem.”
Include a specific decade
Decade cues shape production style and sound palette. This helps evoke recognizable era aesthetics.
Examples:
- 1990s → grunge guitars, distorted guitars, boom-bap percussion
- 1980s → gated reverb, FM synthesis, drum machines
- 1970s → live band warmth, analog saturation, dry drums
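If you write prompts programmatically, the decade examples above could be stored as a small lookup table and appended to a base prompt. This is only a sketch: `DECADE_CUES` and `with_decade` are hypothetical names, and the table holds just the three decades listed here.

```python
# Hypothetical lookup table built from the decade examples above.
DECADE_CUES = {
    "1990s": ["grunge guitars", "distorted guitars", "boom-bap percussion"],
    "1980s": ["gated reverb", "FM synthesis", "drum machines"],
    "1970s": ["live band warmth", "analog saturation", "dry drums"],
}


def with_decade(base_prompt: str, decade: str) -> str:
    """Append a decade and its production cues to a prompt (hypothetical helper)."""
    cues = DECADE_CUES.get(decade)
    if cues is None:
        # Unknown decade: name it and leave the production details unspecified.
        return f"{base_prompt}, {decade} production"
    return f"{base_prompt}, {decade} production with " + ", ".join(cues)


print(with_decade("upbeat indie pop anthem", "1980s"))
```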
Ask for specific instruments
Naming instruments steers arrangement and texture.
Example:
- Piano, trumpet, strings
Combine core instruments with descriptors for more precise timbre.
Examples:
- Bright: High-end clarity for pop or dance
- Distorted: Gritty textures for rock or industrial
- Airy: Breathy woodwinds or spacious pads for ambient
- Lo-fi: Muffled, nostalgic, or "dusty" textures
Write an exact tempo
Specify BPM (beats per minute) to control the track’s pacing and energy. This helps the model match the intended groove and makes the result easier to sync with video or edits.
Examples:
- 72 BPM: Ideal for calm, cinematic, or ballad-style tracks
- 120 BPM: Standard for "four-on-the-floor" pop and house music
- 140+ BPM: Best for high-energy electronic, trap, or drum and bass
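When syncing a track to video, it can help to convert BPM into beat and bar lengths before choosing a tempo. The sketch below shows that arithmetic. It assumes 4/4 time (four beats per bar), which the tips above do not specify, and the function names are hypothetical.

```python
def seconds_per_beat(bpm: float) -> float:
    """Length of one beat in seconds at the given tempo."""
    return 60.0 / bpm


def bars_in_clip(bpm: float, clip_seconds: float, beats_per_bar: int = 4) -> float:
    """Number of bars that fit in a clip, assuming 4/4 time by default."""
    beats = clip_seconds / seconds_per_beat(bpm)
    return beats / beats_per_bar


for bpm in (72, 120, 140):
    beat = seconds_per_beat(bpm)
    bars = bars_in_clip(bpm, clip_seconds=30)
    print(f"{bpm} BPM: {beat:.3f} s per beat, {bars:.1f} bars in a 30 s clip")
```

At 120 BPM, for example, a beat lasts 0.5 seconds, so a 30-second clip holds 60 beats, or 15 bars in 4/4. A tempo that gives a whole number of bars for your clip length tends to make cuts easier to align with the music.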
Use an image as a reference
Upload images or videos to serve as "musical anchors." The model analyzes visual cues to determine the sonic atmosphere.
Turn Visual Ads into Audio
If you have a banner or promotional image, you can transform it into a matching soundtrack.
- Great for adding background music to digital ads
- Helps reinforce branding through sound
- Quick way to prototype audio for campaigns
Mood-Based Music Generation
Use images as a mood reference instead of writing detailed prompts.
- A dark alley scene → gritty, old-time jazz
- A sunny beach photo → light, tropical music
- A futuristic city → electronic or synth-based sounds
This is especially helpful when you know the vibe you want but don’t want to describe it in words.
Fun & Personal Creations
You can also use images for playful or personal music generation.
- “Make a song about my friend” using their photo
- Generate music inspired by outfits, expressions, or settings
- Create unique, personalized audio content for sharing