FMOD Unity: Beat-Mapping

Unpacking our GGJ2020 rhythm game experiment

Colin Vandervort
Feb 14, 2020

This past Global Game Jam (GGJ2020), our team decided to create a rhythm game. For the project, I primarily took on the task of hooking up and creating a beat-mapping system. In this article, I’ll cover some of the decisions and reasoning that went into creating this beat-mapping system (for more thoughts on the jam itself, I’ve put together a short retrospective here).

Going into the jam, I already had a pretty good idea of how to set up simple beat detection using FMOD with Unity, so I used this as a foundational framework. On its own, beat detection is enough for a real-time rhythm game such as Crypt of the NecroDancer; however, my team was keen on making something in the vein of Guitar Hero or Audiosurf. For this style of game, we’d need two main things that plain beat detection doesn’t provide:

  • Prior notice on beat events for telegraphing purposes.
  • An intuitive beat-event layout (a beat-map).

Beat Detection

First off, let’s go over building a beat detection system in FMOD/Unity that pulls bar, beat, tempo, timing and position, and marker info from an FMOD event.
Much of this section was based off of the FMOD Timeline Callbacks example.

To start, we’ll create a Unity MonoBehaviour (MusicManager class in this case), and include a few namespaces. We’ll be working with unmanaged memory using the GCHandle struct and the Marshal class so we’ll need to include the System.Runtime.InteropServices namespace (the use of unmanaged memory is necessary in order to avoid losing pointer references due to garbage collection). We’ll also need to include the System namespace to give us access to the IntPtr struct required for some of the unmanaged memory operations we’ll be performing. Finally, we’ll be using the FMODUnity namespace to give us quick access to the EventRef attribute and RuntimeManager class.

Next, let’s go over the data types we’ll be using. We’ll start by creating a static field for the MusicManager class (me), and storing the MonoBehaviour itself (this) in this field on Awake. This will allow us to access data in the MusicManager from across our project, which will be necessary (or very convenient at least) for our beat event spawning system later on (there may be better solutions for more complex projects, but for our purposes, this is sufficient).

We’ll also create a string field (music) for our music event. This should be private, as we only need access to it within MusicManager, and we’ll need the EventRef attribute to let Unity know that this is looking for FMOD Event strings specifically. We also want the SerializeField attribute to allow the private variable’s field to be exposed in the editor.

Additionally, we’ll create a TimelineInfo class and a public TimelineInfo variable (timelineInfo) that we’ll use to store and read our beat, bar, tempo, and related info. This class uses the StructLayout(LayoutKind.Sequential) attribute, as the data it contains interacts with unmanaged memory.

TimelineInfo will be keeping track of an int for the current bar and current beat of the music (currentBar/currentBeat), a float for the current tempo (currentTempo), an int for the current millisecond position in the song (currentPosition), a float for the song length in milliseconds (songLength), and an FMOD blittable string (wrapper) for storing the name of the most recent marker (lastMarker). We won’t be using all of these for this demonstration, but for the sake of common tangential use cases, I’ve included them anyway.
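As a rough sketch of the fields described so far, the class skeleton could look something like the following. This assumes the FMOD Unity integration (2.00.x) is installed; the field names follow the ones used in this article, and the default values are my own placeholders.

```csharp
using System;                          // IntPtr (used by later steps)
using System.Runtime.InteropServices;  // GCHandle, Marshal, StructLayout
using UnityEngine;
using FMODUnity;                       // EventRef attribute, RuntimeManager

public class MusicManager : MonoBehaviour
{
    // Static reference so other scripts can read MusicManager.me.timelineInfo.
    public static MusicManager me;

    // FMOD event path, picked in the Inspector thanks to EventRef.
    [SerializeField]
    [EventRef]
    private string music;

    // Sequential layout, since the callback reaches this object through a handle.
    [StructLayout(LayoutKind.Sequential)]
    public class TimelineInfo
    {
        public int currentBar = 0;
        public int currentBeat = 0;
        public float currentTempo = 0f;
        public int currentPosition = 0;   // milliseconds
        public float songLength = 0f;     // milliseconds
        public FMOD.StringWrapper lastMarker = new FMOD.StringWrapper();
    }

    public TimelineInfo timelineInfo;

    private void Awake()
    {
        // Store the MonoBehaviour itself for project-wide access.
        me = this;
    }
}
```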

There are still a couple more class-wide data types we’ll need.
Firstly, we’ll want a GCHandle to allow us to talk back and forth between managed objects and unmanaged memory (timelineHandle).

Next, we’ll want an FMOD EVENT_CALLBACK variable (beatCallback) to track beat callback parameters with, and an FMOD EventDescription variable (descriptionCallback) to specifically capture the event length.

Now, for our first independently functional bit of code, we’ll create an FMOD EventInstance (musicPlayEvent) for tracking the music playback event. In Awake, we’ll create an instance of the music event and assign it to musicPlayEvent, and then start actual event playback by calling start on the event instance.

For this next chunk, we’ll be setting our hooks into FMOD’s event callback system. To begin, we’ll assign timelineInfo a new instance of the TimelineInfo class on Start (this could probably be done on Awake instead, if desired).

Following this is one of our more interesting calls. We’ll be assigning a new FMOD EVENT_CALLBACK that delegates to a static FMOD RESULT BeatEventCallback function to beatCallback (more on this in a bit).

Next, we’ll call GCHandle.Alloc on our managed timelineInfo object, requesting a pinned handle, and save the result into timelineHandle. This prevents the object from being moved in memory or disposed of by garbage collection until the handle is manually freed.

Now that we’ve pinned our object, we can hand out a pointer-sized reference to it. We want the callback to be able to reach timelineInfo through musicPlayEvent, so we’ll call setUserData on the event, and pass in the handle via GCHandle.ToIntPtr().
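To make the handle round trip a little more concrete, here is a minimal sketch that runs outside Unity and FMOD. TimelineInfoDemo, HandleRoundTrip, and WriteBar are hypothetical names for illustration; WriteBar stands in for the native callback, which only ever receives an IntPtr. Note that this sketch uses a normal (unpinned) handle, which is enough to keep the object alive and pass it through ToIntPtr and FromIntPtr.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for the article's TimelineInfo class.
public class TimelineInfoDemo
{
    public int currentBar;
}

public static class HandleRoundTrip
{
    // Plays the role of the callback: all it gets is an opaque pointer.
    public static void WriteBar(IntPtr userData, int bar)
    {
        GCHandle handle = GCHandle.FromIntPtr(userData);
        var info = (TimelineInfoDemo)handle.Target;
        info.currentBar = bar;
    }

    public static int Run()
    {
        var info = new TimelineInfoDemo();
        GCHandle handle = GCHandle.Alloc(info);   // keeps 'info' from being collected
        try
        {
            IntPtr userData = GCHandle.ToIntPtr(handle);  // what setUserData would receive
            WriteBar(userData, 7);
            return info.currentBar;
        }
        finally
        {
            handle.Free();   // always release the handle when done
        }
    }
}
```

The managed object is never copied here; the pointer is just a ticket that lets the callback find it again.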

With the user data connection set, we can now tell the musicPlayEvent to listen and reply to specific data changes with the setCallback function. For the function parameters, we’ll forward the call to our beatCallback event callback delegate, and pass in a mask that filters out everything other than ‘beat’ and ‘marker’ changes.
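Put together, the Awake and Start steps above might look roughly like this. It is a sketch modeled on the FMOD Timeline Callbacks example for 2.00.x, not our exact jam code; MusicHookSketch is a hypothetical class name, TimelineInfo is trimmed to one field, and the callback body is stubbed out. Later FMOD releases changed the callback signature, so check the version you are on.

```csharp
using System;
using System.Runtime.InteropServices;
using UnityEngine;

public class MusicHookSketch : MonoBehaviour
{
    [FMODUnity.EventRef]
    [SerializeField] private string music;

    [StructLayout(LayoutKind.Sequential)]
    private class TimelineInfo { public int currentBar; }

    private TimelineInfo timelineInfo;
    private GCHandle timelineHandle;
    private FMOD.Studio.EVENT_CALLBACK beatCallback;
    private FMOD.Studio.EventInstance musicPlayEvent;

    private void Awake()
    {
        // Create the event instance and begin playback.
        musicPlayEvent = FMODUnity.RuntimeManager.CreateInstance(music);
        musicPlayEvent.start();
    }

    private void Start()
    {
        timelineInfo = new TimelineInfo();

        // Keep the delegate in a field so it is not collected while FMOD uses it.
        beatCallback = new FMOD.Studio.EVENT_CALLBACK(BeatEventCallback);

        // Pin the object and pass its handle through the event's user data.
        timelineHandle = GCHandle.Alloc(timelineInfo, GCHandleType.Pinned);
        musicPlayEvent.setUserData(GCHandle.ToIntPtr(timelineHandle));

        // Only listen for beat and marker callbacks.
        musicPlayEvent.setCallback(beatCallback,
            FMOD.Studio.EVENT_CALLBACK_TYPE.TIMELINE_BEAT |
            FMOD.Studio.EVENT_CALLBACK_TYPE.TIMELINE_MARKER);
    }

    private void OnDestroy()
    {
        musicPlayEvent.setUserData(IntPtr.Zero);
        musicPlayEvent.stop(FMOD.Studio.STOP_MODE.IMMEDIATE);
        musicPlayEvent.release();
        if (timelineHandle.IsAllocated) timelineHandle.Free();
    }

    [AOT.MonoPInvokeCallback(typeof(FMOD.Studio.EVENT_CALLBACK))]
    private static FMOD.RESULT BeatEventCallback(
        FMOD.Studio.EVENT_CALLBACK_TYPE type,
        FMOD.Studio.EventInstance instance,
        IntPtr parameterPtr)
    {
        return FMOD.RESULT.OK;   // stub: the real body reads beat and marker data
    }
}
```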

Now we’re covered for all our beat, tempo, and marker info, but we still need the length of our music event, and the current position. We’ll be using these to track overall music event progress, as well as beat input accuracy.

In FMOD, event callbacks are used to track data that changes over the course of an event lifecycle. However, the length of an event is a fixed data value and as such, is built into an “event description.” To capture the event description, we simply call getDescription on the event. Then, with the description as an output from the getDescription function, we’ll call getLength, with an int output (milliseconds) that we’ll call ‘length.’

Now, because we’re not dealing with event callbacks and unmanaged memory, we can save length directly into the songLength field in timelineInfo. We could have pointed the output of getLength straight at timelineInfo.songLength, except that we have a type mismatch: getLength outputs an int, while songLength is a float due to functionality elsewhere in our project.

All we have left to capture now is our current position in milliseconds. This is handled in pretty much the same way, except that we’re now going to check and store our position on every frame using Update, instead of just once.

For this, all we need to do is call getTimelinePosition on the event instance, and because the types line up, we can store the millisecond (int) output straight into timelineInfo.currentPosition.
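Here is a small sketch of the length and position queries, again assuming the FMOD Unity integration (2.00.x). MusicProgressSketch is a hypothetical standalone component; in the article’s setup these values live on timelineInfo inside MusicManager instead.

```csharp
using UnityEngine;

public class MusicProgressSketch : MonoBehaviour
{
    [FMODUnity.EventRef]
    [SerializeField] private string music;

    private FMOD.Studio.EventInstance musicPlayEvent;
    private FMOD.Studio.EventDescription descriptionCallback;

    public float songLength;      // milliseconds (float, as in the article)
    public int currentPosition;   // milliseconds

    private void Start()
    {
        musicPlayEvent = FMODUnity.RuntimeManager.CreateInstance(music);
        musicPlayEvent.start();

        // The length is fixed, so it lives on the event description.
        musicPlayEvent.getDescription(out descriptionCallback);
        int length;
        descriptionCallback.getLength(out length);
        songLength = length;      // int converts implicitly to float
    }

    private void Update()
    {
        // The types line up here, so the output goes straight into the field.
        musicPlayEvent.getTimelinePosition(out currentPosition);
    }

    private void OnDestroy()
    {
        musicPlayEvent.stop(FMOD.Studio.STOP_MODE.IMMEDIATE);
        musicPlayEvent.release();
    }
}
```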

Now that all of our callback logic is set up, we can talk about the elephant in the room: the BeatEventCallback function.

To start us off, the AOT.MonoPInvokeCallback attribute tells ahead-of-time compilers that the function will be called from unmanaged code, as a delegate of type EVENT_CALLBACK. The function itself returns an FMOD.RESULT, which is FMOD’s error code type. For inputs, the function is looking for an event callback type (beats and markers in our case), the event instance that raised the callback, and a pointer to the parameter data that accompanies that callback.

For the logic inside the function, we begin by grabbing the RESULT of getUserData on our event instance, as well as a pointer reference (timelineInfoPtr) to this data. From here, if the result returns an error, we log it and end the function (with a RESULT.OK return, regardless of the outcome).

If our RESULT doesn’t return an error, we continue, and check that our pointer is set to a nonzero value. If this passes, we grab our timelineHandle and from there, our timelineInfo object to store event callback data in (if we make it here, some sort of event data has been found).

Finally, we check the type of the event callback. If a TIMELINE_BEAT is found, we marshal the accompanying TIMELINE_BEAT_PROPERTIES data (which includes bar, beat, and tempo) into a local variable (parameter), and copy those values across to their respective fields in timelineInfo. If a TIMELINE_MARKER is found, we do the same, but for timelineInfo.lastMarker.
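The callback described above could be sketched as follows. It follows the shape of the FMOD Timeline Callbacks example for 2.00.x rather than reproducing our jam code; BeatCallbackSketch is a hypothetical wrapper class, and the TimelineInfo here is trimmed to the fields the callback writes.

```csharp
using System;
using System.Runtime.InteropServices;
using UnityEngine;

[StructLayout(LayoutKind.Sequential)]
public class TimelineInfo
{
    public int currentBar;
    public int currentBeat;
    public float currentTempo;
    public FMOD.StringWrapper lastMarker = new FMOD.StringWrapper();
}

public static class BeatCallbackSketch
{
    [AOT.MonoPInvokeCallback(typeof(FMOD.Studio.EVENT_CALLBACK))]
    public static FMOD.RESULT BeatEventCallback(
        FMOD.Studio.EVENT_CALLBACK_TYPE type,
        FMOD.Studio.EventInstance instance,
        IntPtr parameterPtr)
    {
        // Retrieve the pointer we stored earlier with setUserData.
        IntPtr timelineInfoPtr;
        FMOD.RESULT result = instance.getUserData(out timelineInfoPtr);
        if (result != FMOD.RESULT.OK)
        {
            Debug.LogError("Timeline Callback error: " + result);
        }
        else if (timelineInfoPtr != IntPtr.Zero)
        {
            // Turn the pointer back into our managed TimelineInfo object.
            GCHandle timelineHandle = GCHandle.FromIntPtr(timelineInfoPtr);
            TimelineInfo timelineInfo = (TimelineInfo)timelineHandle.Target;

            switch (type)
            {
                case FMOD.Studio.EVENT_CALLBACK_TYPE.TIMELINE_BEAT:
                {
                    var parameter = (FMOD.Studio.TIMELINE_BEAT_PROPERTIES)Marshal.PtrToStructure(
                        parameterPtr, typeof(FMOD.Studio.TIMELINE_BEAT_PROPERTIES));
                    timelineInfo.currentBar = parameter.bar;
                    timelineInfo.currentBeat = parameter.beat;
                    timelineInfo.currentTempo = parameter.tempo;
                    break;
                }
                case FMOD.Studio.EVENT_CALLBACK_TYPE.TIMELINE_MARKER:
                {
                    var parameter = (FMOD.Studio.TIMELINE_MARKER_PROPERTIES)Marshal.PtrToStructure(
                        parameterPtr, typeof(FMOD.Studio.TIMELINE_MARKER_PROPERTIES));
                    timelineInfo.lastMarker = parameter.name;
                    break;
                }
            }
        }
        // Report OK to FMOD regardless of the outcome above.
        return FMOD.RESULT.OK;
    }
}
```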

In the semi-likely event that we’ll ever want to destroy our MusicManager object, there are a few things we must make sure we include.

Using OnDestroy, we can stop playback and release the resources we’re holding (so as to avoid memory leaks and other complications). First of all, we’ll call setUserData on our music event and zero out our pointer. We’ll also call stop on the event, and in this case, set the stop mode to ‘immediate’ (it doesn’t have to be immediate; fades could be allowed if desired). Finally, we’ll call release on the event instance to mark it for destruction, and call Free on our timelineHandle to allow the timelineInfo object to be garbage collected.
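As a sketch, the teardown steps can be collected into one helper that OnDestroy calls. MusicTeardownSketch and Teardown are hypothetical names; the FMOD calls are the ones named above, and the IsAllocated guard is my own addition to make a double call harmless.

```csharp
using System;
using System.Runtime.InteropServices;

public static class MusicTeardownSketch
{
    // Call from OnDestroy, e.g. Teardown(musicPlayEvent, ref timelineHandle);
    public static void Teardown(FMOD.Studio.EventInstance musicPlayEvent,
                                ref GCHandle timelineHandle)
    {
        // Clear the user data first so a late callback finds a zero pointer.
        musicPlayEvent.setUserData(IntPtr.Zero);

        // Stop right away; ALLOWFADEOUT is the alternative if fades are wanted.
        musicPlayEvent.stop(FMOD.Studio.STOP_MODE.IMMEDIATE);

        // Mark the instance for destruction once it has stopped.
        musicPlayEvent.release();

        // Release the handle so timelineInfo can be garbage collected.
        if (timelineHandle.IsAllocated)
        {
            timelineHandle.Free();
        }
    }
}
```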

As a final note before moving on from our beat detection framework, if you’d like to preview or debug the timelineInfo data in the editor, OnGUI can be a really solid solution.

Building Note Events in Unity

For the note events, I worked heavily with Amorphous (Check their work out, they make cool game and non-game things). In fact, they did most of the programming here, so I’ll stick largely to the conceptual details.

We made the initial design assumption that we’d work with only one song, but have multiple tracks within this song for beat events to approach on. We also wanted preemptive notice (a visual ‘tell’, as hinted at above) of when beats should land, and a grace window surrounding the ideal hit timings for notes, extending to both early and late inputs. Through early testing, we realized that a non-linear note marker approach speed, with a smooth slowdown in the last moments of translation, both heightened the sense of velocity and intensity of the experience and felt smoother and more fair to the player.

Given the nature of our implementation (a streamed beat-map rather than a buffered beat-map), the best we could do to preempt note events was to spawn them at a fixed interval ahead of when the beat event in the song would occur. We chose 4 seconds as our offset, as it was long enough to let notes spawn far enough in the distance that they weren’t visible, but not so long as to make the delay before the first note event hits feel uncomfortably drawn out. From there, we’d instantiate the note with a fairly high translation velocity and smoothly scale it down to almost nothing using Unity’s SmoothStep function, moments before the note should be pressed.
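To illustrate the slowdown, here is a minimal sketch of a speed curve that stays fast for most of the approach and eases down near the end. The SmoothStep method reimplements the same curve as Unity’s Mathf.SmoothStep so that the sketch runs outside Unity. NoteApproach, SpeedAt, the 500ms slowdown window, and the speed values are all assumptions for illustration; only the 4-second lead time comes from our project.

```csharp
using System;

public static class NoteApproach
{
    // Same easing curve as Unity's Mathf.SmoothStep(from, to, t).
    public static float SmoothStep(float from, float to, float t)
    {
        t = Math.Max(0f, Math.Min(1f, t));   // clamp to [0, 1]
        t = t * t * (3f - 2f * t);           // ease in and out
        return from + (to - from) * t;
    }

    // Speed of a note 'elapsedMs' after it spawned.
    // All default values except leadTimeMs are hypothetical tuning numbers.
    public static float SpeedAt(float elapsedMs,
                                float leadTimeMs = 4000f,
                                float slowdownMs = 500f,
                                float fastSpeed = 60f,
                                float slowSpeed = 1f)
    {
        float slowdownStart = leadTimeMs - slowdownMs;
        float t = (elapsedMs - slowdownStart) / slowdownMs;
        // Before the slowdown window t <= 0, so the note travels at fastSpeed.
        return SmoothStep(fastSpeed, slowSpeed, t);
    }
}
```

In Unity, the returned speed would be multiplied by Time.deltaTime and applied to the note’s position each frame.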

This was a purely visual “game-feel” implementation; the mechanical implementation was driven entirely through timers and offsets. To achieve this, we’d grab timelineInfo.currentPosition (the current position in the song in milliseconds) from the MusicManager on note instantiation, add the fixed offset (4 seconds, or 4000 milliseconds), and save this as our target. Then, on player input, we’d check whether the input landed within a certain threshold (400ms, or 0.4 seconds) of our target. If not, the input would be ignored, but if so, we’d check against a series of increasingly tight windows to determine if the input was a miss, a hit, or a “perfect” hit. Additionally, if a note ever went past the target plus the 400ms window threshold, we’d discard the note and count it as a miss.
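The timing logic above can be sketched in plain C# like this. The 4000ms offset and 400ms outer threshold come from our project; the inner hit and perfect window sizes, along with the names NoteJudge and Judgement, are assumptions for illustration.

```csharp
using System;

public enum Judgement { Ignored, Miss, Hit, Perfect }

public static class NoteJudge
{
    public const int SpawnOffsetMs = 4000;   // fixed spawn-ahead offset
    public const int InputWindowMs = 400;    // outside this, input is ignored
    public const int HitWindowMs = 200;      // assumed value
    public const int PerfectWindowMs = 60;   // assumed value

    // Target time for a note spawned at the given song position.
    public static int TargetFor(int spawnPositionMs)
    {
        return spawnPositionMs + SpawnOffsetMs;
    }

    // Classify an input by how far it landed from the target, early or late.
    public static Judgement Judge(int inputPositionMs, int targetMs)
    {
        int error = Math.Abs(inputPositionMs - targetMs);
        if (error > InputWindowMs) return Judgement.Ignored;
        if (error <= PerfectWindowMs) return Judgement.Perfect;
        if (error <= HitWindowMs) return Judgement.Hit;
        return Judgement.Miss;
    }

    // A note that passes the target plus the outer window is discarded as a miss.
    public static bool HasExpired(int currentPositionMs, int targetMs)
    {
        return currentPositionMs > targetMs + InputWindowMs;
    }
}
```

For example, a note spawned at 1000ms targets 5000ms, so an input at 5030ms would count as a perfect hit under these assumed windows.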

Building a Beat-map in FMOD

Given the scope of the project, and the consequential decision to stick to mapping a single song, we decided a manual beat-mapping approach would make the most sense.

FMOD is in no way an optimal tool for this task, but it does a surprisingly decent job. Tempo markers allow you to dynamically set a BPM and time signature to assist with beat marker quantization, but you have the option of free-handing as well.

For the beat events themselves, we used FMOD destination markers (more on this in the next section). After aligning each destination marker with its respective gameplay-relevant beat event, all that’s left unaccounted for is our 4-second spawn-in offset. Because we’re streaming all our data in, rather than trying to buffer 4+ seconds of it, the easiest solution here, and our chosen solution, is to simply change the FMOD event display mode from ‘Beats’ to ‘Time,’ move the tempo marker exactly 4 seconds later, and then select all markers on the logic track and drag them forward 4 seconds. Using the tempo marker as a drag point or anchor will preserve any note quantization in the destination markers that might otherwise be lost when trying to eyeball the marker translations by the 4-second offset.

The final draft of the beat-map, featuring music composed by Macaulay Szymanski.

Spawning Beat-map Events in Unity

In FMOD, destination markers are typically used in combination with transition markers or transition regions to allow dynamic repositioning of the playhead within an FMOD event. In our case however, we’re simply using them in tandem with our BeatEventCallback function in Unity, to note any change in recent TIMELINE_MARKER callbacks. In other words, any time the playhead of the music event passes a new marker, it will return the blittable string name of that destination marker to our lastMarker variable in our timelineInfo struct.

Three FMOD Destination markers from the beat-map (note: vertical offset between markers has no technical function and is purely for my organizational sanity).

The naming of destination markers is functionally critical for our system. The track that a note marker will be generated on is determined by the first character in the (string) name of the marker. The only other necessary criterion is that sequential markers with identical first characters differ somewhere in their names. If two markers have identical names and follow each other in sequence, the second marker will be ignored by our system (fortunately, duplicated destination markers in FMOD Studio are automatically given a unique suffix).

Our marker detection system, first and foremost, is assigned an array of tracks. The order of tracks in the array correlates with the numeric first character in the destination markers. For example, if a marker name is ‘1.2_DT_B,’ a beat event will be spawned on the first track in the array, whereas ‘2.2_DS’ will spawn one on the second track in the array.
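A minimal sketch of that name-to-track mapping might look like this. MarkerTrack and TrackIndexFor are hypothetical names, and returning -1 for names that don’t start with a usable digit is my own convention here, not necessarily how our jam code handled it.

```csharp
public static class MarkerTrack
{
    // Maps a marker name such as "1.2_DT_B" to a zero-based track index.
    // Returns -1 when the first character is not a digit for an existing track.
    public static int TrackIndexFor(string markerName, int trackCount)
    {
        if (string.IsNullOrEmpty(markerName)) return -1;

        char first = markerName[0];
        if (first < '1' || first > '9') return -1;

        int index = first - '1';   // '1' is the first track in the array
        return index < trackCount ? index : -1;
    }
}
```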

From here, we’re comparing the name of the previous ‘stored’ marker (lastMarkerName) to the most recently ‘read’ marker (timelineInfo.lastMarker) on Update (there may be a more efficient way of doing this bit using ID instead of string, but I’m not immediately aware of it). If these two ever differ, we’ve found a new note. In response, we toggle the bool beatSpawnArmed to true and update lastMarkerName.

Now, with beatSpawnArmed reading true, we check only the first character of the lastMarkerName by treating it as an array of characters as opposed to a string. Depending on which character is read, and whether or not the respective track is active, we call the CreateNote function on the target track, and pass in our current position in the music event (song). From there, CreateNote simply instantiates a note on the target track, completing the beat-map spawning system.
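The polling and spawning flow can be sketched without Unity as a small class whose Poll method would be called from Update. MarkerWatcher and its constructor parameters are hypothetical; the createNote delegate stands in for the track’s CreateNote function, and resetting beatSpawnArmed after each spawn attempt is an assumption about how the flag is cleared.

```csharp
using System;

public class MarkerWatcher
{
    private string lastMarkerName = "";
    private bool beatSpawnArmed;
    private readonly bool[] trackActive;
    private readonly Action<int, int> createNote;   // (trackIndex, songPositionMs)

    public MarkerWatcher(bool[] trackActive, Action<int, int> createNote)
    {
        this.trackActive = trackActive;
        this.createNote = createNote;
    }

    // Call once per frame with timelineInfo.lastMarker and currentPosition.
    public void Poll(string lastMarker, int currentPositionMs)
    {
        // A differing name means the playhead passed a new marker.
        if (lastMarker != lastMarkerName)
        {
            lastMarkerName = lastMarker;
            beatSpawnArmed = true;
        }

        if (!beatSpawnArmed) return;
        beatSpawnArmed = false;   // assumed: disarm after one spawn attempt

        if (string.IsNullOrEmpty(lastMarkerName)) return;

        // Only the first character decides the track ('1' is the first track).
        int track = lastMarkerName[0] - '1';
        if (track >= 0 && track < trackActive.Length && trackActive[track])
        {
            createNote(track, currentPositionMs);
        }
    }
}
```

Because the comparison is by name, two identically named markers in a row produce only one note, which is the behavior described earlier.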

A special thanks to Sam, Amorphous, and Mac for joining me on this adventure of a game jam project.

Feel free to reach out if you have any additional questions regarding the article or FMOD Unity beat-mapping systems, and as always, best of luck with your own projects!

Up to date as of FMOD version 2.00.07