The Movement Babel Problem: Why Motion Capture, Physical Therapy, and Game Animation Can't Talk to Each Other

Five Languages for the Same Thing

Imagine you are a researcher trying to understand how a particular movement pattern leads to injury. You want to combine data from three sources: a biomechanics lab that studied the movement under controlled conditions using force plates and reflective markers, a physical therapy practice that has treated two hundred patients with injuries related to this movement, and a professional sports organization that has motion capture data from dozens of athletes performing the same movement over multiple seasons.

Each of those sources has exactly the data you need. Each has been recording that data carefully, for years, in digital formats. And yet combining them would require a software engineering project of significant scope — because they each recorded that data in a completely different format, using completely different vocabulary, designed around completely different use cases.

This is the Movement Babel Problem: multiple communities studying the same subject, each speaking a technical language the others cannot read.

How Each Field Describes Movement

To understand why the languages are incompatible, it helps to see what each field actually captures when it records movement.

Biomechanics and motion capture

Biomechanics research records movement as kinematics: the position, velocity, and acceleration of body segments through space over time. The common formats here are C3D (a binary format common in clinical and research motion capture), along with two formats that come from animation pipelines but are often used to exchange capture data: BVH (Biovision Hierarchy, widely used in 3D animation) and FBX (Autodesk's format, dominant in game and film production).

These formats are excellent at answering the question: where was each part of the body, at each moment in time? They are poor at answering questions like: which muscles were producing force, at what activation levels, against what external resistance? Kinematic formats describe the output of movement, not the physiological mechanism that produced it.
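
To make the kinematic view concrete, here is a minimal Python sketch of the kind of quantity these formats let you derive: a joint angle from three marker positions, and an angular velocity from successive frames. The marker coordinates, the function names, and the 100 Hz frame rate are illustrative assumptions, not values from any particular capture system or file format.

```python
import math

def joint_angle_deg(proximal, joint, distal):
    """Angle at `joint` between the two adjoining segments, in degrees."""
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    # Clamp to guard against rounding pushing the cosine outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

def angular_velocity(angles_deg, frame_rate_hz):
    """Forward-difference angular velocity in degrees per second."""
    return [(b - a) * frame_rate_hz for a, b in zip(angles_deg, angles_deg[1:])]

# Hypothetical hip, knee, and ankle marker positions (metres) for one frame.
hip, knee, ankle = (0.0, 0.0, 1.0), (0.0, 0.0, 0.5), (0.0, 0.5, 0.5)
knee_angle = joint_angle_deg(hip, knee, ankle)

# Hypothetical knee angles over three consecutive frames captured at 100 Hz.
knee_velocity = angular_velocity([90.0, 92.0, 95.0], 100)
```

Everything in the sketch is geometry. Nothing in it says which muscle produced the motion or how hard it was working, which is exactly the gap described above.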

What biomechanics captures well

Joint angles, segment positions, velocity, acceleration, temporal sequence. Useful for analyzing movement form, comparing technique between athletes, studying injury mechanics.

Physical therapy and clinical assessment

Physical therapy records movement in a completely different vocabulary. Range of motion is measured in degrees using a goniometer and recorded as discrete measurements — "shoulder flexion 140 degrees active, 155 degrees passive" — rather than as a continuous time-series of positions. Muscle strength is assessed using manual muscle testing scales (typically a 0-5 ordinal scale). Functional movement is often described in qualitative terms: "patient demonstrates pain-avoidance compensation in left hip during single-leg squat."

These assessments exist in clinical notes, often as free text, in electronic medical record systems that were designed for billing and compliance rather than data portability. The information is clinically meaningful but computationally opaque — it cannot easily be processed, compared across patients, or combined with data from other sources.

What clinical assessment captures well

Functional limitations, pain patterns, compensation behaviors, patient-reported symptoms, treatment response over time. Rich in clinical context; poor in quantitative precision and machine readability.

Fitness training and coaching

Strength and conditioning coaches have their own notation systems, many of them informal or coach-specific. A training program might specify "3x8 @ 75% 1RM, 3:1:1 tempo, 2 min rest" — which experienced coaches can interpret but which is not a standardized format any software can reliably parse. Exercise names vary: one coach calls it a "Romanian deadlift," another calls it an "RDL," a third calls it a "stiff-leg deadlift," and they may not all be describing the same movement.
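
A short Python sketch shows why this kind of notation is hard for software. The `Prescription` class, the `parse_prescription` function, and the regular expressions below are illustrative assumptions that match only the one phrasing quoted above. They are not a standard, and a different coach's shorthand would slip straight past them.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Prescription:
    sets: int
    reps: int
    percent_1rm: Optional[float] = None
    tempo: Optional[Tuple[int, int, int]] = None
    rest_seconds: Optional[int] = None

def parse_prescription(text: str) -> Optional[Prescription]:
    """Best-effort parse of one coach's shorthand; returns None if unrecognized."""
    sets_reps = re.search(r"(\d+)\s*x\s*(\d+)", text)
    if sets_reps is None:
        return None
    result = Prescription(int(sets_reps.group(1)), int(sets_reps.group(2)))

    percent = re.search(r"(\d+(?:\.\d+)?)\s*%", text)
    if percent:
        result.percent_1rm = float(percent.group(1))

    tempo = re.search(r"(\d+):(\d+):(\d+)", text)
    if tempo:
        result.tempo = (int(tempo.group(1)), int(tempo.group(2)), int(tempo.group(3)))

    rest = re.search(r"(\d+)\s*min", text)
    if rest:
        result.rest_seconds = int(rest.group(1)) * 60
    return result

parsed = parse_prescription("3x8 @ 75% 1RM, 3:1:1 tempo, 2 min rest")
unparsed = parse_prescription("three sets of eight at 75%")  # None: same meaning, different words
```

Even when the parse succeeds, the code has to guess at conventions: whether "3x8" means sets by reps or the reverse, and which phase of the lift each tempo digit refers to. Those guesses are the part a formal grammar would remove.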

Wearable technology has brought more quantitative data to fitness — heart rate, acceleration, estimated calorie expenditure — but this data captures general exertion rather than movement specifics. A wearable can tell you that you were moving vigorously for forty minutes. It cannot tell you which muscles were loaded, at what joint angles, with what compensation patterns.

What fitness coaching captures well

Loading parameters, progression over time, exercise selection, perceived effort. Weak on anatomical specificity and kinematic precision; strong on training context and longitudinal structure.

Game animation and virtual worlds

Game animation deals with movement as a visual representation problem. The concern is making a character look convincing, not accurately modeling the physiology that would produce a real movement. Formats like BVH and FBX carry skeleton hierarchies and joint rotations, but these are optimized for rendering in a game engine, not for representing what muscles are doing or how load is distributed through a joint.

More recently, games and virtual worlds have started caring about movement fidelity for different reasons — rehabilitation games, sports training simulations, avatar platforms used for social interaction. In these contexts, the visual approximation that works for entertainment is insufficient. You need movement data that is physiologically meaningful, not just visually plausible.

What game animation captures well

Visual fidelity, temporal sequence, blend states, transition behavior. Optimized for rendering, not for physiological accuracy or cross-platform interoperability.

EMG and electrophysiology

Electromyography (EMG) records the electrical activity of muscles, the most direct available measure of which muscles are active and at what intensity. EMG data is often stored in formats specific to the equipment manufacturer, so reading it typically requires the same vendor software that recorded it. It is rarely combined with kinematic data in practice, because doing so requires custom integration work and specialized expertise.

EMG gives you the physiological signal that kinematic formats are missing — actual muscle activation, not inferred from position. But it exists in an even more isolated silo than the other formats, used almost exclusively in research and clinical contexts, with almost no pathway to the fitness, gaming, or rehabilitation technology markets.

The Conversion Tax

Working across these formats is not impossible. Researchers do it. But they pay a significant cost every time they try — what you might call the conversion tax.

Converting between kinematic formats (BVH to FBX, for example) is relatively straightforward when the skeleton hierarchies match. But converting between fundamentally different representations — kinematic data to clinical assessment language, or fitness training notation to anything a game engine can use — is not a technical problem with a clean solution. It requires domain expertise, manual interpretation, and inevitably some information loss.

The conversion tax is paid in time, in money, in expertise, and in the information that doesn't survive the translation. A researcher who wants to combine motion capture data with clinical assessment records is not just doing an import operation. They are doing interpretation work, reconciling two different conceptual frameworks for what movement is and how it should be described.
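
The information loss can be shown in a few lines of Python. The sketch below collapses a joint-angle time series into the kind of single range-of-motion figure a clinical note records. The function name `summarize_rom` and both angle sequences are made up for illustration. They are not measured data.

```python
def summarize_rom(angles_deg):
    """Collapse a joint-angle time series into one clinical-style ROM summary."""
    lowest, highest = min(angles_deg), max(angles_deg)
    return {"min_deg": lowest, "max_deg": highest, "range_deg": highest - lowest}

# Two hypothetical shoulder-flexion recordings with very different movement quality.
smooth_raise = [0, 35, 70, 105, 140, 105, 70, 35, 0]
guarded_raise = [0, 20, 90, 60, 140, 130, 40, 10, 0]

smooth_summary = summarize_rom(smooth_raise)
guarded_summary = summarize_rom(guarded_raise)

# Both recordings reduce to the same summary, so the hesitation in the
# second one cannot be recovered from the clinical-style record.
same_summary = smooth_summary == guarded_summary
```

Going from the time series to the summary is easy and lossy. Going the other way is impossible without the original recording, which is why the translation takes interpretation rather than a file import.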

This tax compounds over the life of a research program or a product. Every time data crosses a domain boundary, the tax is paid again. Data that crosses multiple boundaries (from EMG research to clinical practice to rehabilitation gaming, for instance) pays it multiple times, and the information loss accumulates at each crossing.

Why Existing Standards Haven't Solved This

There have been attempts at cross-domain movement standards. HL7 FHIR, the healthcare interoperability standard, can technically represent exercise and movement data through its Observation resource type. OpenSim, the biomechanics simulation platform, has its own XML-based formats. Several sports science organizations have developed data exchange standards for specific use cases.

None of these has achieved broad adoption across domains, and the reasons are instructive.

Healthcare standards like FHIR were designed by and for healthcare organizations, with healthcare regulatory requirements as the primary constraint. They are expressive enough to record that a patient performed a squat and experienced pain, but they don't have the vocabulary to record that the patient was performing a goblet squat with 20kg at a 3-second eccentric tempo with visible knee valgus compensation. The granularity simply isn't there for training and rehabilitation use cases.

Research formats like OpenSim's are designed for scientific reproducibility and precision, which makes them too complex for practical use in coaching, therapy, or game development. You don't need a full musculoskeletal simulation model to record what a trainer did in a workout session.

Sport-specific standards tend to be narrow by design — appropriate for their specific use case, but not generalizable to the full range of contexts where movement data matters.

What a General Movement Notation Needs

A notation that could actually serve as a common language across these domains would need to satisfy several competing requirements simultaneously.

It needs to be physiologically grounded. A format that only captures kinematics — where body parts are in space — will never satisfy clinical and rehabilitation use cases that care about what muscles are doing and why. The notation needs a vocabulary for muscle activation, nerve involvement, and compensation patterns.

It needs to be human-readable. A binary format or a format that requires specialized software to read will reproduce the same accessibility problems that already exist. If a physical therapist can't write a movement record in a text file and have it be immediately compatible with a training platform, the standard hasn't solved the right problem.

It needs to be machine-parseable with a formal grammar. Informal notation systems — like the varied ways coaches write training programs — are human-readable but computationally unreliable. A standard needs a defined grammar that a parser can implement deterministically.

It needs to be extensible without breaking existing implementations. The first version of any standard will not anticipate every use case. The design needs to allow for extension — new movement types, new measurement parameters, new application contexts — without requiring existing implementations to be rewritten.

It needs to work at multiple levels of detail. A basic training log entry and a full clinical assessment describe movement at very different levels of precision. A useful standard should support both, with a clear way to indicate how much detail is present and to treat everything else as optional.

HMN: A Proposal for Common Ground

Human Movement Notation (HMN) is AIUNITES's attempt to build a notation family that satisfies these requirements. It consists of three related sub-notations: MNN (Muscular Neuro Notation) for body movement, VRN (Voice Resonance Notation) for vocal production, and VNN (Voice Neural Notation) for AI voice synthesis.

MNN specifically addresses the Movement Babel Problem with a vocabulary that combines three things: the physiological detail of EMG and clinical assessment, the practical usability of a training log, and the machine-parseability that software implementations require.

The notation uses a bracket-based structure where different aspects of a movement can be specified independently. A complete record can include muscle activation levels, joint positions, resistance vectors, loading parameters, and compensation flags — or it can include only the elements that are known, with the rest simply omitted. The grammar is designed so that a parser can handle partial records without failing.

# Full clinical-level record
{Squat.V} [Con:[Quad:+++][Glut:++][Para:+]] [Pos:[Hip:Flex90,Kn:Flex90,Ank:DF15]] [Vec:V:Low] [Load:BW] [Comp:[KnValgL:mild]]

# Minimal training log record  
{Squat.V} [Load:100kg,Sets:3,Reps:5]

# Both are valid HMN. The parser handles both.

This flexibility is deliberate. The goal is not to force every context to record the same level of detail. It is to ensure that when a clinical assessment and a training log are both describing a squat, they are using compatible vocabulary — so that a system that can read one can extract meaningful information from the other, even if the other contains less detail.
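
As a rough illustration of partial-record handling, the Python sketch below reads the two example records shown earlier. It is not the HMN reference parser and does not implement the specification's grammar. It assumes only what those two examples show: a `{Movement}` header followed by top-level `[Tag:body]` blocks, with each body kept as an uninterpreted string. The function name `parse_record` is hypothetical.

```python
def parse_record(line: str) -> dict:
    """Split a record into its movement header and top-level [Tag:body] blocks."""
    line = line.strip()
    if not line.startswith("{") or "}" not in line:
        raise ValueError("record must start with a {Movement} header")
    header_end = line.index("}")
    movement = line[1:header_end]

    blocks = {}
    depth = 0
    block_start = 0
    for i in range(header_end + 1, len(line)):
        ch = line[i]
        if ch == "[":
            if depth == 0:
                block_start = i  # opening bracket of a top-level block
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ']'")
            if depth == 0:
                # Tag is everything before the first colon; nested content stays raw.
                tag, _, body = line[block_start + 1:i].partition(":")
                blocks[tag] = body
    if depth != 0:
        raise ValueError("unbalanced '['")
    return {"movement": movement, "blocks": blocks}

full = parse_record(
    "{Squat.V} [Con:[Quad:+++][Glut:++][Para:+]] "
    "[Pos:[Hip:Flex90,Kn:Flex90,Ank:DF15]] [Vec:V:Low] [Load:BW] [Comp:[KnValgL:mild]]"
)
minimal = parse_record("{Squat.V} [Load:100kg,Sets:3,Reps:5]")

# Blocks that both records carry, regardless of how detailed each one is.
shared_tags = set(full["blocks"]) & set(minimal["blocks"])
```

Both records yield the same movement name and a `Load` block, so a consumer that only understands loading data can use either one and ignore the blocks it does not recognize.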

The Path to a Common Language

Solving the Movement Babel Problem doesn't require every field to converge on a single way of working. Motion capture researchers will continue to use C3D for the precision it provides. Game animators will continue to use BVH and FBX. Clinical systems will continue to run on HL7 FHIR.

What a common notation enables is a translation layer — a format that any of these systems can write to and read from, without any of them needing to abandon their primary workflow. Movement data that exists in MNN format can be consumed by a rehabilitation platform, a fitness app, an AI coaching system, and a virtual world avatar platform, all without custom integration between each pair of systems.
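
The saving from a shared layer is simple arithmetic, sketched below in Python. The counts assume every pair of systems needs a converter in each direction, and that each system needs exactly one reader and one writer for the shared format. Both are simplifications, since real integrations vary widely in cost. The function names are illustrative.

```python
def pairwise_converters(n_systems: int) -> int:
    """Directed converters needed if every system talks to every other directly."""
    return n_systems * (n_systems - 1)

def hub_adapters(n_systems: int) -> int:
    """Adapters needed if each system only reads and writes one shared format."""
    return 2 * n_systems

# Compare the two approaches as the number of connected systems grows.
comparison = {n: (pairwise_converters(n), hub_adapters(n)) for n in range(2, 7)}
for n, (pairwise, hub) in comparison.items():
    print(f"{n} systems: {pairwise} pairwise converters vs {hub} hub adapters")
```

Under these assumptions the two approaches break even at three systems, and the pairwise count grows quadratically after that while the shared-format count grows linearly.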

The web didn't replace every document format. It created a common layer that connected them. A movement notation standard has the potential to do the same thing for the fragmented landscape of human movement data — not by forcing everyone to speak one language, but by giving them a shared vocabulary for the things they all need to say about the same subject.

Explore the HMN Specification

The full Human Movement Notation specification — including MNN, VRN, and worked examples across use cases — is freely available and open for implementation.
