VibeSync App
VibeSync is an application that uses machine learning to synchronize dance video and audio based on human pose estimation.
Our goal was not only to eliminate the delay between the video and audio recordings, but also to adjust movement speed to match the music's beat, or even to recommend suitable music for the user's dance video.
The task was to investigate existing Python frameworks, libraries, and research papers, and to deliver an effective tool for creating appealing music clips.
We sourced data from multiple publicly available datasets. Through rigorous testing, we selected the best-performing ML model and refined it into an MVP solution.
Our solution


We drew on several publicly available resources in our research. Google's AIST++, a dataset containing the keypoints of dancers' bodies, proved to be a viable basis for dance embeddings.
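
To make the embedding idea concrete, here is a minimal sketch of how per-dancer keypoints could be reduced to a comparable descriptor. The .npy file name and the (frames, joints, 2) array layout are assumptions for illustration, not the AIST++ storage format or our production code.

```python
# A minimal sketch: turning pose keypoints into a fixed-size dance embedding.
# The file name and (frames, joints, 2) layout are hypothetical.
import numpy as np

def dance_embedding(keypoints):
    """keypoints: array of shape (frames, joints, 2) with 2D body keypoints.

    Returns a compact descriptor built from per-joint velocity statistics,
    enough to compare the overall dynamics of two dances.
    """
    velocity = np.diff(keypoints, axis=0)       # frame-to-frame joint motion
    speed = np.linalg.norm(velocity, axis=-1)   # (frames - 1, joints)
    return np.concatenate([speed.mean(axis=0),  # average joint speed
                           speed.std(axis=0)])  # rhythmic variability

keypoints = np.load("dancer_keypoints.npy")     # hypothetical file
embedding = dance_embedding(keypoints)
```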

Tech Stack
Python
OpenCV
PyTorch
Results
To complete this project, we conducted extensive research into the underlying AI theory, turned an academic whitepaper into a working machine learning model, and designed an MVP application.

We taught the model to:
Isolate the rhythm from the audio.
Extract visual tempograms from the video.
Analyze the visual and audio tempograms to find a match (a rough sketch of the whole pipeline follows this list).
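
Below is a minimal sketch of this pipeline. It uses librosa for the audio side, which is our assumption (only Python, OpenCV, and PyTorch appear in the tech stack), and it stands in for the production model rather than reproducing it: an onset-strength envelope approximates the audio rhythm, dense optical flow approximates the visual tempogram, and cross-correlation finds the delay between the two.

```python
# A minimal sketch of the audio/visual rhythm-matching pipeline.
# Function names and parameters are illustrative, not the production code.
import cv2
import librosa
import numpy as np

def audio_onset_envelope(audio_path, sr=22050, hop=512):
    """Load audio and compute an onset-strength envelope (rhythm proxy)."""
    y, sr = librosa.load(audio_path, sr=sr)
    env = librosa.onset.onset_strength(y=y, sr=sr, hop_length=hop)
    return env, sr / hop                     # envelope and its rate (frames/s)

def visual_novelty(video_path):
    """Per-frame motion magnitude from dense optical flow (visual rhythm proxy)."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    ok, prev = cap.read()
    prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    novelty = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        novelty.append(np.linalg.norm(flow, axis=2).mean())
        prev = gray
    cap.release()
    return np.asarray(novelty), fps

def best_offset(audio_env, audio_rate, video_env, video_fps):
    """Resample the visual envelope to the audio envelope's rate, then
    cross-correlate to find the lag that best aligns the two rhythms."""
    t_video = np.arange(len(video_env)) / video_fps
    t_audio = np.arange(int(t_video[-1] * audio_rate)) / audio_rate
    video_resampled = np.interp(t_audio, t_video, video_env)
    n = min(len(audio_env), len(video_resampled))
    a = audio_env[:n] - audio_env[:n].mean()
    v = video_resampled[:n] - video_resampled[:n].mean()
    corr = np.correlate(a, v, mode="full")
    lag = corr.argmax() - (n - 1)            # lag in envelope frames
    return lag / audio_rate                  # offset in seconds

# Example usage (hypothetical file names):
# env, rate = audio_onset_envelope("track.wav")
# nov, fps = visual_novelty("dance.mp4")
# print(best_offset(env, rate, nov, fps), "seconds")
```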
We also incorporated a feature that enables video speed adjustment, sketched below.
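
A minimal sketch of how such a speed adjustment could work with OpenCV follows. The constant speed factor is a simplification, since matching individual beats generally requires warping speed locally, and the function name is hypothetical.

```python
# A minimal sketch of video speed adjustment with a constant factor.
import cv2
import numpy as np

def change_speed(src_path, dst_path, factor):
    """Write a retimed copy of the video.

    factor > 1 drops frames (speed-up); factor < 1 duplicates frames
    (slow-down). The output frame rate stays the same, so only the
    dance tempo changes.
    """
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    if not frames:
        return
    h, w = frames[0].shape[:2]
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"),
                          fps, (w, h))
    # Resample frame indices at the requested rate.
    for i in np.arange(0, len(frames), factor):
        out.write(frames[int(i)])
    out.release()
```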

Get in Touch
Contact our team for modern AI development.