Timeline:
Question.
Answer 1.
The majority of Genny users create content with voice.
Manually adding subtitles takes 5 to 10 times the length of the video, while Speech-to-Text technology reduces that work time by 80%.
This enhances the work efficiency of Genny users who create voice-based content.
Answer 2.
There was continuous demand from users for a feature that would allow exporting scripts or subtitle files generated by AI Voice. Additionally, since creating subtitles is often the last step of video editing, I determined that by offering the AI Auto Subtitles feature, Genny could serve as the endpoint in the user's workflow. I also expected that this would contribute to higher retention.
The simpler and more common a feature is, the more important the technology and usability become. I revisited the purpose behind the feature and the user's pain points when planning the technology and underlying logic.
Technology
① Finding the Perfect Speech-to-Text Provider
Together with the engineers, I compared five Speech-to-Text providers and identified the "Perfect Provider," an API that aligns with the target users and the product’s goal.
The selection was based on several criteria:
Speech-to-Text accuracy
Ability to handle multiple languages with consistent accuracy
Processing speed acceptable to users
Cost, rate limits, and other technical factors
Logic
② Designing the Perfect Timing for Subtitles
Accuracy alone is not enough to reduce work time by 80%. After selecting the provider, I focused on precise timing and appropriate subtitle lengths.
Beyond accurate speech recognition, it was crucial to split the subtitles into lengths that viewers can read comfortably.
Split Logic:
Maintain Context: To preserve the flow of the sentence, subtitles are split based on punctuation marks.
Readable Length: If a sentence is too long, it is divided into segments of around 60 characters, which is the optimal length for readability on a single screen.
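The two split rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the production logic used in Genny: the function name `split_subtitles`, the punctuation set, and the word-boundary fallback are all hypothetical, and timestamps are left out for brevity.

```python
import re

MAX_CHARS = 60  # approximate readable length mentioned in the text


def split_subtitles(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split a transcript into subtitle segments (hypothetical sketch)."""
    # Rule 1 (Maintain Context): split after sentence-ending punctuation.
    # The punctuation set here is an assumption.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    segments: list[str] = []
    for sentence in sentences:
        if len(sentence) <= max_chars:
            segments.append(sentence)
            continue
        # Rule 2 (Readable Length): break long sentences at word boundaries
        # so each piece stays within max_chars. A single word longer than
        # max_chars is kept whole and will exceed the limit.
        current = ""
        for word in sentence.split():
            candidate = f"{current} {word}".strip()
            if len(candidate) > max_chars and current:
                segments.append(current)
                current = word
            else:
                current = candidate
        if current:
            segments.append(current)
    return segments


if __name__ == "__main__":
    sample = (
        "Hello there. This is a very long sentence that keeps going "
        "well past the sixty character limit for one line. Done!"
    )
    for line in split_subtitles(sample):
        print(len(line), line)
```

Splitting on punctuation first keeps each subtitle aligned with a natural pause, and the length rule only applies to sentences that would otherwise overflow the screen.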
Feature #1
Subtitles Editor
Designed a Subtitles Editor that lets users check subtitle content, timestamps, and styles at a glance,
and simplified the subtitle regeneration process into a single click.
Feature #2
Customize Subtitles Style
Introduced per-line subtitle styling to offer greater creative flexibility
and to differentiate the product from competitors.
Feature #3
Auto Subtitles Flow
Created an Auto Subtitles flow that simplifies the process for users who only need subtitle generation,
allowing them to upload media and automatically generate subtitles in one step.
This was one of the projects I was most passionate about during my time at LOVO. It took place while our team was adopting a new management methodology, and I still remember staying up late working on design and QA with my teammates to meet tight deadlines. That shared effort made the project even more meaningful to me.
It marked a pivotal moment for the company, redefining Genny's direction from an AI voiceover tool to an AI video editor. I gained valuable experience by planning and designing core features. Despite much trial and error, it became a major turning point in my career.
Building AI functionality came with many unexpected challenges throughout the planning, design, and QA stages. Sometimes we discovered API limitations while reviewing the design and had to revise the scope. In other cases, tight deadlines forced us to make fast decisions and compromises. These challenges pushed me to my limits and ultimately helped me grow. I also learned how to maximize the impact of AI technology and work around its limitations through thoughtful UX and UI design.
Through this project, I significantly deepened my understanding of AI feature planning and learned to approach challenges with a forward-thinking mindset. Beyond design growth, it also taught me the value of communication and collaboration. That’s why this project remains especially meaningful to me.