Justin Ohms
2 min read · Jun 28, 2024

Yes, but garbage in, garbage out. Even a human can’t summarize gibberish. Most of the issues trace back to the initial transcription.

It depends a great deal on which speech-to-text system you are using to produce the initial transcript, and those systems can only do so much. (They don’t read minds, and they don’t read lips.) Accuracy relies on a quality audio system and good room acoustics as much as it does on clear articulation, especially for participants who have heavy accents. But even for people who don’t, the system can only do so much if they mumble, speak softly, or talk too quickly. Basically, all the same things that would make it difficult for a human to understand someone also make it difficult for a speech-to-text system. And if you don’t have a decent text transcript to start with, even a human is going to struggle to summarize notes from it.

All that said, the biggest problem by far is people interjecting, talking over each other, or having side conversations. This really messes up transcription, and it all comes down to decorum, company culture, and meeting control.

The other thing to keep in mind is that if you are in a niche industry with lots of specialized terminology, you will probably need to load a custom dictionary into the system for it to return good results.
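How you load a custom dictionary depends entirely on the speech-to-text product, so there is no single API to show here. As a rough stand-in, the sketch below does something different but related: it cleans up a finished transcript by snapping near-miss words onto a glossary of known terms. The glossary contents, the `correct_terms` name, and the 0.8 similarity cutoff are all assumptions made for illustration, not part of any particular system.

```python
import difflib
import re

# Hypothetical glossary of domain terms; replace with your own vocabulary.
GLOSSARY = ["Kubernetes", "PostgreSQL", "idempotent"]


def correct_terms(transcript, glossary=GLOSSARY, cutoff=0.8):
    """Replace words that closely resemble a glossary term with that term.

    This is post-processing on the transcript text, not a custom dictionary
    loaded into the speech-to-text engine itself.
    """
    # Map lowercase spellings back to the preferred capitalization.
    lowered = {term.lower(): term for term in glossary}
    candidates = list(lowered)

    def fix(match):
        word = match.group(0)
        # Pick the single closest glossary term, if any clears the cutoff.
        close = difflib.get_close_matches(
            word.lower(), candidates, n=1, cutoff=cutoff
        )
        return lowered[close[0]] if close else word

    # Only alphabetic runs are checked; punctuation and spacing are kept.
    return re.sub(r"[A-Za-z]+", fix, transcript)
```

For example, `correct_terms("We deployed on kubernetis with postgresql")` returns `"We deployed on Kubernetes with PostgreSQL"`. A fix like this only helps with words that came out almost right; if the audio was bad enough that the term was transcribed as something unrelated, it will not be recovered.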

One way to isolate what the issues are is to have someone who wasn’t there listen to the audio-only recording. They will be able to tell you whether it’s an equipment issue, a problem with understanding specific individuals, or a meeting flow problem.
