Speech-to-Text Business Intelligence Module 02

Voice Summary

Record once. Search the spoken word forever.

Voice Summary turns a phone call into a usable record. The recording goes in; a structured summary, an action list, and a searchable transcript come out — attached to the right customer and project automatically.

The two-hour problem

Most transcription services choke past twenty minutes. Real sales meetings, customer calls, and internal reviews run long. Voice Summary uses an ffmpeg-based chunker to cut the recording into slices, sends those slices through the Gemini pipeline in parallel, then re-merges the results into a coherent narrative.
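As a rough illustration of the slice-then-merge idea, here is a minimal Python sketch. It is not the product's implementation: the 600-second chunk length, the worker count, and the transcribe_chunk stand-in (which returns a placeholder string instead of calling a model) are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_chunks(total_seconds, chunk_seconds=600):
    """Return (start, end) windows, in seconds, covering the whole recording."""
    return [
        (start, min(start + chunk_seconds, total_seconds))
        for start in range(0, total_seconds, chunk_seconds)
    ]


def transcribe_chunk(window):
    # Hypothetical stand-in for the real per-slice model call.
    start, end = window
    return f"[{start}-{end}s transcript]"


def transcribe_recording(total_seconds, chunk_seconds=600, workers=4):
    windows = plan_chunks(total_seconds, chunk_seconds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the merged text
        # stays chronological even when slices finish out of order.
        parts = list(pool.map(transcribe_chunk, windows))
    return "\n".join(parts)
```

A two-hour recording (7200 seconds) yields twelve ten-minute windows under these assumed settings. The last window is simply shorter when the length is not an exact multiple of the chunk size.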

The shortest path from a meeting to a follow-up email is the one without re-listening.

What you get back

A clean, formatted transcript.
A summary keyed to the customer and project.
An action list with named owners.
An ad-hoc query box: ask the recording anything, get an answer in seconds.

What you get

Long-form transcription

An internal chunker and the Gemini pipeline handle 2-hour recordings without truncation. ffmpeg pre-processing strips silence and splits the audio into chunks that are processed in parallel batches.
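One way such a pre-processing step could be expressed is a single ffmpeg invocation that combines the silenceremove audio filter with the segment muxer. The sketch below only builds the argument list. The file names, the -50 dB threshold, the one-second minimum silence, and the ten-minute segment length are illustrative assumptions, not the module's actual settings.

```python
def build_ffmpeg_args(src, out_pattern, chunk_seconds=600,
                      silence_db=-50, min_silence=1.0):
    """Build an ffmpeg command that strips silence and splits the audio."""
    # stop_periods=-1 asks silenceremove to drop silent stretches throughout
    # the audio, not only at the end. Threshold and duration are assumed values.
    silence_filter = (
        "silenceremove=stop_periods=-1"
        f":stop_duration={min_silence}"
        f":stop_threshold={silence_db}dB"
    )
    return [
        "ffmpeg", "-i", src,
        "-af", silence_filter,
        "-f", "segment",                      # segment muxer: one file per chunk
        "-segment_time", str(chunk_seconds),  # target chunk length in seconds
        out_pattern,                          # e.g. chunk_%03d.wav
    ]
```

The list can be passed to subprocess.run on a machine with ffmpeg installed, for example build_ffmpeg_args("call.m4a", "chunk_%03d.wav").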

Structured business context

Extracted fields render as readable rows — customer name, project, action items, blockers — not raw JSON.
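To make "readable rows, not raw JSON" concrete, here is a small hedged sketch. The field keys (customer_name, project, action_items, blockers) and the display labels are hypothetical; the real extraction schema is not described on this page.

```python
import json

# Hypothetical mapping from extracted field keys to display labels.
LABELS = {
    "customer_name": "Customer",
    "project": "Project",
    "action_items": "Action items",
    "blockers": "Blockers",
}


def render_rows(raw_json):
    """Turn an extracted-fields JSON string into 'Label: value' rows."""
    data = json.loads(raw_json)
    rows = []
    for key, label in LABELS.items():
        value = data.get(key)
        if not value:
            continue  # skip missing or empty fields instead of showing blanks
        if isinstance(value, list):
            value = "; ".join(str(item) for item in value)
        rows.append(f"{label}: {value}")
    return rows
```

Empty fields are omitted here so a call with no blockers does not show an empty "Blockers" row. That is a design choice of the sketch, not a documented behavior.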

Ask anything, after the fact

The right-side AI panel takes ad-hoc questions about the recording, such as 'What did we promise on delivery?', and answers them in seconds.

Resilient to ops failures

Stuck-row sweep + manual retry button. A killed worker no longer leaves a recording spinning forever.
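A stuck-row sweep can be sketched as a periodic pass that re-queues any recording whose worker has gone quiet. The row shape (id, status, heartbeat, attempts), the status names, and the 30-minute timeout below are assumptions for illustration, not the platform's actual schema.

```python
from datetime import datetime, timedelta


def sweep_stuck_rows(rows, now, timeout=timedelta(minutes=30)):
    """Re-queue rows stuck in 'processing' past the timeout; return their ids."""
    requeued = []
    for row in rows:
        stale = now - row["heartbeat"] > timeout
        if row["status"] == "processing" and stale:
            # The worker is presumed dead, so hand the row back to the queue.
            row["status"] = "queued"
            row["attempts"] = row.get("attempts", 0) + 1
            requeued.append(row["id"])
    return requeued
```

A manual retry button could reuse the same transition for a single row, setting its status back to "queued" on demand.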

Compose with 3 modules

Customer Relationship Management

CRM

Customers as a first-class object, not a row in a spreadsheet.

See the module

Conversational Analytics

AI Chat Summary

LINE conversations become a clean ledger of what was actually agreed.

See the module

Project Management

Projects

The work, the people, and the conversations — on one timeline.

See the module

Frequently asked

How long can a single recording be?

Tested on 2-hour internal meetings. Audio is split into chunks, processed in parallel, and re-merged transparently.

What languages?

Thai and English are the production targets. Other languages work but are not officially supported yet.

Where is the audio stored?

Inside your tenant's Drive (Google Workspace). The platform never holds raw audio outside the tenant boundary.


Start with a 45-day free trial.