The 200-Hour Year: How Voice-First Workflows Quietly Change What You Earn

Personal Efficiency · May 1, 2026 · 12 min read

Brian Galvan, Founder, TypeSay

There is a quiet number sitting in the middle of every knowledge worker's calendar that almost no one calculates.

The average professional types at roughly 40 words per minute. Average human speech runs at 130 to 150 words per minute, and modern speech-to-text captures it accurately. A peer-reviewed multi-country study published in 2025 measured the difference directly: median typing speed of 21.4 wpm versus median dictation speed of 93 wpm among the same group of clinicians. A 4.3x raw speedup. A 2.5x speedup even after accounting for editing errors.

A Stanford study comparing the two on smartphones found speech recognition was nearly three times faster than typing and produced fewer errors.

Now apply that to your actual workweek.

Where Your Hours Actually Go

The data on knowledge work is unambiguous, and it has been getting more precise every year:

28% of the average knowledge worker's week is spent on email, roughly 11.2 hours, according to McKinsey research cited in cloudHQ's 2025 analysis. Over a year, that is more than 580 hours. Over a 45-year career, it is more than 3,000 eight-hour working days. Spent. On email.
The average employee writes 112 emails per week, spending around five and a half minutes on each. That is a little over 10 hours of pure writing, not counting reading or sorting.
Office workers spend up to 2.5 hours every day on email-related tasks.
A separate APQC study found knowledge workers get only 30 productive hours out of a 40-hour week. The rest disappears into communication overhead, context-switching, and rework.
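
For readers who want to check these figures, here is a minimal sketch of the arithmetic behind them. The 40-hour week and 52-week year are assumptions of this sketch, not numbers from the studies, and the function names are illustrative only. Note that the per-email average works out to a little over 10 hours a week, close to the 11.2-hour email total.

```python
# Figures quoted in the list above. The 40-hour week and 52-week year
# are assumptions made for this sketch, not numbers from the studies.
WEEK_HOURS = 40
WEEKS_PER_YEAR = 52


def email_hours_per_week(share_of_week: float, week_hours: float = WEEK_HOURS) -> float:
    """Hours per week implied by a share of the working week."""
    return share_of_week * week_hours


def writing_hours_per_week(emails_per_week: int, minutes_per_email: float) -> float:
    """Hours per week spent composing, from a per-email average."""
    return emails_per_week * minutes_per_email / 60


if __name__ == "__main__":
    weekly = email_hours_per_week(0.28)
    print(f"Email time: {weekly:.1f} h/week, {weekly * WEEKS_PER_YEAR:.0f} h/year")
    print(f"Writing time: {writing_hours_per_week(112, 5.5):.1f} h/week")
```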

These are not dramatic statistics. They are the boring, unsexy reality of how white-collar work actually consumes time.

Now do the arithmetic on what voice-first workflows would change.

The Math, Conservatively

Take a professional who spends 11 hours a week composing written communication: emails, Slack messages, briefs, notes, drafts. That is the McKinsey/Slack baseline.

At 40 wpm typed, 11 hours produces about 26,400 words.

At an effective dictation speed of 100 wpm (well below the 150 wpm theoretical ceiling, which leaves generous room for editing), the same 26,400 words take 4.4 hours.

That is 6.6 hours back, every single week.

Annualized over about 48 working weeks: roughly 315 hours. Nearly eight full 40-hour workweeks returned to you each year.
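
The calculation above is easy to rerun with your own numbers. The sketch below is one way to do it, under stated assumptions: the 48 working weeks per year is an assumption chosen because it roughly reproduces the annual figure, and the function names are made up for illustration.

```python
def words_typed(hours: float, wpm: float) -> float:
    """Words produced in the given hours at the given words-per-minute rate."""
    return hours * 60 * wpm


def hours_to_produce(words: float, wpm: float) -> float:
    """Hours needed to produce a word count at the given rate."""
    return words / wpm / 60


def weekly_hours_saved(writing_hours: float, typing_wpm: float, dictation_wpm: float) -> float:
    """Hours saved per week if the same words are dictated instead of typed."""
    words = words_typed(writing_hours, typing_wpm)
    return writing_hours - hours_to_produce(words, dictation_wpm)


if __name__ == "__main__":
    working_weeks = 48  # assumption: 52 weeks minus holidays and leave
    saved = weekly_hours_saved(writing_hours=11, typing_wpm=40, dictation_wpm=100)
    print(f"Words per week: {words_typed(11, 40):,.0f}")
    print(f"Saved per week: {saved:.1f} h")
    print(f"Saved per year: {saved * working_weeks:.0f} h")
```

Swapping in a more pessimistic effective dictation speed is a one-argument change. At 60 wpm, for example, the same inputs still leave about 3.7 hours a week.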

A peer-reviewed review of medical dictation studies found a more modest 5.76% productivity gain when physicians switched from typing to speech-to-text, and even at that conservative number, the same review calculated about seven hours per week in time savings for the average U.S. physician.

Whether you trust the high estimate or the low one, the directional answer is the same: speaking is faster than typing, and the gap compounds across thousands of hours of professional work.

What 200+ Extra Hours a Year Is Actually Worth

Here is where productivity articles usually get vague. Let us not.

If you bill at $100 an hour, modest for most professionals reading this, 200 reclaimed hours is $20,000 a year in either direct billable revenue or backfilled high-value work.

At $200 an hour (consultants, attorneys, specialized contractors), it is $40,000 a year.
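
The same multiplication works for any rate. A small sketch, with an illustrative function name and the two hourly rates quoted above as inputs:

```python
def annual_value(hours_reclaimed: float, hourly_rate: float) -> float:
    """Dollar value of reclaimed hours at a given hourly rate."""
    return hours_reclaimed * hourly_rate


if __name__ == "__main__":
    # 200 hours is the conservative figure used in this section.
    for rate in (100, 200):
        print(f"${rate}/h x 200 h = ${annual_value(200, rate):,.0f} per year")
```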

For a salaried professional, the math is different but no less real. The reclaimed time goes into:

One more deep-work block per day, where the meaningful output actually lives
The proposals, pitches, and outreach that compound into promotions and raises
The book, course, side project, or business that you "never have time to start"
The deliberate practice that separates competent professionals from sought-after ones

Research on high performers keeps pointing to the same pattern: they do not have more hours in the day. They have fewer hours wasted on input mechanics (typing, filing, formatting) and more hours pointed at the work that actually moves a career.

Voice-first workflows are one of the cleanest, lowest-effort ways to shift that ratio. There is no new methodology to learn. No team to coordinate. No new app to live inside. You just stop typing the things you used to type.

Where This Works in Real Workflows

The skeptic's objection is reasonable: "I already type fast. I think while I write. Dictation feels weird." All true at first. Here is where voice actually wins, even for fast typists:

Email and Slack replies. The vast majority of professional messages are conversational in structure. You already speak them in your head before typing them. Saying them out loud is faster and produces a more natural tone. A two-minute voice reply replaces a seven-minute typed one.

First drafts of anything. Writing experts have argued for decades that the worst thing you can do to a first draft is type it slowly, because you edit while you write and lose momentum. Dictating a draft at 120 wpm and then editing it at typing speed reliably produces better output in less time than typing a "polished" draft from the start.

Meeting notes and follow-ups. The five minutes after a meeting, while context is fresh, is when good notes get written or never get written at all. Speaking them takes 90 seconds. Typing them takes ten minutes you do not have.

Documentation, briefs, memos. Anything where the structure is in your head and the bottleneck is just getting it onto the screen.

Code comments, commit messages, ticket descriptions. Developers benchmark this constantly. The people who try voice-first comments rarely go back.

Long-form content. Blog posts, articles, newsletters, LinkedIn posts. The published authors who use dictation routinely report two to three times the daily word output, and the prose reads more like them, not less.

The Hidden Second Benefit

There is a subtler payoff that the productivity stats miss.

Typing is a cognitive bottleneck. Your fingers move at 40 wpm. Your thoughts move at several hundred. The gap between those two speeds is where ideas evaporate: the sentence you wanted to write, replaced by a weaker one by the time your fingers got there. The example you almost remembered. The argument that fell apart while you were still keying in the introduction.

Speaking closes that gap. You think and the text appears. The thinking gets captured before it fades.

Professionals who switch report the same thing in different words: "I get more of my actual ideas onto the page." Not just faster output. Better output, because less of it is being lost to the input mechanism.

That is the productivity gain that does not show up in a wpm benchmark. It might be the most valuable one.

Why Local Matters Here, Too

Most voice-to-text tools were built for a different era, when "send your audio to our cloud" was an acceptable cost of using the tool. It is increasingly not. The data captured by your dictation workflow is the most sensitive data you produce: client conversations, draft strategy, half-formed ideas, candid first reactions.

The professional who actually uses voice all day generates a constant stream of unfiltered, high-context audio. If that stream is leaving your machine, it is being stored somewhere, by someone, for purposes that may include training models you do not want trained on your work. Recent class actions targeting AI transcription vendors have made this risk specific rather than theoretical.

TypeSay solves the productivity problem without creating the privacy one.

The Whisper model runs locally on your machine. Audio is captured and transcribed in memory, then discarded. Nothing is saved. Nothing is sent. Hold a hotkey, speak, release. Your words appear at your cursor in any application: email, Slack, VS Code, Word, your browser, your terminal. Anywhere a text field exists.

It delivers the same speed advantage cloud dictation tools offer, with none of the privacy cost.

The Cost Question

TypeSay is $199. One time. Forever.

If voice-first workflows return even a single hour per week, the tool pays for itself within about two weeks at a $100 hourly rate. The research suggests they return five to seven hours, which puts payback inside the first week. After that, it continues paying you back every week for as long as you own a computer.

There is no monthly fee. There is no per-seat pricing. There is no surprise renewal. Pay once and the math compounds in your favor for the rest of your career.

A professional who recovers 200 hours a year, for ten years, against a one-time $199 cost, is not making a software purchase. They are making a 2,000-hour trade for $199.

That trade only gets better the longer you make it.
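
How quickly the purchase pays for itself depends on the hours saved and the hourly rate you assume. A rough sketch of that calculation follows; the $199 price is from this article, while the function names and the sample inputs are illustrative assumptions.

```python
PRICE = 199.0  # one-time price quoted in this article


def payback_weeks(hours_saved_per_week: float, hourly_rate: float, price: float = PRICE) -> float:
    """Weeks until the value of saved time equals the purchase price."""
    return price / (hours_saved_per_week * hourly_rate)


def cost_per_reclaimed_hour(hours_per_year: float, years: float, price: float = PRICE) -> float:
    """Purchase price spread across every hour reclaimed over the period."""
    return price / (hours_per_year * years)


if __name__ == "__main__":
    # Sample inputs are assumptions: $100/h, and 1 or 5 hours saved per week.
    for hours in (1, 5):
        print(f"{hours} h/week at $100/h: payback in {payback_weeks(hours, 100):.2f} weeks")
    print(f"Cost per reclaimed hour over 10 years: ${cost_per_reclaimed_hour(200, 10):.4f}")
```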

TypeSay is private, local-first speech-to-text for Windows, macOS, and Linux. $199, one time, forever.
