Brute Force Speech Recognition

Brough Turner suggests modeling speech recognition after Google’s machine translation service, which translates from one language to another by finding the closest match in a huge corpus of translated documents. Google’s translation corpus consists of billions of pages of UN documents translated into multiple languages. If we can find a similar Rosetta stone for speech, consisting of a huge amount of audio transcribed into text, the same techniques could be applied.

Brough doesn’t name any libraries, but he hints that he has some ideas. I have some, too: legal depositions, say, or closed captioning on TV.

How to Make a Million Dollars

Some inspiration for prospective entrepreneurs from Marshall Brain: How to Make a Million Dollars

You can buy lottery tickets or head for Vegas and hope for the best.

Or there is the well-worn path to a lawsuit. The problem is, a lawsuit can take a long time and you have to spend most of that time talking to lawyers. I’m not sure the rewards outweigh the pain.

It seems like there should be an easier way, and in fact there is…

Text to Speech and Back Again

Michael Hiemstra, MAKE: Blog, and others are writing about Speakwire, a website that converts the text of RSS feeds to audible speech.

It’s a neat idea that got me thinking. What about going the other way, from speech to text?

Podcasts are all-audio, which means traditional search engines, like Google, can’t index them very well. Search engines must rely on text descriptions that accompany podcasts in order to guess at their contents. Are there any webapps that automatically convert podcasts to text, or search the text of podcasts?

As it happens, there is one: Podscope. From the FAQ:

Podscope is the first search engine that actually allows you to search for spoken words within any audio or video file. We’re starting with podcasts and will be adding all types of multimedia in coming months.

Dang! There goes that million dollar idea.

I wonder if they’re planning on adding a PubSub-style utility where you could subscribe to a search and be notified when new hits appear. That might be worth a million bucks. Of course, now that I’ve mentioned it, Bob Wyman will probably whip up a prototype in a couple of days.

Solution to SICP Exercise 1.24

Structure and Interpretation of Computer Programs

Solution to Exercise 1.24:

We need to modify start-prime-test to use fast-prime?. Here are my modifications:

(define (start-prime-test i n start-time)
  (if (fast-prime? n 10)
      (if (= i 0)
          (report-prime
           (time-difference
            (current-time time-process)
            start-time))
          (start-prime-test (- i 1) n start-time))))
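
For reference, fast-prime? and the procedures it relies on are essentially the ones given in section 1.2.6 of the text, reproduced here as a sketch (square and the SRFI 19 timing procedures are assumed to be defined as in the earlier exercises, and #t/#f stand in for the text’s true/false):

;; Fermat-based primality test from SICP section 1.2.6.
(define (expmod base exp m)
  (cond ((= exp 0) 1)
        ((even? exp)
         (remainder (square (expmod base (/ exp 2) m)) m))
        (else
         (remainder (* base (expmod base (- exp 1) m)) m))))

;; One Fermat trial with a random base a, 1 <= a < n.
(define (fermat-test n)
  (define (try-it a)
    (= (expmod a n a) a))
  (try-it (+ 1 (random (- n 1)))))

;; Run the Fermat test `times` times; answers #t only if every trial passes.
(define (fast-prime? n times)
  (cond ((= times 0) #t)
        ((fermat-test n) (fast-prime? n (- times 1)))
        (else #f)))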

The inevitable problem when switching to fast-prime? is deciding what value to use for times. I’ve somewhat arbitrarily decided to use a constant value of 10 for now, which gives the following output:

Welcome to DrScheme, version 209.
Language: Textual (MzScheme, includes R5RS).

1009 *** 0s 2190000ns
1013 *** 0s 2340000ns
1019 *** 0s 7650000ns
(#<void> #<void> #<void>)

10007 *** 0s 2810000ns
10009 *** 0s 2820000ns
10037 *** 0s 2810000ns
(#<void> #<void> #<void>)

100003 *** 0s 5940000ns
100019 *** 0s 5470000ns
100043 *** 0s 5780000ns
(#<void> #<void> #<void>)

1000003 *** 0s 6410000ns
1000033 *** 0s 3600000ns
1000037 *** 0s 3750000ns
(#<void> #<void> #<void>)

Since the number of steps fast-prime? takes grows as log n, and log 1000 = 3 while log 1000000 = 6, we should expect the processing times for the primes near 1000000 to be twice those of the primes near 1000.

The average time for the primes near 1000000 is 4586667ns. For those near 1000, it is 4060000ns.
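
To sanity-check those numbers: the averages are just the mean of the three reported times in each group, and their ratio is nowhere near two. A quick sketch of the arithmetic (the average helper is mine, not part of the exercise):

;; Mean of the reported times, in nanoseconds.
(define (average . xs)
  (/ (apply + xs) (length xs)))

(average 2190000 2340000 7650000)                   ; primes near 1000    => 4060000
(exact->inexact (average 6410000 3600000 3750000))  ; primes near 1000000 => ~4586666.7
(exact->inexact
 (/ (average 6410000 3600000 3750000)
    (average 2190000 2340000 7650000)))             ; ratio => ~1.13, nowhere near 2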

The two averages are nearly equal, definitely not the factor of two apart that we expected. What’s going on?

Luckily, DrScheme has a built-in profiler. Perhaps it can shed some light on the discrepancy.

Here is the breakdown of time spent in each procedure for the values near 1000 (in milliseconds).

Function           1009  1013  1019  Average
start-prime-test    672   766   797      745
fast-prime?         641   751   749   713.67
try-it              563   719   608      630
expmod              516   687   593   598.67
square               78   187   109   124.67
fermat-test           0     0    32    10.67
report-prime          0     0     0        0
timed-prime-test      0     0     0        0

And here are the numbers for the values near 1000000.

Function           1000003  1000033  1000037  Average
start-prime-test      1546     1578     1719  1614.33
fast-prime?           1530     1562     1704  1598.67
try-it                1499     1436     1580     1505
expmod                1483     1420     1533  1478.67
square                 284      392      235   303.67
fermat-test             15       32       15    20.67
report-prime             0        0        0        0
timed-prime-test         0        0        0        0
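
The Ratio column in the comparison that follows is the average for the values near 1000000 divided by the average for the values near 1000, and the deviation is its percentage difference from the expected factor of 2. A quick sketch of that calculation (the helper names are mine):

;; Ratio of the profiled averages (values near 1000000 over values near
;; 1000) and its percentage deviation from the expected factor of 2,
;; shown here for start-prime-test.
(define (ratio avg-near-1000 avg-near-1000000)
  (/ avg-near-1000000 avg-near-1000))

(define (deviation-from-2 r)
  (* 100 (- (/ r 2) 1)))

(ratio 745 1614.33)                     ; => ~2.17
(deviation-from-2 (ratio 745 1614.33))  ; => ~8%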

Comparing the results:

Function          Ratio  Deviation from expectations
start-prime-test   2.17    8%
fast-prime?        2.24   12%
try-it             2.39   19%
expmod             2.47   23%
square             2.44   22%
fermat-test        1.94   -3%

As you can see, the measurements acquired by the profiler match our expectations quite closely.

Lesson of the day: DrScheme’s profiler gives much more accurate measurements than its implementation of SRFI 19.

Microsoft’s Remarkable New RSS Reader

Larry remarks on Microsoft’s new beta RSS reader:

Their new beta RSS reader, which Dare Obasanjo points to, asks inane questions before letting you try it. For example, “what month did Florida become a state?” They of course provide a link to MSN Search for you. Funny thing though, my first instinct was to go to Google.

Guys, if you want to increase MSN Search usage, don’t try to trick me into it.

Something tells me that this isn’t exactly what Seth Godin means by “remarkable”.