Inventor Ray Kurzweil talks with Electronic Design Editor-In-Chief Mark David about the Kurzweil/National Federation of the Blind portable text-to-speech reader. In discussing the engineering challenges of integrating a digital camera and a PDA into a new device, Kurzweil also considers timing inventions by predicting the future, the future of portable object recognition, and his "law of accelerating returns."
Mark David: Ray, thank you very much for taking the time to talk to me.
I've been interested in the contributions you've made over the course of my career, in that I spent a lot of years at a magazine called Automatic I.D. News, which covered alternatives to keyboard data entry, optical character recognition (OCR), bar code, and so forth. So, I was aware of a lot of your work in the OCR field. It's a pleasure and an honor to talk to you.
And then, I'm also a musician. So I'm certainly aware of your contributions there on the keyboard side, too. Was it the first sampling keyboard? Is that what your initial invention was there?
Ray Kurzweil: Well, it's the first electronic keyboard that could accurately recreate the grand piano and other orchestral instruments.
Mark David: So, it was really the level of sampling and...
Ray Kurzweil: Yeah; it was more than sampling, although it did incorporate sampling. It really modeled the response of a piano. Because if you just sample a piano, it doesn't convincingly recreate it. For example, samplers will loop the last waveform because they don't really have enough memory to have the note sustain for 30, 40 seconds. When you loop the last waveform of a piano, all the overtones become perfect multiples of the fundamental. And this begins to sound like an organ.
Mark David: Right.
Ray Kurzweil: One of the things that make a piano sound unique is that the partials are actually slightly off the perfect multiple. They are called inharmonic. And there are a lot of other details like that [which] samples fail to capture. If you hit middle C harder, it's not just louder. There are high-frequency partials [which] attack more quickly and die off in a different pattern.
So we captured all of those subtle differences, really modeled the acoustic response of a piano, and it really sounded like a piano and felt like a piano. And we did A/B tests with concert pianists and they were successful. Other samplers just didn't come close.
So, it was the first really successful recreation of complex acoustic instruments like the piano in an electronic instrument.
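[Editor's illustration: the difference between looped-sample overtones and real piano partials described above can be sketched numerically. This is not Kurzweil's model. It assumes the standard textbook stiff-string approximation, f_n = n * f0 * sqrt(1 + B * n^2), and the inharmonicity coefficient B used here is a hypothetical value chosen only for illustration.]

```python
import math

def harmonic_partials(f0, count):
    """Partials of a looped single waveform: exact integer multiples of f0."""
    return [n * f0 for n in range(1, count + 1)]

def stiff_string_partials(f0, count, b):
    """Partials under the stiff-string approximation (assumed model, not from the interview)."""
    return [n * f0 * math.sqrt(1.0 + b * n * n) for n in range(1, count + 1)]

def cents(freq, reference):
    """Pitch offset of freq from reference, in cents (1/100 of a semitone)."""
    return 1200.0 * math.log2(freq / reference)

F0 = 261.63   # middle C, in Hz
B = 0.0004    # hypothetical inharmonicity coefficient, for illustration only

looped = harmonic_partials(F0, 8)
piano = stiff_string_partials(F0, 8, B)

for n, (h, p) in enumerate(zip(looped, piano), start=1):
    print(f"partial {n}: looped {h:8.2f} Hz, stiff string {p:8.2f} Hz, "
          f"offset {cents(p, h):5.2f} cents")
```

[Under this assumed model, each stiff-string partial sits above the exact multiple, and the offset grows with the partial number; a looped waveform has no such offsets.]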
Mark David: Great contribution. That's great.
Ray Kurzweil: Thanks.
Mark David: So today, the reason for the interview is to talk about the reader for the blind. But, I'd like to have you give me a little bit of context for that.
I know you've been working on it for decades, and it seems like a natural confluence for somebody who's working in OCR to be involved in working with the blind. But, I wonder if you could talk a little bit about which came first, how you got involved with the National Federation of the Blind, and how that dovetailed with the early work you were doing in OCR?
Ray Kurzweil: Sure. I mean, my connection to this project originally comes from my interest in pattern recognition, not from a personal relationship to blindness. Which is relative.
Back in the '70s, omni-font (any-font) character recognition was a classical problem, an unsolved problem. And the state of the art at that point was called template matching, which literally just had a pixel-for-pixel picture of what an A and a B and all the other letters and characters looked like. It couldn't even normalize for things like size, and it was tripped up by letters touching each other and all kinds of vagaries of real-world print.
And, in fact, they were used in something called "type and scan," where people would take an original document and retype it using either Courier or an OCR-A type font. It had to be unit width; it couldn't be proportionally spaced. And then they could scan those typed documents.
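[Editor's illustration: a minimal sketch of the pixel-for-pixel template matching described above, and of why it cannot normalize for size. The 3x5 bitmaps, the function names, and the mismatch threshold are all hypothetical, made up for this example; they do not come from any real OCR system.]

```python
# Hypothetical 3x5 bitmaps; '1' is ink, '0' is background.
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}

def mismatches(glyph, template):
    """Count differing pixels; return None when the two bitmaps differ in size."""
    if len(glyph) != len(template):
        return None
    if any(len(g) != len(t) for g, t in zip(glyph, template)):
        return None
    return sum(gp != tp for g, t in zip(glyph, template) for gp, tp in zip(g, t))

def classify(glyph, max_mismatches=1):
    """Return the best-matching character, or None if nothing fits closely enough."""
    best_char, best_score = None, None
    for char, template in TEMPLATES.items():
        score = mismatches(glyph, template)
        if score is None:
            continue  # size differs, so a pixel-for-pixel comparison is impossible
        if best_score is None or score < best_score:
            best_char, best_score = char, score
    if best_score is None or best_score > max_mismatches:
        return None
    return best_char

def scale_up(glyph, factor=2):
    """Enlarge a bitmap by repeating every pixel 'factor' times in both directions."""
    return tuple("".join(p * factor for p in row) for row in glyph for _ in range(factor))

exact_t = TEMPLATES["T"]
noisy_t = ("111", "010", "010", "010", "000")   # one pixel dropped from the stem
big_t = scale_up(exact_t)                        # same letter at twice the size

print(classify(exact_t), classify(noisy_t), classify(big_t))
```

[In this sketch the exact and slightly noisy glyphs are still recognized, but the same letter at double size matches no template at all, which is the size limitation mentioned in the interview.]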
And you might wonder what's the point of that, since it's not eliminating the manual keyboarding step? But actually, in those days terminals were very expensive and an electric typewriter was a lot cheaper, so it was actually worth it to retype the document...
Mark David: To retype to get it scanned in? Now, that's interesting....
Ray Kurzweil: Using an inexpensive typewriter, rather than these very expensive computer terminals. So I developed, with my team at Kurzweil Computer Products, the first omni-font OCR. And it could also deal with broken letters, letters touching each other, proportionally spaced print, photocopies, and things like that. It was a bit of a solution in search of a problem.
We were aware of the blind reading problem and also commercial applications, and we were reviewing these different markets. I happened to sit next to a blind gentleman on an airplane, and he was telling me how blindness is really not a handicap. And he represents his company. He flies all over the world, which he was doing right there, and conducts business all around the world. But then he said, 'Actually there's one handicap I do have, which is [that] I can't read ordinary printed material.'
And Braille is only 3 percent of the books, and most of the stuff he reads isn't books anyway. And same thing with tape recordings and recorded books.
'If I could read my inter-office memos and other printed material on my own, that would overcome this handicap.' And that was inspiring enough for us to decide that would be the focus of the project. This was back in '74.
We went looking for organizations we could work with that would support the project. A lot of other organizations were interested, [but it was this] 'let us know how it goes' kind of thing. But the National Federation of the Blind was immediately enthusiastic and wanted to work with me.
And they were, of course, very helpful with funding. We raised money from foundations and from the government. But they really wanted to work with me on every facet of the project. And that's what we did. They organized seven scientists and engineers who worked with my development team, and they really...
Mark David: And that, again...
Ray Kurzweil: ...got very close...
Mark David: ...That's going all the way back to the beginning of the....
Ray Kurzweil: ...It's going back to the '70s.
And so, they worked with me on the development and the user interface and the testing, as well as things like marketing and manufacturing.
So it was a really close collaboration on the first print-to-speech reading machine, the Kurzweil Reading Machine, which we introduced January 13, 1976. And actually demonstrating it was Jim Gashel, who has the same role now....
Mark David: ...Oh, that's neat.