
Asia Online – the world’s most significant literacy project (and internet investment opportunity)

By Mike Hanlon

01:17 September 22, 2008 PDT


The opportunity in one image


“The more domains we cover, the better it gets. We have some domains now which are being translated extraordinarily well. French to English for legal documents is currently running at a BLEU score of 62 – a human scores around 65 on the BLEU scale, so we’re getting pretty good in that area.

“We now need to work up to nailing as many specific domains as we possibly can.”
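For readers unfamiliar with BLEU, the sketch below shows roughly how such a score is computed: it counts how many of the machine output's word sequences (n-grams) also appear in a human reference translation, and penalises output that is too short. This is a simplified, single-sentence, single-reference version with no smoothing, scaled to 0-100 to match the figures quoted above. The function names `ngrams` and `bleu` are illustrative only and are not taken from Asia Online's system, which is likely to score whole test sets against several references.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every run of n consecutive tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU on a 0-100 scale (assumption: one
    reference, whitespace tokenisation, no smoothing)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each n-gram's count to the number of times the reference has it.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precision_sum += math.log(clipped / total)

    # Brevity penalty: output shorter than the reference is marked down.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    return 100 * brevity_penalty * math.exp(log_precision_sum / max_n)


if __name__ == "__main__":
    print(bleu("the cat sat on the mat", "the cat sat on a mat"))
```

Because the score only rewards exact word overlap with a reference, two good human translations of the same text rarely match each other perfectly, which is why a human translator scores well short of 100.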

“What you would have dealt with previously is rules-based machine translation. Rules-based translation involves taking a bunch of very smart linguists and very smart programmers and locking them in a room for five years.

“The problem with rules-based translation is that the output is flat. If I feed it Harvard Business Review, I want the translation to read like Harvard Business Review. If I feed it an Enid Blyton children’s book, I want it to read accordingly, and I definitely don’t want it to read like Harvard Business Review, or vice versa.

“With rules-based translation, you can’t handle that. With the statistical translation we use, we can, and we can stylise the output to read like a particular genre of literature. It requires data, and we’re in the early stages of gathering a lot of this data right now. Every week we get much, much better.”

“We already have all 23 European languages operational to a certain degree. So we can translate directly from Danish to Dutch or Greek. We’re doing the same for all the Asian languages. Currently, if you want to translate a Thai document to Vietnamese, you have to translate it to English first and then get the English translated to Vietnamese. Very shortly we’ll be able to do it directly.

“Many of the documents we are currently using to make our translation engines stronger are legacy documents from translation service providers, so we might have a book in English and the same book in Japanese. We then get that same document in Thai and feed that into the system. We then marry those translations, using the English version as the key, and we then have a Japanese-to-Thai language pairing and can work towards building its accuracy.

“Where this gets interesting is with minority languages such as Khmer, which have very little data available”, Wiggins says.
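The "marrying" step Wiggins describes, using the English version as the key, can be pictured as a join between two sets of sentence pairs. The sketch below is an illustration under stated assumptions, not Asia Online's actual pipeline: it assumes both books have already been split into sentence-aligned pairs against the same English text, and the names `normalise` and `pivot_pairs`, along with the placeholder sentences, are hypothetical.

```python
def normalise(sentence):
    """Lower-case and collapse whitespace so trivially different
    copies of the same English sentence still match."""
    return " ".join(sentence.lower().split())


def pivot_pairs(en_ja, en_th):
    """Join (English, Japanese) and (English, Thai) sentence pairs on the
    English side to produce (Japanese, Thai) pairs."""
    thai_by_english = {}
    for english, thai in en_th:
        thai_by_english.setdefault(normalise(english), []).append(thai)

    pairs = []
    for english, japanese in en_ja:
        # Sentences with no matching English key are simply skipped.
        for thai in thai_by_english.get(normalise(english), []):
            pairs.append((japanese, thai))
    return pairs


if __name__ == "__main__":
    # Placeholder strings stand in for real Japanese and Thai sentences.
    en_ja = [("The cat sleeps.", "JA-1"), ("Hello.", "JA-2")]
    en_th = [("the cat  sleeps.", "TH-1"), ("Goodbye.", "TH-3")]
    print(pivot_pairs(en_ja, en_th))
```

Only sentences present in both translations survive the join, which is one reason a language with very little translated material yields few usable pairs this way.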
