Rockbox

  • Status Closed
  • Percent Complete 100%
  • Task Type Patches
  • Category User Interface → Language
  • Assigned To rasher
  • Operating System All players
  • Severity Low
  • Priority Very Low
  • Reported Version Daily build (which?)
  • Due in Version Undecided
  • Due Date Undecided
Attached to Project: Rockbox
Opened by Whick - 2008-08-13
Last edited by speachy - 2020-12-11

FS#9273 - Changing sapi_voice.pl To make japanese.voice sound better with specific SAPI

I would like to change some voice strings so that they sound better with a specific SAPI voice.

I don’t intend to change the strings in the *.lang files directly.
This should be done dynamically in voice.pl while the voice file is being built, shouldn’t it?

Closed by  speachy
2020-12-11 20:42
Reason for closing:  Accepted
Additional comments about closing:  

I reworked this for the current codebase and committed it as 60139cf9f1.

Whick commented on 2008-08-13 06:05

Should we write the Japanese voice strings in japanese.lang in hiragana (or katakana)?
Answering this question requires some background.
Hiragana and katakana are phonetic in almost all cases.
However, they are not phonetic for prolonged sounds (like a romanized vowel with a macron), and the rule for those has exceptions.

I won't explain the rule here because it is very difficult for anyone but a native Japanese speaker.

Besides, the accent of many words may be broken by writing them in hiragana (or katakana).

Which voice strings are misread by SAPI?

 For now, I am talking only about the SAPI voices LH Kenji and LH Naoko bundled with the Japanese-localized Office XP or 2003.
 (As an aside, Kenji is a common male name like Jack, and Naoko is a common female name like Jane.)
 They misread the voice strings below. The columns are:
 A: the voice string in japanese.lang
 B: its normal spelling in hiragana
 C: how it should be pronounced (written in hiragana)
 D: how they pronounce it
 E: what should be given to SAPI so that LH Kenji (or Naoko) pronounces the word correctly
    (i.e. if voice.pl replaces A with E before giving the voice string to sapi_voice.vbs,
     sapi_voice.vbs produces better sound.)
 A | B | C | D | E
 "大文字" | "おおもじ" | "おーもじ" | "だいもんじ" | "おー文字"
 "背景色" | "はいけいしょく" | "はいけーしょく" | "はいけーいろ" | "背景蜀"
 "文字色" | "もじしょく" | "もじしょく" | "もじしょく" | "文字蜀"

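The replacement of column A by column E proposed above can be sketched as a small lookup keyed on the selected voice. This is only a Python illustration, not the Perl in voice.pl. The names `CORRECTIONS` and `correct_voice_string` are hypothetical, the table holds just the three examples listed above, and it is an assumption that LH Kenji and LH Naoko need the same fixes.

```python
# Hypothetical sketch: per-voice corrections applied just before a string is
# handed to the TTS engine. The entries are the three A -> E pairs listed above.
LH_JAPANESE_FIXES = {
    "大文字": "おー文字",
    "背景色": "背景蜀",
    "文字色": "文字蜀",
}

# Assumption: both bundled L&H Japanese voices need the same fixes.
CORRECTIONS = {
    "LH Kenji": LH_JAPANESE_FIXES,
    "LH Naoko": LH_JAPANESE_FIXES,
}


def correct_voice_string(text, voice_name):
    """Return text with every known misread substring replaced for voice_name."""
    fixes = CORRECTIONS.get(voice_name, {})
    # Longest keys first, so a longer phrase is never pre-empted by a shorter one.
    for wrong in sorted(fixes, key=len, reverse=True):
        text = text.replace(wrong, fixes[wrong])
    return text
```

Strings for any other voice pass through unchanged, so the strings in japanese.lang themselves stay untouched.
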
If I understood right, the attached patch should fix those three examples. Hopefully you get the idea.

Whick commented on 2008-08-14 17:48

I appreciate your continued and considerate help, and I have read the patch. May I ask whether it is right to rely on the "unknown" vendor in sapi_voice?
Distinguishing the voices by an "unknown" vendor seems somewhat questionable to me, to stretch the point a bit.
How about distinguishing them by something other than the vendor attribute?
For example, could voice.pl distinguish them by the presence of the option "/voice:LH Kenji" (or "/voice:LH Naoko")?

I'm not familiar with Perl or with the policies of the Rockbox source, so my opinion may not be well considered.
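
The idea of keying on the "/voice:" option rather than the vendor could look roughly like the following. It is a Python sketch under stated assumptions, not code from voice.pl: `voice_from_options` is a hypothetical helper, and it assumes the options arrive as one string in which each switch starts with "/" and the voice name is not quoted.

```python
import re


def voice_from_options(options):
    """Return the voice name given as /voice:<name> in options, or None."""
    # The name may contain a space ("LH Kenji"), so read up to the next
    # whitespace-separated "/" switch or to the end of the string.
    match = re.search(r"/voice:(.+?)(?=\s+/|$)", options)
    return match.group(1).strip() if match else None
```

A caller would then compare the result against "LH Kenji" or "LH Naoko" to decide whether the corrections apply.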

I understand your concern, but according to Jens Arnold, this is the first SAPI voice he's seen that didn't specify a vendor, so it is probably not a problem. I don't know how the voice selection is done for SAPI - Jens Arnold is probably the best to comment on this question.

For now, I think it's okay to use the "(unknown)" vendor.

It should be possible to work out the correct vendor from the voice name if the engine doesn't specify a 'Vendor' attribute. However, this requires that the engine properly specifies the 'Name' attribute. This needs to be checked on a machine with those Japanese L&H engines installed. I can't do this myself, as I don't have a Japanese version of Office 2003, and Microsoft only offers the SAPI4 version of those engines for separate download.
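
As a rough illustration of that fallback, here is a Python sketch (not the VBScript in sapi_voice.vbs). `resolve_vendor` is a hypothetical helper, and the assumption that L&H voice names begin with "LH " rests only on the names "LH Kenji" and "LH Naoko" mentioned in this task.

```python
def resolve_vendor(attributes):
    """Return a vendor string for a voice described by a dict of attributes."""
    vendor = attributes.get("Vendor", "").strip()
    if vendor:
        # The engine reported a vendor, so use it as-is.
        return vendor
    name = attributes.get("Name", "")
    # Assumption: L&H engines name their voices "LH <name>".
    if name.startswith("LH "):
        return "L&H"
    return "(unknown)"
```

If the engine leaves 'Name' empty as well, the sketch still falls back to "(unknown)", which is why the attribute has to be checked on a real installation.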

Please run this script: http://www.rockbox.org/twiki/pub/Main/VoiceBuilding/ListVoices.vbs in a console window (cscript ListVoices.vbs) and post the output somewhere (the lines mentioning the problematic L&H voices are sufficient).

Whick commented on 2008-08-15 01:51

I appreciate your attentiveness to my requirements and your quick responses.
I ran this script and redirected the output to the attached file (i.e. c:\>cscript ListVoices.vbs>result_from_ListVoices_vbs.txt).

Whick commented on 2008-08-15 02:24

I altered ListVoices.vbs to display the vendor attribute when a SAPI voice has one, so that users of other SAPI engines can check theirs, and attached its output.

Whick commented on 2008-08-15 07:34

Sorry for my duplicate post.

I've just committed a fix for sapi_voice.vbs, so that it now reports the vendor for those Japanese L&H engines as "L&H" (the same as the other L&H engines actually report).

After this, the vendor in my patch should obviously be changed from "(unknown)" to "L&H".

Does this work as intended?

I have uploaded the output of ListVoices.vbs from my environment.
The vendor of "ドキュメントトーカ" is "Create System Development Co and Ltd."

Whick commented on 2008-08-18 03:23
"Does this work as intended?"
Jonas, thank you very much! I think it should work well, but I can't test it at the moment. I'll confirm that the current sapi_voice.vbs works within two days.

Thank you very much, MoonWolf. OK, so the Vendor attribute of the product "ドキュメントトーカ" is "Create System Development Co, Ltd.", isn't it?
Then, could I ask whether the characters of ドキュメントトーカ pronounce the voice strings of the current japanese.lang correctly? (That is, doesn't that vendor's SAPI engine misread any words?)
BTW, I've converted your result.txt to UTF-8 and uploaded it again, just in case. (It was encoded in Shift-JIS because my explanation to you was insufficient.)

In Japanese (for MoonWolf):
協力ありがとうございます。ドキュメントトーカのVendor Attributeは、"Create System Development Co, Ltd."ということですね。
それでドキュメントトーカのキャラクターは、(最新の)Japanese.langのvoice文字列を正しく発音しているか教えていただけますか? (別の言い方をすれば、そちらのSapiが読み間違えを起こしませんか?)
 なお念のため、MoonWolfがアップしてくれたresult.txtをUTF-8に変換しアップし直しました。(私の説明不足のせいで、Shift-Jisの漢字が含まれていたので)

Sorry for the duplicate post.

Whick commented on 2008-08-21 01:05

I checked three Japanese SAPI engines with the altered ListVoices.vbs, and updated sapi_voice.vbs, voice.pl and japanese.lang.

1. Added "use utf8;" to voice.pl so that it can replace multibyte voice strings.
2. Updated japanese.lang to match english.lang ("r18308: Ensure every phrase has a "user:" line - currently they are all empty.").
3. Added code to sapi_voice.vbs that distinguishes the audio format of each engine, after I determined each engine's audio format with the altered ListVoices.vbs (its output is result.txt).

Whick commented on 2008-08-21 08:18

Sorry, that result.txt was Shift-JIS encoded. This one is UTF-8 encoded.
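
For anyone who needs to do the same conversion, a minimal Python sketch follows. The helper name `sjis_to_utf8` is hypothetical, and using the "cp932" codec (the Windows variant of Shift-JIS) is an assumption about how a Japanese Windows console writes redirected output.

```python
def sjis_to_utf8(data: bytes) -> bytes:
    """Re-encode Shift-JIS (Windows code page 932) bytes as UTF-8."""
    # Decode with the Windows Shift-JIS variant, then encode as UTF-8.
    return data.decode("cp932").encode("utf-8")
```

Applied to the raw bytes of a redirected result.txt, this yields a file that displays the Japanese voice names correctly in a UTF-8 viewer.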

Whick commented on 2008-08-22 07:33

Fixed a few mistakes and resynced.

Is there anything stopping this from being committed?
