(MK) Shin Tenchi Muyo Novels OCR

    ChaudSept
    Member
    Hey guys!

    After getting burned a few days ago, I found another interesting thing, but this time let’s not get too excited.

    I may have found the three novels (Jurai, Yosho, Washu) fully OCRed, though I don’t know who did it.

    Were you aware that the novels had been OCRed?

    I’m just gonna wait for your answers (looking at you, the crew), and then I’ll show you (or not) if it’s legit (or not).

        ChaudSept
        Member
        Day 667: ain’t nobody cared.

        But I want to know if it’s usable in the future, so:

        How do I keep a post short and still show everything I want to? Meh.

        I’ll proceed just like a translator would, but the meaning of the text is not the issue here! The legitimacy of the unknown OCR is.

        This post isn’t meant to start any translation project, yet.

        Please don’t waste time reading the entire texts; maybe just take a few simple sentences and compare the different versions with the original.

        I’ve got the novel as 71 pages of Japanese text in Word format. We can easily say that cuts the translation work by... at least 25% (hell, OCRing stuff takes a long time), if and only if my files are legit.

        Just imagine if the Hasegawa novels were in that format.

        What I want to know is: is the unknown OCR legit? I think it is, see for yourself:

        (DL : Found on the 4ch*n database about Kajishima, link to JPDDL.com)

        Context: from what I understand, it’s OVA 2, “Here Comes Jurai 2”. Kajishima describes Azusa visiting the household, right before the duel I guess.

        Hope it’s OK to show a page of Jurai. Here’s the first page of the prologue of the first novel:
        [hidden]http://image.noelshack.com/fichiers/2015/16/1429049372-p005.png[/hidden]

        JP Text OCR by ChaudSept

        プロ ローグ

        樹雷皇が柾木家を訪れてから一週間が過ぎた。

        厳密に言うと訪れるょうになって一週間が過ぎている。樹雷皇は娘である第一皇女阿重 霞の結婚候補者を連れて来ては理不尽な現象にょって追い返されるという作業を七回繰り 返しているから、何のことはない毎日やつて来ているのである。その度に阿重霞が「お父 様、私には、天地様が……」と懇願しても「その天地とやらが私の選んだ男に決闘で勝た なければ認めることは出来ん」と毎度の科白を言うもんだから、これまた毎度のことなが ら柾木天地は八回目の決闘をすることと相なるのであつた。

        観衆は全部で八人と一匹、これまた八回とも同じだ。

        樹雷皇、阿主沙。

        樹雷第一皇妃、船穂。

        樹雷第二皇妃、美砂樹。

        note-

        I used ABBYY. I can’t say that it is 100% accurate; it was never reviewed by anyone but me.

        ATLAS Translation –

        Prologue

        One week passed after Jurai [**] had visited the Masaki person.

        One [natte] week has passed visit [ruyo;u] strictly speaking. When the marriage candidate of the first princess [**] heavy haze who is the daughter is brought, Jurai [**] : to an unjust phenomenon ..repeat work that [yo;tte] is sent away seven times.., [Yatsute] comes every day. Ayeka : in every case. “Father, To me, Tenchi……” Even if you entreat”If it doesn’t defeat a man whom the Tenchi and I chose by the duel, it is not possible to admit. “Because usual department white was massaged ..saying., It duels the eighth, aspect [naruno], and MasakiTenchi is [atsuta] though it is usual this.

        Spectators are eight people and 1 in all, It is the same as this and eight times.

        Jurai [**], Azusa。

        Jurai the first [**hi], Funaho。

        Jurai the second [**hi], Misaki。

        Free Translation –

        Prologue

        After Jurai 皇 visits a/the Masaki house 1 week passed.

        Rigidly speaking becoming ょう that visits 1 week is passing. 《主語なし》As for Jurai 皇, there is not what, because it is repeating the work that is sent away to an unreasonable phenomenon ょって that I bring the marriage candidate of the 1st princess 阿 heavy mist who is a daughter 7 times it is coming every day and つて. Ayeka “, a/the father, me to そ at each time of….” with even if Tenchi entreats MasakiTenchi is あつた because it becomes the that phase that does the 8th duel though it also is the case of always, because it rubbed and says the words of always that it is not possible to admit if it does not win with duel ら “the Tenchi with and the man who I chose.

        The audience is the same as 8 times also 1 head, with 8 people all told.

        Jurai 皇, Azusa.

        The 1st Jurai 皇 consort, Funaho.

        The 2nd Jurai 皇 consort, Misaki.

        JP Text OCR by John Doe

        [#見出し]プロローグ

         樹雷《じゅらい》皇《おう》が柾木《まさき》家《け》を訪《おとず》れてから一週間が過ぎた。

         厳密《げんみつ》に言うと訪れるようになって一週間が過ぎている。樹雷皇は娘《むすめ》である第一|皇女《おうじょ》阿重霞《あえか》の結婚《けっこん》候補《こうほ》者《しゃ》を連れて来ては理不尽《りふじん》な現象によって追い返されるという作業を七回|繰《く》り返しているから、何のことはない毎日やって来ているのである。その度《たび》に阿重霞が「お父様、私には、天地《てんち》様が……」と懇願《こんがん》しても「その天地とやらが私の選んだ男に決闘《けっとう》で勝たなければ認《みと》めることは出来ん」と毎度の科白《せりふ》を言うもんだから、これまた毎度のことながら柾木天地は八回目の決闘をすることと相なるのであった。

         観衆《かんしゅう》は全部で八人と一匹《いっぴき》、これまた八回とも同じだ。

         樹雷皇、阿主沙《あずさ》。

         樹雷第一|皇妃《おうひ》、船穂《ふなほ》。

         樹雷第二皇妃、美砂樹《みさき》。

        note-

        This OCR is surprising.

        First: the characters between “《》” are hiragana readings that help Japanese readers pronounce the kanji.

        Second: “|” indicates which kanji the hiragana apply to.

        Third: “#” indicates things like page changes, the presence of an image, titles…

        Fourth: we need to strip all the 《markers》 to get a proper machine translation.

        Fifth: I’ve never managed to get this kanji 柾 (the first kanji of Masaki, 柾木) out of ABBYY.
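
The cleanup described in the fourth point can be sketched in Python with three regular expressions. This is only a minimal sketch, not the tool anyone in this thread used: the function name `strip_ruby` is made up for illustration, and handling the fullwidth forms ｜ and ［＃…］ is an assumption in case a file uses them instead of the ASCII characters shown above.

```python
import re

# Reading gloss in double angle brackets, e.g. 《じゅらい》 (never spans a line break)
RUBY_READING = re.compile(r"《[^《》\n]*》")
# Editorial annotation such as [#見出し]; the fullwidth ［＃…］ form is an assumption
ANNOTATION = re.compile(r"[\[［][#＃][^\]］\n]*[\]］]")
# Bar marking where the glossed kanji start (ASCII or fullwidth)
RUBY_START = re.compile(r"[|｜]")


def strip_ruby(text: str) -> str:
    """Remove annotations, readings and bars, leaving the plain Japanese text.

    Line breaks are left untouched, so paragraphs keep their positions.
    Note: every bar is removed, including any that might be part of the prose.
    """
    text = ANNOTATION.sub("", text)
    text = RUBY_READING.sub("", text)
    return RUBY_START.sub("", text)


if __name__ == "__main__":
    print(strip_ruby("樹雷第一|皇妃《おうひ》、船穂《ふなほ》。"))
```

Because the patterns never match across a newline, running this over a whole file keeps the same number of lines, which makes it easier to check the result against the scans page by page.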

        JP Text OCR by John Doe – Ready for Machine

        プロローグ

        樹雷皇が柾木家を訪れてから一週間が過ぎた。

        厳密に言うと訪れるようになって一週間が過ぎている。樹雷皇は娘である第一皇女阿重霞の結婚候補者を連れて来ては理不尽な現象によって追い返されるという作業を七回繰り返しているから、何のことはない毎日やって来ているのである。その度に阿重霞が「お父様、私には、天地様が……」と懇願しても「その天地とやらが私の選んだ男に決闘で勝たなければ認めることは出来ん」と毎度の科白を言うもんだから、これまた毎度のことながら柾木天地は八回目の決闘をすることと相なるのであった。

        観衆は全部で八人と一匹、これまた八回とも同じだ。

        樹雷皇、阿主沙。

        樹雷第一皇妃、船穂。

        樹雷第二皇妃、美砂樹。
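
To check whether two OCRs agree without reading everything, the texts can be compared character by character. Below is a minimal sketch using Python’s standard `difflib`; the function name `ocr_differences` is hypothetical, and the sample is the second sentence of the prologue, first from the ABBYY OCR and then from the cleaned unknown OCR.

```python
import difflib


def ocr_differences(ocr_a: str, ocr_b: str) -> list[tuple[str, str, str]]:
    """Return (operation, text_in_a, text_in_b) for every spot where the OCRs differ."""
    # Drop all whitespace first: OCR output often carries spaces left by line wraps
    a = "".join(ocr_a.split())
    b = "".join(ocr_b.split())
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    abbyy = "厳密に言うと訪れるょうになって一週間が過ぎている。"
    unknown = "厳密に言うと訪れるようになって一週間が過ぎている。"
    for tag, left, right in ocr_differences(abbyy, unknown):
        print(tag, left, right)
```

If the only differences are small versus full-size kana (ょ/よ, つ/っ), as in the samples above, that points to OCR noise in one version rather than two different source texts.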

        ATLAS Translation –

        Prologue

        One week passed after Jurai [**] had visited the Masaki person.

        Strictly speaking, one week has passed since it comes to visit. When the marriage candidate of the first princess Ayeka who is the daughter is brought, Jurai [**] : ..repeat work of being sent away seven times by an unjust phenomenon.., It comes every day. Ayeka : in every case. “Father, To me, Tenchi……” Even if you entreat”If it doesn’t defeat a man whom the Tenchi and I chose by the duel, it is not possible to admit. “Because usual department white was massaged ..saying., The aspect became it the eighth duel of MasakiTenchi though it was usual this.

        Spectators are eight people and 1 in all, It is the same as this and eight times.

        Jurai [**], Azusa。

        Jurai the first [**hi], Funaho。

        Jurai the second [**hi], Misaki。

        Free Translation –

        Prologue

        After Jurai 皇 visits a/the Masaki house 1 week passed.

        Rigidly speaking 1 week when comes to visit is passing. 《主語なし》As for Jurai 皇, there is not what, because it is repeating the work that is sent away by an unreasonable phenomenon that I bring the marriage candidate of the 1st princess Ayeka who is a daughter 7 times it is coming along every day. Ayeka “, a/the father, me to そ at each time of….” with even if Tenchi entreats MasakiTenchi became ら “the Tenchi with and the man who I chose the that phase that does the 8th duel though it also is the case of always, because it rubbed and says the words of always that it is not possible to admit if it does not win with duel.

        The audience is the same as 8 times also 1 head, with 8 people all told.

        Jurai 皇, Azusa.

        The 1st Jurai 皇 consort, Funaho.

        The 2nd Jurai 皇 consort, Misaki.

        shades of blue
        Technically, there should no longer be any need for OCR software or for manually proofreading every character. Official copies of all of Kajishima’s and Hasegawa’s novels (and more) should be readily available for easy text extraction, since most of this stuff is now sold as digital ebooks on BookWalker. The files are protected AZW3 files, but it’s not too difficult to strip the DRM and extract the text afterwards. The biggest issue is that a growing number of ebooks sold on BookWalker and Amazon are region locked, meaning the publisher won’t sell them to any IP range outside Japan. Technically that could be worked around with a VPN service, but that’s another story. Here are some related examples.

        generic tenchi muyo search

        true tenchi trilogy

        true tenchi gxp

        paradise war (region locked)

        hasegawa novels

        tenchi in love novels

        darkness novel

        tokyo novel

        ai tenchi novels (region locked)

        ChaudSept
        Member

        shades of blue wrote:

        Technically, there should no longer be any need for OCR software or for manually proofreading every character. Official copies of all of Kajishima’s and Hasegawa’s novels (and more) should be readily available for easy text extraction, since most of this stuff is now sold as digital ebooks on BookWalker. The files are protected AZW3 files, but it’s not too difficult to strip the DRM and extract the text afterwards. The biggest issue is that a growing number of ebooks sold on BookWalker and Amazon are region locked, meaning the publisher won’t sell them to any IP range outside Japan. Technically that could be worked around with a VPN service, but that’s another story. Here are some related examples.

        Thank you for the info and links, I didn’t know about any of that.

        But I guess it isn’t new? There must be a reason why nobody has done that yet. If not, well, I’ve got ideas. I mean, it’s really cheap.

        Both the Ai novels and the WoG novels are Japan-only, but the rest seem buyable. That’s interesting.

        Do you think my Japanese text file with all the “《おう》” and stuff comes from an ebook like that?

        Also, there is Eternal Memory. I already have the raw scans, but I don’t know if it’s possible to rip the text of a manga the same way we rip a novel…

        Anyway, I take your info as a serious solution, even though I could use more opinions. After 20 years, most of the questions I ask myself have already been asked or answered.

        Yukinojo
        Participant
        I’m going to translate bits of the Jurai novel here and there; not sure if I can put the whole novel in this thread.
        ChaudSept
        Member
        Which section will you work on?

        If you’re gonna do rough translations you could totally post it here I guess.

        I started a new thread for the part about Kagato and Azusa, but it was an almost full translation, so... (Maybe I should at least have put it in the Translation Zone, but heh.)

        I would be interested in seeing what the novel has to say about Masaki.

        http://vignette3.wikia.nocookie.net/tenchi/images/3/35/Masakiazusa.jpg/revision/latest?cb=20120221193052

        I could help with things like finding where it is in the novel.

        Yukinojo
        Participant

        ChaudSept wrote:

        Which section will you work on? If you’re gonna do rough translations you could totally post it here I guess.

        All of it, although it is 70 pages, and compared to the In Love novel it is trickier to clarify the sentences. I’m on the prologue.

        ChaudSept
        Member
        Such motivation!

        As for the state the text comes in: yep, you have to clean it first.

        This

        Quote:

        中央にある赤いコア・ブロックに巨大な樹木《じゅもく》の外壁《がいへき》ユニットを装備《そうび》しているその船は、極《きわ》めて特異《とくい》な部類のデザインであり、多くの人間がその形を見ればどの星から来たのかすぐに分かることだろう。いくら宇宙が広いとはいえ、|樹《き》で出来た宇宙船を持っている星は一つしかない。

        Becomes this:

        中央にある赤いコア・ブロックに巨大な樹木の外壁ユニットを装備しているその船は、極めて特異な部類のデザインであり、多くの人間がその形を見ればどの星から来たのかすぐに分かることだろう。いくら宇宙が広いとはいえ、樹で出来た宇宙船を持っている星は一つしかない。

        Every “|” and “《blabla》” must disappear.

        And a quick check against the original scans doesn’t hurt.

        Yukinojo
        Participant
        I’ve cleaned up the Jurai OCR text, removing almost all of the 《》 and |. I tried to keep the text on the page it was originally on, but it shifted while I was filtering it, and some paragraphs got merged or moved. PM me if you want that version.
        Yukinojo
        Participant
        Finally figured it out:

        司仕委《じじい》= shishiī = Jijī

        I was just reading each kanji too literally.

        It can be rendered either as “old man” or as it is written: Jijī.
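
Readings like this one can be pulled out of the raw text in bulk instead of being deleted, which gives a quick name list for checking machine translations. This is a rough sketch, assuming the convention described earlier in the thread: the reading in 《》 applies to the text after a “|”, or, when there is no bar, to the run of kanji right before it. The name `ruby_pairs` and the kanji range used are assumptions.

```python
import re

# Either "|base《reading》" or "kanji-run《reading》"; the kanji range is an approximation
RUBY_PAIR = re.compile(
    r"(?:[|｜]([^|｜《》\n]+)|([\u4e00-\u9fff々]+))《([^《》\n]+)》"
)


def ruby_pairs(text: str) -> list[tuple[str, str]]:
    """List each distinct (base text, reading) pair in order of first appearance."""
    pairs: list[tuple[str, str]] = []
    for match in RUBY_PAIR.finditer(text):
        base = match.group(1) or match.group(2)
        pair = (base, match.group(3))
        if pair not in pairs:
            pairs.append(pair)
    return pairs


if __name__ == "__main__":
    sample = "樹雷皇は娘《むすめ》である第一|皇女《おうじょ》阿重霞《あえか》の"
    for base, reading in ruby_pairs(sample):
        print(base, reading)
```

A list like this shows at a glance that 阿重霞 is read あえか (Ayeka), which is exactly the kind of name a machine translator otherwise renders kanji by kanji.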

        Yukinojo
        Participant
        Sorry for the bump.

        I am going to do a translation of the Washu novel.

        These books hurt to work on.

        jgzinv
        Member
        A bajillion characters of text generally is a pain in the ass to work on.
        Almael
        Member
        It may get better after reaching the 8th or 15th or 23rd mark…

        OK, that’s probably a lie.

        Yukinojo
        Participant

        Almael wrote:

        It may get better after reaching the 8th or 15th or 23rd mark…

        OK, that’s probably a lie.

        Take the Yosho novel and run it through Google and RomajiDesu translate, because this stuff is annoying to translate.

        Almael
        Member
        Well, it’s still just a Japanese novel written only in Japanese. It’s not like the 11 bi/trilingual books of the Seikai universe, for which not even a reliable dictionary exists. Or try the 15 multilingual books of LoGH… even the Norse writing is debatable.
        Yukinojo
        Participant
        Just a reminder for anyone:

        Jurai novel: done, just needs a little cleanup

        Washu novel: first chapter done, needs to be straightened out some

        Yosho novel: not started, on the fence about whether it’s worth translating (I don’t care much for Airi)

        jgzinv
        Member
        You might take a break somewhere in there. That isn’t to say take such a long break that you drop it.

        But as history has shown, novel translating is a very long, arduous process, even more so for one person.

        Don’t get burned out.

        Granted, Airi isn’t the most pleasant, but you might find other nuggets of information that are insightful.

        Like, if I hadn’t started on GXP 13, we wouldn’t have known Dr. Clay was back in action.

        Yukinojo
        Participant

        JGZinv wrote:

        You might take a break somewhere in there. That isn’t to say take such a long break that you drop it.

        It’s not really about taking a break, more about losing interest. I want to keep going, but at the same time it gets boring. Though translating Jurai did help me confirm for myself that the little girl with the comb is Funaho’s sister.

        I’ve done almost all of Jurai and 24 out of 91 pages of Washu, and I found out Washu can fly, so I might as well keep going.
