    Invelos Forums->DVD Profiler: Contribution Discussion Page: 1... 4 5 6  Previous   Next
Parsing of Asian Names
Author Message
DVD Profiler Desktop and Mobile RegistrantStar ContributorTheMadMartian
Alien with an attitude
Registered: March 13, 2007
Reputation: Highest Rating
United States Posts: 13,201
Quoting Jubal:
Quote:
I do wonder if Ken's view of the name fields (which is not what was intended, but it's his program) could be interpreted to allow for use of CLT for Asian Names, Unicus.

Skip

It is possible, but I am leaning towards 'no' as he stated that his view on name fields was not to be considered a directive to start changing Asian names around. 
No dictator, no invader can hold an imprisoned population by force of arms forever.
There is no greater power in the universe than the need for freedom.
Against this power, governments and tyrants and armies cannot stand.
The Centauri learned this lesson once.
We will teach it to them again.
Though it take a thousand years, we will be free.
- Citizen G'Kar
DVD Profiler Unlimited RegistrantStar ContributorDanae Cassandra
Registered: Apr 11, 2004
Registered: May 26, 2007
Reputation: Great Rating
United States Posts: 2,878
Speaking of difficult parsing... these are how credits are listed for the film Mongol.  Since I had no personal knowledge when transcribing, I simply entered the credits as they appeared on screen and parsed -> first // all rest // last.

If anyone knows the cultural origin, please tell me. Or if you know how they should be parsed, let me know and I'll fix them. The film is a joint Kazakhstan/Germany/Mongolia/Russia production, with most filming taking place in Mongolia & China.

Ji Ri Mu Tu
Hong Jong Ba Tu
E Er Deng Ba Te Er
Su You Le Si Ren

I just bring these up to show that this isn't just a China/Korea/Japan issue - no agenda other than wanting to point out this is a multifaceted problem and may not have an easy solution.

In looking up the main actress, Khulan Chuluun, it seems Mongolian names are ordered like Japanese names (patronymic first, personal name second - so it would be Chuluun Khulan, technically), but since her name was displayed in Western order in the credits, that's what I went with.

Then there is the issue of other alphabets and the use of these in credits.  Synner_man has already touched on that, but again, that also isn't limited to China/Korea/Japan, but is shared with anything using an alphabet that differs from English - the Cyrillic alphabet (which is the reason my profile for Battleship Potemkin has no credits) being one example.
If more of us valued food and cheer and song above hoarded gold, it would be a merrier world.
-- Thorin Oakenshield
 Last edited: by Danae Cassandra
DVD Profiler Unlimited Registrantnuoyaxin
prev. known as ya_shin
Registered: March 13, 2007
Reputation: High Rating
Taiwan, Province of China Posts: 3,436
Quoting m.cellophane:
Quote:
We talked about it.

From the rules team chat transcripts:

Thanks for reminding me that we indeed tried to cover the issue. Seems the situation was not much different than it is now, Hollywood being most important in the world.

Quoting Unicus69:
Quote:
...or maybe I am just getting soft in my old age, in which case you can all ignore me. 

Since discussion went on for several pages it would appear that you are getting soft in your old age

Quoting Ace_of_Sevens:
Quote:
My suggestion was enter Fat//Chow Yun [Chow Yun Fat].

I suppose you mistyped and actually meant:
Yun Fat//Chow [Chow Yun Fat]

Perfect example why proper parsing should be discussed. It's Mr. Chow, not Mr. Fat. His given name is Yun Fat.

Also, I want to repeat that I agree with the fact (and most people know I have lived in Asia for 12 years now) that in the Chinese language the concept of a middle name does not exist. It is not even the case that "Yun" could be considered the given name and "Fat" the middle name; it is clearly "Yun Fat" together, and together only.
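
To make the notation above concrete, here is a minimal sketch of how a parsed name with a "credited as" string might be represented and sorted by family name. The `ParsedName` class and its fields are hypothetical names for illustration only; nothing here is taken from DVD Profiler itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedName:
    """Hypothetical container for a three-field name plus the on-screen credit."""
    first: str
    middle: str
    last: str
    credited_as: str = ""

    def notation(self) -> str:
        # first/middle/last, with the on-screen credit in brackets if present
        base = f"{self.first}/{self.middle}/{self.last}"
        return f"{base} [{self.credited_as}]" if self.credited_as else base

    def sort_key(self) -> tuple[str, str]:
        # Sort by family name first, then by given name
        return (self.last.casefold(), self.first.casefold())


names = [
    ParsedName("Tom", "", "Cruise"),
    ParsedName("Yun Fat", "", "Chow", credited_as="Chow Yun Fat"),
]

for name in sorted(names, key=ParsedName.sort_key):
    print(name.notation())
# Yun Fat//Chow [Chow Yun Fat]
# Tom//Cruise
```

With the family name in the last field, he sorts under Chow; parsed as Fat//Chow Yun he would sort under Fat.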
Achim [諾亞信; Ya-Shin//Nuo], a German in Taiwan.
Registered: May 29, 2000 (at InterVocative)
 Last edited: by nuoyaxin
DVD Profiler Unlimited RegistrantStar ContributorAce_of_Sevens
Registered: December 10, 2007
Reputation: High Rating
Posts: 3,004
Quoting ya_shin:
Quote:
Perfect example why proper parsing should be discussed. It's Mr. Chow, not Mr. Fat. His given name is Yun Fat.

Also, I want to repeat that I agree with the fact (and most people know I have lived in Asia for 12 years now) that in the Chinese language the concept of a middle name does not exist. It is not even the case that "Yun" could be considered the given name and "Fat" the middle name; it is clearly "Yun Fat" together, and together only.


Yeah. I think I got it correct the first time.

Bruce Lee and Jackie Chan, while Chinese people, do not have Chinese names (or they do, but these aren't them). This is readily apparent. CLT doesn't tell us how to parse, which is sort of the main issue here. I should also point out it is not uncommon to have dual Chinese/English credits and the names written differently in the two languages. CLT won't help with that either.
DVD Profiler Unlimited RegistrantStar ContributorWinston Smith
Don't be discommodious
Registered: March 13, 2007
United States Posts: 21,610
Ace:

Now you are picking something that has nothing to do with Profiler. Well, it does, but... Profiler at this point in time cannot deal with the Chinese character set, nor Kanji, nor Cyrillic, nor several others, so the best you can hope for is a translation.

Skip
ASSUME NOTHING!!!!!!
CBE, MBE, MoA and proud of it.
Outta here

Billy Video
DVD Profiler Desktop and Mobile RegistrantGraveworm
Registered: April 7, 2007
United Kingdom Posts: 357
Quoting Ace_of_Sevens:
Quote:
Quoting Jubal:
Quote:
What suggestion, Ace? I don't see one.


Sorting correctly means sorting by family name. This is standard in all countries that use one, AFAIK, regardless of how the name is written.


Ehm, in China they mostly sort by given name, for obvious reasons.
 Last edited: by Graveworm
DVD Profiler Desktop and Mobile RegistrantStar ContributorTaro
Registered: February 23, 2009
Reputation: High Rating
Belgium Posts: 1,580
Quoting Unicus69:
Quote:
Ken has indicated that 'last name' means 'surname'.  It doesn't take a giant leap to believe that 'middle name' means, well, 'middle name'.  For eastern names, where they do not have a middle name, that field should be left blank.  Where the documentation is needed, is to show where each part goes...'12/ /3' or '1/ /23'.

That's a good suggestion, I would think. Just look up and provide some kind of documentation. I must admit that I'm guilty of sometimes not providing enough documentation for Japanese cast members, since it all comes just as natural to me as for example saying that Tom Cruise is Tom//Cruise. I'll make a mental note of it for future submissions that I'll provide some form of documentation in my contribution notes.
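
As a rough illustration of the '12/ /3' versus '1/ /23' idea from the quote, here is a small sketch that applies such a pattern to the words of an on-screen credit. The function name `apply_pattern` is made up for this example, and it assumes single-digit, 1-based word positions, so it only handles credits of up to nine words.

```python
def apply_pattern(credit: str, pattern: str) -> tuple[str, ...]:
    """Distribute the words of a credit over first/middle/last per a pattern.

    The pattern has three '/'-separated fields; each digit is the 1-based
    position of a word in the credit. A blank field stays empty.
    """
    tokens = credit.split()
    fields = pattern.split("/")
    if len(fields) != 3:
        raise ValueError("pattern needs exactly three '/'-separated fields")

    result = []
    for field in fields:
        positions = [int(ch) for ch in field if ch.isdigit()]
        for pos in positions:
            if pos < 1 or pos > len(tokens):
                raise ValueError(f"position {pos} is outside the credit")
        result.append(" ".join(tokens[pos - 1] for pos in positions))
    return tuple(result)


print(apply_pattern("Tom Cruise", "1/ /2"))     # ('Tom', '', 'Cruise')
print(apply_pattern("Chow Yun Fat", "23/ /1"))  # ('Yun Fat', '', 'Chow')
```

The pattern itself is the documentation: it records which part of the credit went where, without the code needing to know anything about the culture behind the name.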
Blu-ray collection
DVD collection
My Games
My Trophies
DVD Profiler Unlimited RegistrantStar ContributorWinston Smith
Don't be discommodious
Registered: March 13, 2007
United States Posts: 21,610
Taro

Skip
ASSUME NOTHING!!!!!!
CBE, MBE, MoA and proud of it.
Outta here

Billy Video
DVD Profiler Unlimited RegistrantDraxen
I see shiny discs...
Registered: March 13, 2007
Finland Posts: 681
Quoting Danae Cassandra:
Quote:
Speaking of difficult parsing... these are how credits are listed for the film Mongol.  Since I had no personal knowledge when transcribing, I simply entered the credits as they appeared on screen and parsed -> first // all rest // last.

If anyone knows the cultural origin, please tell me. Or if you know how they should be parsed, let me know and I'll fix them. The film is a joint Kazakhstan/Germany/Mongolia/Russia production, with most filming taking place in Mongolia & China.

Ji Ri Mu Tu
Hong Jong Ba Tu
E Er Deng Ba Te Er
Su You Le Si Ren


I got a headache just from reading those names.

That's exactly what I was talking about in my previous message about names that consist of 4 or more parts while we have "only" 3 name fields. Correct parsing requires cultural knowledge, there's no way around it.

I would have done exactly the same as you: e.g. "E / Er Deng Ba Te / Er". If someone then corrects the contribution with knowledge of how that parsing really should go, great.
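
For what it's worth, that fallback ("first // all rest // last") is purely positional, so it can be sketched in a few lines. This is only an illustration with a made-up function name, not how Profiler does anything, and it encodes no cultural knowledge at all, which is exactly its weakness.

```python
def naive_parse(credit: str) -> tuple[str, str, str]:
    """Split a credit into (first, middle, last) purely by word position."""
    parts = credit.split()
    if not parts:
        return ("", "", "")
    if len(parts) == 1:
        return (parts[0], "", "")
    # First word, everything in between, last word
    return (parts[0], " ".join(parts[1:-1]), parts[-1])


for credit in ["Ji Ri Mu Tu", "Hong Jong Ba Tu", "E Er Deng Ba Te Er", "Su You Le Si Ren"]:
    first, middle, last = naive_parse(credit)
    print(f"{first} / {middle} / {last}")
# Ji / Ri Mu / Tu
# Hong / Jong Ba / Tu
# E / Er Deng Ba Te / Er
# Su / You Le Si / Ren
```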

edit: typo
Mika
I hate people who love me, and they hate me. (Bender Bending Rodriguez)
 Last edited: by Draxen
DVD Profiler Unlimited RegistrantStar ContributorDarklyNoon
No Godz, No Masterz
Registered: May 8, 2007
Reputation: Highest Rating
Germany Posts: 1,945
Quoting Unicus69:
Quote:
Quoting Taro:
Quote:
I think the argument that culture should be left out of credit input is a valid point. But that brings up an interesting question:

middle name: is that just a word that is in the middle of the credit? Then this indeed is unrelated to culture and whenever we see three words, we should input that name as X/Y/Z

However, if middle name is the western concept of a middle name as it appears on a passport, then willing or not, culture automatically enters into it, as we try to force a standard from western culture on non-western names.

I don't know how Ken intended that field to be used, but if it is the western concept of a middle name, then culture automatically becomes an issue, as DarklyNoon pointed out.

Ken has indicated that 'last name' means 'surname'.  It doesn't take a giant leap to believe that 'middle name' means, well, 'middle name'.  For eastern names, where they do not have a middle name, that field should be left blank.  Where the documentation is needed, is to show where each part goes...'12/ /3' or '1/ /23'.


Exactly what I think
www.tvmaze.com
DVD Profiler Unlimited Registrantnuoyaxin
prev. known as ya_shin
Registered: March 13, 2007
Reputation: High Rating
Taiwan, Province of China Posts: 3,436
Quoting Graveworm:
Quote:
Ehm, in China they mostly sort by given name, for obvious reasons.

While I am not exactly in China (although this may be arguable...) I am not sure where this comes from (unless it's sarcasm, then ignore me).

Sorting in a phone book or similar is of course still done by family name first, then by given name. Just like in Western countries.

By the way, sorting is firstly done via "strokes", so, the more "lines" are needed to write the character, the further down in the list the name will be. ...and now I am curious how priority is set between names with same stroke count (not sure why I didn't ask this before...).
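
A minimal sketch of that ordering, assuming a small hand-made stroke table (the `STROKES` dictionary holds just three surnames for illustration) and placeholder given names. Since I don't know the real tie-break rule, ties between names with the same stroke count are broken here by plain character order, which is purely an assumption.

```python
# Illustrative stroke counts for three family-name characters only
STROKES = {"王": 4, "李": 7, "周": 8}


def phone_book_key(entry: tuple[str, str]) -> tuple[int, str, str]:
    """Order by stroke count of the family name, then family name, then given name."""
    family, given = entry
    # The second and third elements are an assumed tie-break, not the real rule
    return (STROKES[family], family, given)


entries = [("周", "Yun Fat"), ("李", "B"), ("王", "A"), ("李", "A")]
ordered = sorted(entries, key=phone_book_key)

for family, given in ordered:
    print(family, given)
# 王 A
# 李 A
# 李 B
# 周 Yun Fat
```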
Achim [諾亞信; Ya-Shin//Nuo], a German in Taiwan.
Registered: May 29, 2000 (at InterVocative)
DVD Profiler Desktop and Mobile RegistrantGraveworm
Registered: April 7, 2007
United Kingdom Posts: 357
Quoting ya_shin:
Quote:
Quoting Graveworm:
Quote:
Ehm, in China they mostly sort by given name, for obvious reasons.

While I am not exactly in China (although this may be arguable...) I am not sure where this comes from (unless it's sarcasm, then ignore me).

Sorting in a phone book or similar is of course still done by family name first, then by given name. Just like in Western countries.

By the way, sorting is firstly done via "strokes", so, the more "lines" are needed to write the character, the further down in the list the name will be. ...and now I am curious how priority is set between names with same stroke count (not sure why I didn't ask this before...).


As I said, MOSTLY by given name. Of course any directory is broken down into family names, but that is almost pointless, as there are towns in China where everyone has the same family name. With over a billion people sharing 50 family names, it is next to useless. So effectively it is sorted on given name, as that is where the index comes in, which leads me on to your second point.

Characters have a radical, the first drawn part of a character. This is the index, and whilst it's based on the number of strokes, it's very arbitrary, and the only way to sort it is to use an index of radicals. Having done that, it gives you a cross-reference to the second drawn part. Again there is an index; this then gives you the page of the actual entries that start with that character. So to find a name in a phone book, you use an index to take you to the page, and that is a subset of the family names.
This process is a whole subject in Chinese schools, and it goes on for a good few years of their studies in primary school.

In Korea until relatively recently you could not marry anyone with the same family name. And there most people have one of 5 names!
 Last edited: by Graveworm
DVD Profiler Unlimited RegistrantStar ContributorWinston Smith
Don't be discommodious
Registered: March 13, 2007
United States Posts: 21,610
Grave:

What you describe is actually still quite common and until relatively recently was commonplace. I recently found a village in Germany where a very LARGE percentage of the population shares what appears to be a variant of my last name; I presume this is likely my family's place of origin on my father's side. Again, until relatively recently, nearly all of us were born, grew up and died within a five-mile radius, and in some parts of the world this is still true.

There are commonalities in all naming structures worldwide, that if one is into such things, say a lot.

Skip
ASSUME NOTHING!!!!!!
CBE, MBE, MoA and proud of it.
Outta here

Billy Video
DVD Profiler Unlimited Registrantnuoyaxin
prev. known as ya_shin
Registered: March 13, 2007
Reputation: High Rating
Taiwan, Province of China Posts: 3,436
Thank you for the detailed information, Graveworm

Also, I see now what you meant, since there are only a "few" family names in use they are technically sorting by given name...
Achim [諾亞信; Ya-Shin//Nuo], a German in Taiwan.
Registered: May 29, 2000 (at InterVocative)