Invelos Forums -> DVD Profiler: Contribution Discussion
Parsing of Asian Names
ninehours (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: April 3, 2007
Reputation: High Rating
United Kingdom, Posts: 1,998
Hopefully this won't end up being a 20-page pointless argument; I just want to ask a (hopefully) simple question.
There was a recent update to a profile that changed the parsing of two names from 1/2/3 to 1/ /23. I think I remember someone posting in the forums that Asian names do not have middle names. I asked a question about the contribution but got no response, so I am hoping someone who knows about Asian names can answer it. The question was:
Quote:
How do you know that the 2nd name is part of the 3rd name and not part of the 1st name?
12/ /3 and not 1/ /23
Taro (DVD Profiler Desktop and Mobile Registrant, Star Contributor)
Registered: February 23, 2009
Reputation: High Rating
Belgium, Posts: 1,580
I wouldn't put all Asian languages in the same boat (that would be like asking when to use capitals in all western languages). However, I can answer your question for Japanese names: no, they don't have middle names.

Judging by the nature of your question, I gather this is rather aimed at Chinese or Korean names (e.g. Yun Young Park), but I'm afraid I can't help you there.
Winston Smith (DVD Profiler Unlimited Registrant, Star Contributor)
Don't be discommodious
Registered: March 13, 2007
United States, Posts: 21,610
Given Taro's comment, I think the answer is obvious. Use the default 1/2/3 unless you or someone else can provide documentation to the contrary. It's not a perfect answer, I know, but we have to start somewhere.

We are not a database where parsing is mission critical, such as, oh, let's say, the Government Tax database. As long as the data appears as it does on screen, we are good; and if someone can provide documentation to support something else that still leaves the data APPEARING as it does On Screen, we are still good.

Skip
ASSUME NOTHING!!!!!!
CBE, MBE, MoA and proud of it.
Outta here

Billy Video
Taro (DVD Profiler Desktop and Mobile Registrant, Star Contributor)
Registered: February 23, 2009
Reputation: High Rating
Belgium, Posts: 1,580
Did some quick digging for Korean names, and here's what Wikipedia has to say about it:
"The family name is typically a single syllable, and the given name two syllables. There is no middle name in the Western sense."
http://en.wikipedia.org/wiki/Korean_name

Given the fact they had to adopt the Japanese naming system at some point in time, that would make sense to me.

So my example with 'Yun Young Park' would be:
first name: Yun Young
last name: Park

Take it for what it's worth of course ...
synnerman (DVD Profiler Unlimited Registrant, Star Contributor)
Take me with you. Please.
Registered: March 13, 2007
United States, Posts: 736
Taro is correct.  Koreans do not use a middle name.  The given name is two parts.  For example, brothers Ryoo Seung-wan and Ryoo Seung-beom.  Ryoo is the family name.  It would be parsed (and credited) as Ryoo//Seung-wan and Ryoo//Seung-beom.
rorymatt (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: March 24, 2007
Reputation: High Rating
United States, Posts: 2,044
I spent two years stationed in Korea, and synner_man and Taro are quite correct in their statements and parsing examples.

Rory
DVD Profiler for iOS as of 3/5/2013
DVD Profiler for Android as of 5/17/2013
Winston Smith (DVD Profiler Unlimited Registrant, Star Contributor)
Don't be discommodious
Registered: March 13, 2007
United States, Posts: 21,610
Taro:

I hope you are saying that the credit we would see On Screen is Yun Young Park; if it's not, then we are back to square one. The goal of Profiler is to "replicate" the appearance of the data, not to deal with culture; the culture is the credit. If you are back to suggesting cultural norms, then we aren't there yet, and frankly I am not sure what the answer is, unless Ken wants to let us use the CLT for such things... and that is his call.

Skip
ninehours (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: April 3, 2007
Reputation: High Rating
United Kingdom, Posts: 1,998
Quoting synner_man:
Quote:
Taro is correct.  Koreans do not use a middle name.  The given name is two parts.  For example, brothers Ryoo Seung-wan and Ryoo Seung-beom.  Ryoo is the family name.  It would be parsed (and credited) as Ryoo//Seung-wan and Ryoo//Seung-beom.

This is close to what I mean. If this name appeared in the credits as "Ryoo Seung Beom" or "Seung Beom Ryoo", how would I, as an average user with no knowledge of Asian names, know where the 2nd name should be parsed?
Thanks for the replies. I think for now it's best to do what Skip said and just default to the basic 1/2/3 parsing until someone with knowledge of Asian names comes along to correct the profile.
Taro (DVD Profiler Desktop and Mobile Registrant, Star Contributor)
Registered: February 23, 2009
Reputation: High Rating
Belgium, Posts: 1,580
Well, I guess Skip and I will just have to agree to disagree on the cultural element. Ignoring cultural elements and official grammar rules is, in my humble opinion, the fastest track to entering erroneous or inconsistent data in a database.

Leaving Korean grammar out of the equation means that the same name could be entered in a multitude of variations:
Yun / Young / Park
Yun Young // Park
Yun // Young Park
And I'm not even starting with variations by switching the last & first name field.
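
To make the ambiguity concrete, here is a minimal Python sketch that lists every way a credited name can be split across three fields while keeping the credited word order. The function names `candidate_parsings` and `render` are hypothetical and are not part of DVD Profiler; the sketch also assumes that the first and last fields may never be empty and that the on-screen credit is simply the non-empty fields joined by spaces.

```python
from typing import List, Tuple

Parsing = Tuple[str, str, str]  # (first, middle, last), in credited order


def candidate_parsings(credit: str) -> List[Parsing]:
    """Return every (first, middle, last) split that keeps the credited word order.

    Assumption for this sketch: first and last must each hold at least one
    word, while middle may be empty.
    """
    words = credit.split()
    n = len(words)
    results: List[Parsing] = []
    for i in range(1, n):          # words[:i] go into the first field
        for j in range(i, n):      # words[i:j] go into middle, words[j:] into last
            first = " ".join(words[:i])
            middle = " ".join(words[i:j])
            last = " ".join(words[j:])
            results.append((first, middle, last))
    return results


def render(parsing: Parsing) -> str:
    """Rebuild the on-screen credit by joining the non-empty fields."""
    return " ".join(part for part in parsing if part)


if __name__ == "__main__":
    for parsing in candidate_parsings("Yun Young Park"):
        print(parsing, "->", render(parsing))
```

For "Yun Young Park" this yields exactly the three splits listed above, and all three render back to the same on-screen credit, which is why the credit alone cannot tell a contributor which split to pick.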

Even worse, by continuing the way we are doing now, we could possibly create CLT results that are incorrect. Taking the above example, we could end up with CLT results that have Yun / Young / Park as the most commonly used name, which is in fact incorrect since Young is not a middle name. It would only help propagate incorrect data.
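
The concern can be illustrated with a small Python sketch. How the CLT actually ranks names is not described in this thread, so the majority count below is purely an assumption, and `most_common_parsing` and the sample submissions are hypothetical. The point is only that a tally reports what contributors typed most often, not what is correct.

```python
from collections import Counter
from typing import List, Tuple

Parsing = Tuple[str, str, str]  # (first, middle, last)


def most_common_parsing(submissions: List[Parsing]) -> Tuple[Parsing, int]:
    """Return the parsing submitted most often, together with its count."""
    counts = Counter(submissions)
    parsing, votes = counts.most_common(1)[0]
    return parsing, votes


# Hypothetical submissions: two contributors used the default 1/2/3 split,
# one used the split without a middle name.
submissions = [
    ("Yun", "Young", "Park"),
    ("Yun", "Young", "Park"),
    ("Yun Young", "", "Park"),
]

winner, votes = most_common_parsing(submissions)
print(winner, votes)
```

With these made-up submissions the default 1/2/3 split wins simply because more contributors entered it.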

The way I see it, if the name field in DVDP was just one field, then I would agree with Skip that just copying what is in the credits is sufficient.
However, it's not. We have 3 fields (first, middle, last) and it would only seem logical to me to use those fields adequately, namely to enter a middle name only if there is one. Unfortunately, credits don't expressly show which is which, so the only way to handle this would be to rely on cultural data. If we don't do that, then what's the use of having 3 separate fields? Might as well throw it all together into one big field.

Still with the above example, let's say I input it as follows:
first: Yun Young
last: Park
This is culturally correct data but also correct copy-paste data from the credits, since that's how his name was spelled in the credits. Isn't that the best of both worlds?

The problem is that the rules currently don't provide for this. This is why I've submitted a draft, which is only a proposal of course, on how to deal with Japanese names and titles. Until such matters are resolved in the rules, I'm afraid that the door remains open for inconsistent and even erroneous data to be entered.
Nexus the Sixth (DVD Profiler Unlimited Registrant, Star Contributor)
Contributor since 2002
Registered: March 13, 2007
Reputation: High Rating
Sweden, Posts: 3,196
Get rid of parsing NOW!
First registered: February 15, 2002
paulb_99 (DVD Profiler Unlimited Registrant, Star Contributor)
PSN-ID: Magnolia-Fan
Registered: March 14, 2007
Netherlands, Posts: 863
Quoting Kinoniki:
Quote:
Get rid of parsing NOW!


Completely agree with this, as I have said many times before when this issue came up. Unfortunately, some people disagree because they would lose last-name sorting. Why they need such a thing never became clear to me, though.

My opinion is simple, no parsing, copy what you see and use a common name where needed. It couldn't be simpler.

Paul
Addicted2DVD (DVD Profiler Unlimited Registrant, Star Contributor)
Registered: March 13, 2007
Reputation: Highest Rating
United States, Posts: 17,330
There are several of us who have said the same thing, Paul... but unfortunately Ken has said he doesn't want to do this, for one reason or another.
Pete
Winston Smith (DVD Profiler Unlimited Registrant, Star Contributor)
Don't be discommodious
Registered: March 13, 2007
United States, Posts: 21,610
I was afraid of that, Taro. Profiler is ONLY trying to appear as it does ON SCREEN, so cultural issues that result in something else are completely IRRELEVANT, PERIOD. If Asian actors/crew are upset that we replicate the APPEARANCE of their name On Screen, instead of bowing to their culture, perhaps they should only appear in movies which deal with other Asians, instead of appearing with Seann William Scott or any other actors that parse their names differently from themselves. We are not creating a cultural database, we are creating a FILM database. GOD, I get tired of this and the....NO I won't say it.

Skip
synnerman (DVD Profiler Unlimited Registrant, Star Contributor)
Take me with you. Please.
Registered: March 13, 2007
United States, Posts: 736
The one problem I have with your example is that the name would likely be credited as Park Yun Young, in other words, family name first.  That is the way the majority of Korean films are credited.  You would not switch it around just to match a western standard.  In other words, if it is credited as Park Yun Young, it is parsed as Park//Yun Young, not Yun Young//Park.
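
The slash shorthand used in this thread (for example Park//Yun Young) can be read as the three name fields in credited order. Below is a minimal Python sketch of that reading; `parse_slash_notation` and `as_credited` are hypothetical helpers, and the assumption that the displayed credit is just the non-empty fields joined by spaces is mine, not a documented DVD Profiler rule.

```python
from typing import Tuple

Fields = Tuple[str, str, str]  # the three name fields, in credited order


def parse_slash_notation(text: str) -> Fields:
    """Split forum shorthand such as 'Park//Yun Young' into three fields."""
    parts = [part.strip() for part in text.split("/")]
    if len(parts) != 3:
        raise ValueError(f"expected exactly two slashes, got: {text!r}")
    return parts[0], parts[1], parts[2]


def as_credited(fields: Fields) -> str:
    """Join the non-empty fields to rebuild the string shown on screen."""
    return " ".join(part for part in fields if part)


if __name__ == "__main__":
    credit = "Park Yun Young"  # family name first, as credited
    for shorthand in ("Park//Yun Young", "Yun Young//Park"):
        fields = parse_slash_notation(shorthand)
        print(shorthand, "->", as_credited(fields), as_credited(fields) == credit)
```

Only Park//Yun Young reproduces the credit "Park Yun Young"; Yun Young//Park reorders it, which is the switch to a western standard warned against above.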

As a side note, the bigger problem is that many Korean films lack any western names at all.
Taro (DVD Profiler Desktop and Mobile Registrant, Star Contributor)
Registered: February 23, 2009
Reputation: High Rating
Belgium, Posts: 1,580
Skip:

I don't have a problem with inputting the data exactly as it appears onscreen. If that is the general rule DVDP wants to go by, I wouldn't mind at all. However, in that case can you answer me these two questions please?

1. If all we are concerned about is entering data as it appears on-screen, then why is the cast & crew data split up into three fields (first, middle, last)? On-screen credits don't make a distinction between those, so all it would require from DVDP is to have just one field, not three.

2. I see the current rules provide for even the tiniest of details for western names, even going as far as telling us how to parse military ranks or how to deal with Jr. and Sr. abbreviations. DVDP has gone to great lengths to explain how to parse western names, yet for Asian names, even the most basic parsing information (first and last name parsing, just to name one) is completely missing. Isn't that a bit of a double standard? On the one hand you say we're not dealing with a cultural database, but for western names that's exactly what we are already doing: dealing with these matters from a western point of view. If you are consistent, then you would have to advocate the removal of all existing parsing rules, including those for western names (Jr, Sr, Dr, Sir, etc. all should just be entered as seen on-screen).

Finally, I'd like to point out that you are again taking a condescending tone towards other cultures. I won't go as far as to call you a racist but your mind is very narrowly focussed on western culture. I find remarks such as:
"bowing to their culture"
"perhaps they should only appear in movies which deal with other Asians"
completely inappropriate and offensive. I've never taken a condescending tone towards the parsing of western names in DVDP and I would expect the same form of respect from other users.

I am more than willing to discuss how to deal with parsing of names (any kind of names, western or other) but I will refuse to continue if you utter one more redneck remark towards non-western cultures.
Winston Smith (DVD Profiler Unlimited Registrant, Star Contributor)
Don't be discommodious
Registered: March 13, 2007
United States, Posts: 21,610
First off, Taro, that IS the Rule AND the intent.

"exactly as they are in the credits" These words are not difficult to understand.

Your other questions have been addressed and answered ad nauseam for over FOUR years. I am not going to bother again.

Now, as I have also said previously, I understand your desires, and the only current way to deal with your concerns is by use of the CLT. If that is OK with Ken it's OK by me, but he has to make that call.

Taro, I am sorry if you do not get the answer that you WANT but you get the ANSWER. Right now, the only thing that you can do is use the CLT locally or list the data however you want it to be LOCALLY.

You can complain about it all you want, but that's the deal. I don't really care about your attempts to rationalize your position so that you can twist the data to your desires. Like I said, this is not a database built around names, nor is it mission critical with respect to same; it IS a database that is built around CREDITS. IMDb will accommodate whatever you want to do and won't even ask you to prove it.

I get very weary of users who simply are not interested in how it is intended to be, but simply want to argue and spin for their own agenda.

Let me make clear, Taro, that I want the program to be able to accommodate your wishes, and I have explained how that can be done now if Ken will sign off on it, but that is his choice. All I can do is describe what the objective is, and I don't really care about rationalizations, OK? So let's work on getting Ken to allow the CLT to be used OR to come up with some other answer that does not allow for user-invented data which is different from what IS. I'm all for an answer for you to achieve what you want, just NOT the way you WANT it done. So stop arguing with me.

Skip