Help! mathematics or statistics



white camellia
05-02-2008, 12:09 PM
Any help would be greatly appreciated!!!

There are five strategies (a, b, c, d, e) that can be used to translate a special kind of word from a book. There are too many words of this kind to list them all, so we can only sample them to study how they are translated by two different translators, translator A (whose translation is named TTa) and translator B (whose translation is named TTb). The sampling is generally random, with one criterion: a word and its two translations are sampled only when the two translators translate the word differently. We have now gathered 42 items (words). A translator often applies several strategies at the same time to a single word. For example, translator A uses a, b and c for the German translation of "Heaven". Counting the uses, translator A uses a 8 times, b 17 times, c 20 times, d 13 times and e twice, while translator B uses a 12 times, b 34 times, c 7 times, d 10 times and e once. Suppose we give a percentage to each strategy like this:
by translator A, a is used with this percentage: 8 times/42 items = 19%
by translator B, a is used with this percentage: 12 times/42 items = 29%
by translator A, b is used with this percentage: 17 times/42 items = 40%
by translator B, b is used with this percentage: 34 times/42 items = 81%
by translator A, c is used with this percentage: 20 times/42 items = 48%
by translator B, c is used with this percentage: 7 times/42 items = 17%
by translator A, d is used with this percentage: 13 times/42 items = 31%
by translator B, d is used with this percentage: 10 times/42 items = 24%
by translator A, e is used with this percentage: 2/42 = 5%
by translator B, e is used with this percentage: 1/42 = 2%
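
The arithmetic above can be reproduced with a short Python sketch. The counts and the 42 items come from the post; the variable and function names are made up for illustration. Each percentage here means "share of the 42 items on which the strategy appears", so the five percentages for one translator need not sum to 100%.

```python
# Strategy counts as reported in the post; names are illustrative only.
N_ITEMS = 42
counts_a = {"a": 8, "b": 17, "c": 20, "d": 13, "e": 2}
counts_b = {"a": 12, "b": 34, "c": 7, "d": 10, "e": 1}


def per_item_percent(counts, n_items=N_ITEMS):
    """Percentage of sampled items on which each strategy was used."""
    return {s: round(100 * k / n_items) for s, k in counts.items()}


pct_a = per_item_percent(counts_a)
pct_b = per_item_percent(counts_b)

for s in counts_a:
    print(f"{s}: A {pct_a[s]}%  B {pct_b[s]}%")

# Several strategies can apply to one item, so each column can exceed 100%.
print("column sums:", sum(pct_a.values()), sum(pct_b.values()))
```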

Can we now conclude that translator A prefers translation strategy c while translator B prefers b across all words of this special kind in the book, not only the 42 items sampled here?

Is this scientific?

Or how about this:
for a used by the two translators: 8 times/12 times = 0.67
for b used by the two translators: 17 times/34 times = 0.5
for c used by the two translators: 20 times/7 times = 2.86
for d used by the two translators: 13 times/10 times = 1.3
for e used by the two translators: 2/1=2

From this can we get the same conclusion? Is this scientific?
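
A hedged Python sketch of the ratio calculation. Because both translators are measured against the same 42 items, the ratio of raw counts equals the ratio of the per-item percentages, so the ratios carry the same information as the percentages. Variable names are illustrative.

```python
from fractions import Fraction

N_ITEMS = 42
counts_a = {"a": 8, "b": 17, "c": 20, "d": 13, "e": 2}
counts_b = {"a": 12, "b": 34, "c": 7, "d": 10, "e": 1}

# Ratio of A's count to B's count for each strategy.
ratios = {s: Fraction(counts_a[s], counts_b[s]) for s in counts_a}

for s, r in ratios.items():
    print(f"{s}: {float(r):.2f}")

# Dividing both counts by the same 42 items leaves the ratio unchanged.
same_as_rate_ratio = all(
    Fraction(counts_a[s], N_ITEMS) / Fraction(counts_b[s], N_ITEMS) == ratios[s]
    for s in counts_a
)
print("ratio of counts == ratio of per-item rates:", same_as_rate_ratio)
```

With counts as small as 2 and 1 (strategy e), such a ratio is very sensitive to a single extra observation, which is one reason a ratio alone says little about preference.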

Virgil
05-02-2008, 07:38 PM
Is this scientific?

From this can we get the same conclusion? Is this scientific?

Very interesting, Camellia. I think you are showing trends, and if you were a good statistician (I'm not) you would be able to assign a confidence level based on the number of data points and variables. There are statistical tests (I don't have my statistics book at home to look them up) that one applies to the data to show correlations. Is this scientific? The material doesn't deal with science, but it's done in the manner of a scientist.

If you're interested in looking at statistical tests, Wiki seems to have a good site: http://en.wikipedia.org/wiki/Statistical_hypothesis_testing.

To be honest, trying to understand statistics is enough to make your head spin. I had a college class, and then five or six years ago I took a refresher course because I was coming across more and more statistics at work. It drives me batty. Too much mumbo-jumbo. But if a statistician were to show you, and walk you through the equations slowly, it's actually not that difficult.

papayahed
05-02-2008, 07:53 PM
I may be missing something here, but the numbers don't add up. There are 42 instances, but the number of times A used the different translations is 60 and for B it is 64.


Strategy   A count   A %      B count   B %
a          8         13.33    12        18.75
b          17        28.33    34        53.125
c          20        33.33    7         10.9375
d          13        21.67    10        15.625
e          2         3.33     1         1.5625
sum        60        100      64        100
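
The table uses a different denominator from the original post: each count is divided by that translator's total number of strategy uses (60 or 64) rather than by the 42 items. A minimal Python sketch of that normalization, with illustrative names:

```python
counts_a = {"a": 8, "b": 17, "c": 20, "d": 13, "e": 2}
counts_b = {"a": 12, "b": 34, "c": 7, "d": 10, "e": 1}


def share_of_uses(counts):
    """Return total uses and each strategy's share of them, in percent."""
    total = sum(counts.values())
    return total, {s: 100 * k / total for s, k in counts.items()}


total_a, share_a = share_of_uses(counts_a)
total_b, share_b = share_of_uses(counts_b)

print(f"total uses: A {total_a}, B {total_b}")
for s in counts_a:
    print(f"{s}: A {share_a[s]:.2f}%  B {share_b[s]:.2f}%")
```

Both views are legitimate; they answer different questions. Dividing by 42 asks "on what fraction of words does the strategy appear?", while dividing by total uses asks "what fraction of all strategy choices does it account for?".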

Virgil
05-02-2008, 09:34 PM
I may be missing something here, but the numbers don't add up. There are 42 instances, but the number of times A used the different translations is 60 and for B it is 64.


Strategy   A count   A %      B count   B %
a          8         13.33    12        18.75
b          17        28.33    34        53.125
c          20        33.33    7         10.9375
d          13        21.67    10        15.625
e          2         3.33     1         1.5625
sum        60        100      64        100


Then it failed the very first statistical test. ;)

white camellia
05-02-2008, 09:42 PM
Virgil and Papayahed, must the totals be the same? They are very likely to differ, since the two translators can apply the five strategies differently, either a single strategy at a time or several in combination.

papayahed
05-02-2008, 10:56 PM
Virgil and Papayahed, must the totals be the same? They are very likely to differ, since the two translators can apply the five strategies differently, either a single strategy at a time or several in combination.


I'm not sure what the information is telling me. There are 2 translators, A and B. When the translators use a different translation from each other, the method used is counted, right?

Where does the number 42 come from? If you look at your original percentages, A and B both add up to more than 100, so you're not really getting an accurate percentage.

white camellia
05-02-2008, 11:46 PM
I'm not sure what the information is telling me. There are 2 translators, A and B. When the translators use a different translation from each other, the method used is counted, right?

Where does the number 42 come from? If you look at your original percentages, A and B both add up to more than 100, so you're not really getting an accurate percentage.
42 is the number of samples I took from the book. Each sample is a word or word group. They are selected randomly, but I tend to select those that are translated differently by the two translators. One word or word group is often translated with several methods, e.g. a, c and e. Whenever a translator applies a method, it is counted for that translator, whether or not the other translator uses it. The sum of the original percentages is greater than 100, and I think that's because several methods can be used at the same time for the translation of one word or word group (one of the 42 samples).

white camellia
05-02-2008, 11:51 PM
I think your way of making the percentage is right. The fact that translator B used the five methods more times in total (64) than translator A (60) seems to show that translator B more often uses different strategies in combination.

papayahed
05-03-2008, 11:33 AM
Or how about this:
for a used by the two translators: 8 times/12 times = 0.67
for b used by the two translators: 17 times/34 times = 0.5
for c used by the two translators: 20 times/7 times = 2.86
for d used by the two translators: 13 times/10 times = 1.3
for e used by the two translators: 2/1=2

From this can we get the same conclusion? Is this scientific?

This isn't really telling you anything. What you could compare is the percentages of methods used between each translator (more of a normalized comparison).

i.e. Translator A prefers method c 33% of the time, compared with Translator B using method c only about 11% of the time.

tractatus
05-03-2008, 11:55 AM
I agree with post 9.

One question:
in your first table, you put "42" words into TTa, but it gives you back "60" answers, more than your input. So for some words, did TTa choose multiple outcomes?

samercury
05-03-2008, 12:22 PM
Since the sample follows the normal distribution, can't you compare the expected value for each of the possible translation percentages with the observed ones and solve it as a proportion problem (comparing z-calc with z-critical)?
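
One way to read this suggestion is as a two-proportion z-test on a single strategy. The Python sketch below uses only the standard library and rests on assumptions the thread does not establish: it treats the two translators' results as independent simple random samples of 42 items each, whereas the same 42 words were translated by both and were picked because the translations differ. With per-item data, a paired test such as McNemar's would probably fit better. Function and variable names are made up for illustration.

```python
import math

Z_CRITICAL = 1.96  # two-sided critical value at the 5% level


def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z statistic and two-sided p-value."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Strategy c: A used it on 20 of 42 items, B on 7 of 42.
z_c, p_c = two_proportion_z(20, 42, 7, 42)
# Strategy b: A used it on 17 of 42 items, B on 34 of 42.
z_b, p_b = two_proportion_z(17, 42, 34, 42)

for name, z, p in (("c", z_c, p_c), ("b", z_b, p_b)):
    verdict = "exceeds" if abs(z) > Z_CRITICAL else "does not exceed"
    print(f"strategy {name}: z = {z:.2f}, p = {p:.4f}, |z| {verdict} {Z_CRITICAL}")
```

A large |z| here only says the two sample proportions differ by more than chance would explain under the stated assumptions; it does not by itself show what either translator "prefers".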

Virgil
05-03-2008, 12:44 PM
Since the sample follows the normal distribution, can't you compare the expected value for each of the possible translation percentages with the observed ones and solve it as a proportion problem (comparing z-calc with z-critical)?

OMG, Same knows statistics. :eek:

white camellia
05-03-2008, 01:03 PM
I agree with post 9.

One question:
in your first table, you put "42" words into TTa, but it gives you back "60" answers, more than your input. So for some words, did TTa choose multiple outcomes?

1. Taoist priest
2. spirit
3. the Milky Way
4. fairyland
.
.
.
42. shrine

Now suppose that, for the Russian translation of the first sample ('Taoist priest') from the original book, translator A uses three translation strategies (a, b and d). Since every use of any strategy (there are 5 strategies to choose from) by each translator is counted in its category, the 'input' is '1' (one of the 42), but the 'outcome' is '3' (three of the 60).
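
The counting rule can be sketched in Python as a tally over per-item sets of strategies. Only the 'Taoist priest' entry comes from the post; the other two entries are hypothetical and are there just to show how the number of uses can exceed the number of items.

```python
from collections import Counter

# Strategies translator A applied to each sampled item.
# Only the first entry is from the post; the rest are hypothetical.
strategies_by_item = {
    "Taoist priest": {"a", "b", "d"},
    "spirit": {"c"},               # hypothetical
    "the Milky Way": {"b", "c"},   # hypothetical
}

uses = Counter()
for item, strategies in strategies_by_item.items():
    # Each strategy applied to an item adds one to that strategy's tally.
    uses.update(strategies)

n_items = len(strategies_by_item)
n_uses = sum(uses.values())

print(f"{n_items} items produced {n_uses} strategy uses")
print(dict(sorted(uses.items())))
```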

white camellia
05-03-2008, 01:24 PM
Since the sample follows the normal distribution, can't you compare the expected value for each of the possible translation percentages with the observed ones and solve it as a proportion problem (comparing z-calc with z-critical)?
Very good! Actually, what you describe moves up to the level of evaluating the two translation products, since it considers the effect of different percentage compositions. The percentage composition worked out from the limited number of randomly selected samples is used to show the general effect: the biggest difference between TTa and TTb is that translator A prefers c while B prefers b. The two translation strategies have quite different effects. Now, if we are to evaluate the work done by translator A, we first see whether he achieved the expected effect, and if not, we suggest a better percentage composition. But it seems complex to define a reasonable and accurate percentage composition, since the principles would involve many variables. Hmm. My question still remains:
Can the translation results for the 42 samples be used to describe the translation of all the words and word groups of the same kind throughout the book?
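
One standard way to put numbers on this question is a confidence interval for a proportion. The Python sketch below computes a Wilson score interval for how often each translator uses strategy c. It assumes the 42 items are a simple random sample of all words of this kind, which the selection rule (only words translated differently) does not guarantee, so at best it describes the words on which the two translators differ. The function name is made up for illustration.

```python
import math


def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion k/n."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Strategy c: translator A used it on 20 of 42 items, translator B on 7 of 42.
low_a, high_a = wilson_interval(20, 42)
low_b, high_b = wilson_interval(7, 42)

print(f"A, strategy c: {low_a:.3f} to {high_a:.3f}")
print(f"B, strategy c: {low_b:.3f} to {high_b:.3f}")
```

The width of each interval shows how much uncertainty 42 items leave about the rate in the whole book, even before the sampling issue is taken into account.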

white camellia
05-03-2008, 01:26 PM
This isn't really telling you anything. What you could compare is the percentages of methods used between each translator (more of a normalized comparison).

i.e. Translator A prefers method c 33% of the time, compared with Translator B using method c only about 11% of the time.
Yes, thanks, papayahed.