Weekly log: 20120115
Sometimes I really wonder whether the people (or agencies) collecting translation memories realize that they might be collecting junk.

I joined a number of segments this week. If you think I should have tweaked my segmentation rules to avoid these joins, I can tell you that you are asking for the impossible. Yes, perhaps a couple of my joins could have been avoided if I had taken the time to tweak my segmentation, but a lot of joins simply result from differences in sentence structure and punctuation rules. However much English word order has been invading the Chinese language, at a fundamental level the two languages have different word orders. That said, most of the time, if you need to more or less preserve the sentence structure (for example, in bilingual subtitles), it can be done. But there are still instances where the sentence structures of the two languages are so different that you just have to either join segments or, failing that, lie to your CAT tool.

Chinese and English are not even really that different. Imagine how unreliable translation memories are for language pairs that are even more dissimilar. I remember seeing a question on a translator’s forum about precisely such a situation. Because of the way the English source text was worded (a bulleted list embedded in a paragraph), it was not even possible to join the segments. So in the end, the consensus was that the translator just had to lie to his CAT tool: translate the whole thing without the aid of the CAT tool, and then plug arbitrary parts of the translation, in order of course, back into it. Obviously, it would be a miracle if the resulting translation memory could be trusted.

It thus amazes me that some agencies actually trust their translation memories so much that they demand “discounts” for TM matches. The issue of context aside, should they not at least realize, if they truly are professional, that translation memories just cannot be blindly trusted?