Localization Metrics 201: Moving Beyond the Basics for Optimization

In my last post, Localization Metrics 101: A Crash Course in the Basics, I proposed that tracking basic indicators like on-time deliveries, average language quality scores, and throughput is vital to establishing a solid localization program. Once you have these basics down, and especially once you have narrowed your focus to the metrics that matter to your own strategy, you can turn your attention to Level 200 issues.

Let’s talk about those now.

Quantified Quality Tiers

I was once visited at home by a vacuum cleaner salesman. He first vacuumed my living room floor with my own vacuum cleaner. Then, to show how much better his product was, he went over the same floor with his vacuum and showed me the dirt that mine had not picked up.

Of course, if he had then run my vacuum cleaner over the floor again after his, fitted with a pristine new filter like his, he would very likely have achieved nearly the same result. So while he wanted to illustrate the better performance of his vacuum, his demonstration did nothing of the sort.

I share this to underscore two points. First, just as a QA step (the vacuum) in your process will find defects (dirt), a second QA step will inevitably find things the first one missed. That doesn’t mean the first pass failed, unless you somehow believe a test can find 100% of defects.

Second, both vacuums also collected some of my carpet (false positives). A QA step in translation can report false positives, and there is a chance that the review itself will add errors. Reviewers are human, and natural language can be ambiguous and complex. Hence, how you measure quality will effectively define what “quality” means for your program.

In translation and localization we often see this happen. If you use multiple localization teams, some for translation and some for quality assurance, you may find that any two teams do not measure quality the same way. The test team is incentivized to report errors, not necessarily to report “quality.” Without benchmarking the review process itself, a report of defects alone will not tell you what percentage of the actual errors in the file the reviewer found (the ones missed are false negatives), how many false errors were reported (false positives), or how many new errors the reviewer introduced.
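One way to benchmark a reviewer is to hand them a file whose errors are already known and compare their report against that list. The sketch below assumes such a seeded file exists; the segment IDs, the `benchmark_review` function, and the field names are all hypothetical, and errors introduced by the reviewer would need a separate check of the edited file.

```python
from dataclasses import dataclass


@dataclass
class ReviewBenchmark:
    """Scores for one reviewer on a file whose true errors are already known."""
    detection_rate: float  # share of known errors the reviewer found
    missed: int            # false negatives: known errors not reported
    false_positives: int   # reported "errors" that were actually correct
    precision: float       # share of the reviewer's reports that were real errors


def benchmark_review(known_errors: set[str], reported: set[str]) -> ReviewBenchmark:
    """Compare a reviewer's reported segments against the known error list."""
    found = known_errors & reported
    false_pos = reported - known_errors
    # Guard against empty sets so the ratios are always defined.
    detection = len(found) / len(known_errors) if known_errors else 1.0
    precision = len(found) / len(reported) if reported else 1.0
    return ReviewBenchmark(
        detection_rate=detection,
        missed=len(known_errors - reported),
        false_positives=len(false_pos),
        precision=precision,
    )


# Hypothetical segment IDs for illustration only.
known = {"seg-03", "seg-07", "seg-12", "seg-20"}
reported = {"seg-03", "seg-07", "seg-15"}

result = benchmark_review(known, reported)
print(result)
```

With these made-up inputs the reviewer found two of four known errors, missed two, and raised one false alarm. Two teams scored this way on the same file can be compared directly, which a raw defect count does not allow.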

This is not to suggest that there is no role for independent quality assurance. Independence is the most important factor: a review certainly cannot be considered valid if it is performed by a vendor with a vested interest in disputing the work of a rival company. But the test itself should also be clearly defined and benchmarked. What is the test designed to detect? What is the definition of an error? More advanced programs will consider concepts such as sensitivity, tolerance, saturation, and specificity.

But it’s better to start with the basics: specifically, a standard definition of what you want to deliver and, conversely, of what you consider an error. Consider:

  • Are you using language style guides and glossaries to best equip your localization teams (i.e., achieving quality control versus quality assurance)? Are the translators and reviewers using the same instructions? 

  • Have you identified what the quality certification process looks like, in terms of acceptance criteria?

  • Are you committing an appropriate level of time and labor to the quality review process, so that you can carry relevant and useful results into subsequent projects?

  • Is your QA step part of your critical path (e.g. the test must be passed before delivery) or is it post-delivery (only for scoring)? 

If you are not first doing things like these to manage your quality tiers, you are not yet in a position to track the KPIs of a Level 200 localization program.

Theory of Constraints

Eliyahu M. Goldratt introduced the theory of constraints (TOC) back in 1984. The idea is one that most people can readily understand today. Take any factory, regardless of industry: each station has an input and an output, and together the stations form the production line. A bottleneck, that is, a breakdown of an input or an output at one of the stations, will carry the problem downstream to the next.

You can readily see this in the translation and localization context. Imagine that your American English translation team is unavailable during Fourth of July festivities, or that your Chinese desktop publishing team has hit a snag in its post-production workflow. Some of these issues will be predictable and others not, but a Level 200 localization program has resources in place to track and measure throughput, both to make output predictable and to show when scale or alternatives are needed.

In this way, your program:

  • Understands and assigns priorities to reduce the likelihood of unmanageable bottlenecks

  • Ensures that human resources are available at the right time and with the right tools

  • Identifies overall capacity needs and external factors that may influence or shape capacity
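To make the bottleneck idea concrete, here is a minimal sketch that treats each workflow step as a station with a daily capacity and reports the one that limits the whole line. The station names, the capacity figures, and both function names are illustrative assumptions, not measurements from a real program.

```python
# Hypothetical daily capacities, in words per day, for each station in the workflow.
capacities = {
    "translation": 12000,
    "review": 8000,
    "desktop_publishing": 5000,
    "final_qa": 9000,
}


def find_bottleneck(stations: dict[str, int]) -> tuple[str, int]:
    """Return the station with the lowest capacity, and that capacity."""
    name = min(stations, key=stations.get)
    return name, stations[name]


def days_to_deliver(total_words: int, stations: dict[str, int]) -> float:
    """Estimate steady-state delivery time, ignoring ramp-up and handoff delays."""
    _, limit = find_bottleneck(stations)
    return total_words / limit


bottleneck, limit = find_bottleneck(capacities)
print(bottleneck, limit)
print(days_to_deliver(100_000, capacities))
```

In this example desktop publishing caps the line at 5,000 words per day, so adding translators would not shorten a 100,000-word project. Tracking measured throughput per station over time is what lets you update these figures and see the constraint move.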

Fully Loaded Word Pricing

Ensuring the efficiency of your localization system also means accounting for the real price of the total effort. This is not just about the unit rates of translation teams. Add up everything (project management, engineering, quality control, and review) and divide that total by the number of words delivered. As a rule of thumb, the resulting fully loaded rate should usually be below your new word rate for the system to be efficient. Keep in mind that new projects increase the fully loaded unit rate, so leverage from previous projects is a major factor.
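The calculation is simple enough to sketch. The cost categories, dollar amounts, word count, and new word rate below are invented for illustration, and `fully_loaded_rate` is a hypothetical helper name.

```python
def fully_loaded_rate(costs: dict[str, float], words_delivered: int) -> float:
    """Total program cost divided by total words delivered."""
    if words_delivered <= 0:
        raise ValueError("words_delivered must be positive")
    return sum(costs.values()) / words_delivered


# Hypothetical costs for one reporting period, all in the same currency.
costs = {
    "translation": 9000.0,
    "project_management": 1500.0,
    "engineering": 1000.0,
    "quality_review": 1500.0,
}
words_delivered = 100_000  # includes words leveraged from translation memory
new_word_rate = 0.20       # assumed per-word rate for brand-new content

rate = fully_loaded_rate(costs, words_delivered)
is_efficient = rate < new_word_rate

print(f"Fully loaded rate: {rate:.3f} per word")
print(f"Below new word rate: {is_efficient}")
```

Here the total cost of 13,000 spread over 100,000 delivered words gives 0.13 per word, which is below the assumed new word rate of 0.20. Running the same calculation per quarter or per project shows whether leverage is actually pulling the figure down.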

I’ve seen large projects move through optimized systems at a fully loaded rate of one-tenth of the new word rate, although that took almost a decade to achieve. Tracking this figure as a metric helps you understand how well you are leveraging your resources, and it indicates where gains can be made through smart content development, automation, translation memory, or production optimization.

 

I cover metrics, KPIs, dashboards, and more in my webinar. See below for the link to the recording, and add your questions in the comments section.

Localization Metrics and KPIs Explained [Webinar]