Toward multi-lingual posology structuration
At Posos, our primary objective is to reduce the amount of time physicians spend on non-care tasks so that they can devote more time to patients and provide care tailored to each patient's unique needs. To that end, we have already developed a hybrid dosage structuration system combining a deterministic algorithm with a deep learning approach. However, here at Posos, we are committed to pushing boundaries even further. In this article, we will expand on our new challenge: extending posology structuration to other languages. Starting again from scratch for each target language is neither the fastest nor the smartest approach. Instead, we will delve into different approaches: leveraging translation models, relying on cross-lingual transfer, or fine-tuning Large Language Models (LLMs).
Summary
This article is part of a series explaining the ongoing work of Posos' Research and Development department. You may find an article diving into the inner workings of an LLM here, and an article on how we are tackling automatic data annotation for posology structuration is currently in preparation.
Translation-based methods
In the long run, we aim to be able to structure any posology in any language anywhere in the world.
Classical approaches would mean re-annotating a dataset and retraining a new model from scratch for every new country and every new language. Our main hypothesis is that any two given languages share, to some extent, similarities in the way they structure medical terminology: one of the main subjects of our research has been to figure out whether the knowledge and capabilities of our existing posology structuration system could be transferred to another language.
For clarity, this article focuses solely on the example of transferring the posology structuration task from French to English. The remainder of this article first tackles how to build a Named Entity Recognition (NER) engine - one of the algorithms we currently use at Posos for posology structuration in French - in a target language that lacks annotations, before considering replacing it with an approach based on Large Language Models (LLMs).
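To make the task more concrete, here is a minimal sketch of what the output of such a posology NER engine could look like on a French prescription sentence. The label names (DOSE, DRUG, FREQUENCY, DURATION) and the span format are illustrative assumptions for this article, not Posos' production schema.

```python
# Illustrative only: the label set below is a hypothetical example, not the exact
# schema used in production.
sentence = "Prendre 2 comprimés de paracétamol toutes les 6 heures pendant 5 jours"

# A posology NER engine returns labeled character spans over the sentence, e.g.:
entities = [
    {"text": "2 comprimés",         "label": "DOSE",      "start": 8,  "end": 19},
    {"text": "paracétamol",         "label": "DRUG",      "start": 23, "end": 34},
    {"text": "toutes les 6 heures", "label": "FREQUENCY", "start": 35, "end": 54},
    {"text": "pendant 5 jours",     "label": "DURATION",  "start": 55, "end": 70},
]
```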
The translate-train method

The most straightforward approach would be to rely on translation methods. The simplest one, named translate-train and illustrated above, is to use state-of-the-art translation models to automatically translate the training dataset along with its annotations, and to train a new Named Entity Recognition (NER) algorithm on this fresh dataset in the target language. However, for those annotations to retain the same information, and to ensure nothing has been lost in translation, they must be realigned. A word alignment algorithm matches the words of a sentence with those of its translation. That way, the translated annotation labels can be automatically adjusted to properly describe the training data and ensure no information has been lost.
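Below is a minimal sketch of what the translate-train data-preparation step could look like. The translation model name (a Helsinki-NLP MarianMT checkpoint from the Hugging Face hub) is our assumption for the example, and the align_words and project_labels helpers are hypothetical placeholders for whichever word aligner is chosen; none of this reflects Posos' actual implementation.

```python
# Sketch of the translate-train data-preparation step (illustrative assumptions only).
from transformers import pipeline

# Assumption: an off-the-shelf French-to-English MarianMT translation model.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")

def align_words(src_sentence: str, tgt_sentence: str) -> list[tuple[int, int]]:
    """Hypothetical helper: returns (source word index, target word index) pairs."""
    raise NotImplementedError("plug in a word aligner such as SimAlign or awesome-align")

def project_labels(entities: list[dict], alignment: list[tuple[int, int]]) -> list[dict]:
    """Hypothetical helper: moves each annotated span onto its aligned target-side words."""
    raise NotImplementedError

def translate_train_example(src_sentence: str, src_entities: list[dict]) -> dict:
    """Turn one annotated French sentence into an annotated English one."""
    tgt_sentence = translator(src_sentence)[0]["translation_text"]
    alignment = align_words(src_sentence, tgt_sentence)
    return {"text": tgt_sentence, "entities": project_labels(src_entities, alignment)}

# Applying this to the whole French dataset yields an English dataset on which
# a new NER model is then trained from scratch.
```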
The translate-test method

Another approach would be to keep the same NER engine and translate only the data to be structured, from the target language into the reference language, at the start of each structuration process. This method, named translate-test and illustrated above, is marginally slower during structuration (around 0.1 seconds longer) but has the phenomenal advantage over translate-train of not requiring a new NER model to be trained. The labels detected by the NER model in the reference language can then be matched with their counterparts in the target language via an alignment algorithm.
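As a counterpart, here is a hedged sketch of translate-test at inference time, under the same assumptions as the previous sketch: the translation model name is illustrative, french_ner stands for the already-trained reference-language engine, and align_words / project_labels are the same hypothetical helpers as above.

```python
# Sketch of translate-test at inference time, reusing the hypothetical
# align_words / project_labels helpers from the previous sketch.
from transformers import pipeline

# Assumption: an English-to-French MarianMT model brings the input back into
# the reference language of the existing NER engine.
to_reference = pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")

def structure_posology(target_sentence: str, french_ner) -> list[dict]:
    """Structure an English posology with the existing French NER engine."""
    # 1. Translate the incoming sentence into the reference language (French).
    ref_sentence = to_reference(target_sentence)[0]["translation_text"]
    # 2. Run the already-trained reference-language NER engine.
    ref_entities = french_ner(ref_sentence)
    # 3. Map each detected span back onto the original target-language sentence.
    alignment = align_words(ref_sentence, target_sentence)
    return project_labels(ref_entities, alignment)
```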
Unfortunately, extensive testing conducted by Posos' R&D department has shown the translate-test method to be consistently and significantly outperformed by the translate-train method. This was to be expected: with translate-train, intrinsic translation and alignment errors are diluted in the vast training dataset, whereas with translate-test the same errors directly impact the output of the NER system. The translate-train approach yields satisfying results; we nevertheless explored the possibility of using pre-trained multilingual models.
Multilingual approaches
These approaches build upon a revolutionary idea: instead of training an AI for a single task on tens of thousands of hand-annotated examples, it requires less human effort to start from an already pre-trained AI. Its training on a large variety of data gives it the ability to generalize knowledge at a high level. This creates a vast representation of information (a language, pictures, DNA - it could be anything) and a capacity for generalization. If you want to learn more about Large Language Models and Generative AI, you may find more over here [insert link].
Multi-lingual models

The Multilingual Language Model (MLM) goes one step further: by aggregating data in different languages, the pre-training phase "learns" these languages and acquires potential translation abilities. It is worth noting that this transfer can be performed either in a supervised fashion - the training data has to be realigned via a word alignment model - or in an unsupervised fashion, as LLMs, and all the more so MLMs, have the ability to generalize from their training data. Can such a model thus be used to transfer knowledge from one language to another?
Two different paths arise. At Posos, we carefully handpicked an MLM, fine-tuned it on an English dataset to perform the classification task, and evaluated it on target languages. We demonstrated that, with a large enough model, this approach yields the best results. It should therefore be favored, all the more so as it constitutes a ready-made solution, in contrast with translation-based solutions that require more manpower, since they rely on carefully selecting appropriate translation and alignment models. On the other hand, the translation-based approach has the fundamental advantage of requiring a smaller model, which is something to keep in mind as we aim to deploy a solution that is as fast and as efficient as possible.
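For illustration, here is a minimal sketch of the cross-lingual transfer recipe with a multilingual encoder. The choice of xlm-roberta-base as the backbone and the BIO label set are assumptions made for the example, not necessarily the model or schema retained in our experiments.

```python
# Sketch of cross-lingual transfer with a multilingual encoder (illustrative only).
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Assumption: a hypothetical BIO label schema for posology entities.
LABELS = ["O", "B-DOSE", "I-DOSE", "B-FREQUENCY", "I-FREQUENCY", "B-DURATION", "I-DURATION"]

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForTokenClassification.from_pretrained(
    "xlm-roberta-base", num_labels=len(LABELS)
)

# Step 1 (not shown): fine-tune `model` on the annotated reference-language dataset
# as a standard token-classification task, e.g. with the Trainer API.

# Step 2: at inference time, a sentence in a target language not seen during
# fine-tuning is fed directly to the model; the shared multilingual pre-training
# is what carries the task across languages.
inputs = tokenizer("Take 2 tablets every 6 hours for 5 days", return_tensors="pt")
label_ids = model(**inputs).logits.argmax(dim=-1)  # one predicted label id per token
```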
Large Language Models (LLMs)
So far, the main focus has been on classification models. However, it should be possible to rely on the supposedly broader generalization abilities of larger generative LLMs. This ability can be leveraged in a question-answering setting - interactions with the model are performed via a chat - through zero-shot prompting, i.e., when the prompt contains neither examples nor any information beyond what was already present in the pre-training or fine-tuning datasets. When this does not work, a solution is to turn to few-shot prompting, in which a little extra information - typically a handful of annotated examples - is included in the prompt. Our research team is currently experimenting with several larger LLMs, with the aim of providing a more complex structuration than NER in multiple languages.
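As an illustration, here is a hedged sketch of few-shot prompting for posology structuration through a chat API. The client, model name, prompt wording and output schema are all assumptions made for the example; dropping the in-context example from the prompt turns it into zero-shot prompting.

```python
# Sketch of few-shot prompting for posology structuration (illustrative assumptions only).
from openai import OpenAI

client = OpenAI()

# One in-context example; removing it from the prompt makes this zero-shot prompting.
FEW_SHOT_EXAMPLE = (
    'Sentence: "Prendre 2 comprimés toutes les 6 heures pendant 5 jours"\n'
    'Structured: {"dose": "2 comprimés", "frequency": "toutes les 6 heures", "duration": "5 jours"}'
)

def structure_with_llm(sentence: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any instruction-tuned chat model would do
        messages=[
            {"role": "system",
             "content": "Extract the dose, frequency and duration from the posology sentence as JSON."},
            {"role": "user",
             "content": f'{FEW_SHOT_EXAMPLE}\n\nSentence: "{sentence}"\nStructured:'},
        ],
    )
    return response.choices[0].message.content
```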
How do we choose between these methods? We still have some experiments to run, but each of these approaches has yielded promising results. The translate-train approach competes in terms of performance with cross-lingual transfer via an MLM, even though the former is labour-intensive while the latter is a ready-made solution. We also have to keep in mind that using an LLM might be a little slow for real-life applications. This is still a work in progress, but odds are that the final answer will be a carefully crafted combination of these approaches. Here at Posos, our R&D has exciting prospects!