AI Meetup Paris: Boosting NLP Models Performance
Confused about Ponicode and CircleCI? It’s not you, it’s us. Ponicode was acquired by CircleCI in March 2022, and content published before that date remains under Ponicode’s name. When in doubt: Ponicode = CircleCI.
Hamza Sayah, data scientist at Ponicode and host of the 24th AI Meetup Paris, welcomed our guests for this debate:
- Melissa Ailem, Research scientist at Lingua Custodia
- Nicolas Meric, CEO of DreamQuark
- Ilya Prokin, Lead data scientist at LoanSnap
This event dealt with the topic of NLP model performance, and in particular how to improve it through data pre-processing and fine-tuning.
If you wish to learn more about AI Meetup Paris events, you can join the community to take part in future meetings 😊
When and why are we looking to improve NLP Models performance?
In Melissa’s opinion, from Lingua Custodia, there are several scenarios that call for increasing a model’s performance:
- The results don’t match the business goals or expectations.
- The model is too old and needs to be retrained on current data.
- The data distribution has changed over time, requiring the model to be updated; in translation, for instance, usage of the term “covid” has evolved since the beginning of the pandemic.
- A similar model achieving better results also motivates one to improve the performance of one’s own model.
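The data-drift trigger above can be checked with very simple statistics. The sketch below is a toy (it assumes whitespace-tokenised English text and invented example corpora): it compares unigram frequencies of an old and a new corpus using total variation distance. In practice one would use a proper tokeniser and more robust divergence measures.

```python
from collections import Counter

def token_freqs(corpus):
    # Relative unigram frequencies over a list of documents.
    counts = Counter(w for doc in corpus for w in doc.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def drift_score(old_corpus, new_corpus):
    # Total variation distance between the two unigram distributions:
    # 0 means identical, 1 means completely disjoint vocabularies.
    p, q = token_freqs(old_corpus), token_freqs(new_corpus)
    vocab = set(p) | set(q)
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in vocab)

old = ["the flu season starts", "patients with the flu"]
new = ["covid cases are rising", "patients with covid symptoms"]
print(drift_score(old, new))  # a high score signals a vocabulary shift
```

A score tracked over time like this can serve as a cheap alert that the model’s training data no longer reflects what it sees in production.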
According to Nicolas, new vocabulary and context imply a need to improve the model in order to enrich entity recognition. The same applies when a model needs to go one step further on a similar task: for example, a model dedicated to natural language generation that should now generate whole sentences.
To Ilya from LoanSnap, it also goes beyond the functional aspect. There is another consideration to take into account: one might want to increase a model’s performance in order to reduce computational time and infrastructure costs.
One way to increase performance is to work on data preprocessing. When should we consider this step?
According to our speakers, we should of course always be working on the data and stay vigilant in order to reach the best-performing model possible. It is also recommended to regularly consult highly regarded papers, from Google or Facebook for instance, to find new ways to pre-process the data.
Before getting into data pre-processing, one needs a deep understanding of the task the model has to achieve. For instance, if the model is asked to generate text, one needs clean text upfront. Relying on domain knowledge and human intelligence to understand how experts perform the task is key to pre-processing the data in the right way. Privacy should not be overlooked either: customers might be reluctant to provide data, which then requires anonymisation, taking out names and other identifying information and making sure the model cannot capture this kind of information.
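The anonymisation step mentioned above can be sketched minimally. The patterns below are assumptions for illustration only, not a complete PII detector; production systems typically combine such rules with trained named-entity recognition to catch person names, which simple regexes miss.

```python
import re

# Illustrative-only patterns -- NOT a production PII detector.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
PHONE = re.compile(r"\+?\d[\d\s.-]{7,}\d")

def anonymise(text: str) -> str:
    """Replace obvious identifiers with placeholder tokens before training."""
    text = EMAIL.sub("<EMAIL>", text)
    text = PHONE.sub("<PHONE>", text)
    return text

print(anonymise("Reach Jane at jane.doe@example.com or +33 6 12 34 56 78."))
# -> Reach Jane at <EMAIL> or <PHONE>.
```

Note that “Jane” survives untouched, which is exactly why regex rules alone are not enough for anonymisation.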
Data preprocessing is important to increase NLP models’ performance but also to facilitate deployment and training.
Let’s now talk about fine-tuning. How would you define that process?
This step uses a pre-trained model to initialise another model that solves a related but different problem. It also covers the case of a model trained on generic data that one wants to tailor to a specific domain. In practice, this means adjusting the model’s parameters to transfer what was learned from one sector to another.
There are different use cases in which you might want to fine-tune your model:
- You are starting a new task and you know there isn’t a lot of data. You can use a more generic model and fine-tune it to answer your needs.
- When you want a generic model to be aware of some vocabulary and context specificities.
- When you wish to efficiently integrate the model in a larger pipeline.
- Fine-tuning is also a good option to save on computational time and resources required by the model.
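The idea behind these use cases can be sketched with a toy example: a frozen “pretrained” feature extractor and a small trainable head. Everything here (the features, the synthetic task, the hyperparameters) is invented for illustration; real fine-tuning would update a pre-trained network such as a Transformer, for instance with PyTorch.

```python
import math

def pretrained_features(x):
    # Stand-in for a frozen pre-trained encoder: its "weights" never change.
    return [x, x * x]

def train_head(data, lr=0.1, epochs=200):
    # Logistic-regression head fitted on top of the frozen features with SGD.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            f = pretrained_features(x)
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log-loss with respect to z
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

# Tiny synthetic task: decide whether the input is positive.
data = [(x / 10, 1 if x > 0 else 0) for x in range(-10, 11) if x != 0]
w, b = train_head(data)

def predict(x):
    f = pretrained_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0
```

Because only the head’s handful of parameters are trained while the extractor stays frozen, training converges quickly, which is the same reason fine-tuning saves computational time and resources compared to training from scratch.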
What are the tools you are using to increase the performance of models?
Different kinds of tools can help boost models’ performance in various ways:
- Saving development time, for instance Python toolkits for implementing sequence-to-sequence models such as Transformers.
- Handling complex infrastructure, like Hugging Face, which is a considerable time saver thanks to its open-source pre-trained models.
- Reducing the models’ costs and resource needs, for instance with PyTorch for the fine-tuning part.