In the field of artificial intelligence, few innovations have been as disruptive as Large Language Models (LLMs), renowned for their ability to understand and generate human-like text. These models, powered by cutting-edge technologies, are reshaping industries and the way we interact with technology. While the Generative Pre-trained Transformer (GPT) series, developed by OpenAI, stands as a testament to LLMs’ potential, it is the accessibility and versatility of platforms like Hugging Face that are truly driving this revolution.
What is Hugging Face’s Model Hub?
Traditionally, tech giants operated within proprietary frameworks, guarding their advanced AI models closely. However, a growing realization of the benefits of open-source development has led companies to release some of their most sophisticated models to the wider community. Meta, Microsoft, and others have embraced this new era of openness, contributing pre-trained models to Hugging Face’s Model Hub.
Hugging Face’s Model Hub serves as a central repository for pre-trained AI models, datasets, and resources. This open platform is democratizing AI by enabling developers, researchers, and businesses to access, fine-tune, and deploy these models for various applications. The involvement of industry giants in open-sourcing their models on this platform adds a new layer of significance to its ecosystem.
The integration of models from major tech companies into Hugging Face’s Model Hub encourages collaborative innovation. Developers and researchers from around the world can now build on these models, creating novel applications, improving performance, and tailoring solutions to specific needs. This level of collaboration has the potential to accelerate AI advancements, while leveling the playing field for smaller players who may not have had access to such resources before.
One of the most exciting aspects is integrating LLMs into production pipelines. There are several ways to achieve this, but in this article we present the solution powered by Radicalbit’s MLOps Platform. Our platform manages the whole life cycle of a model in production, from processing the data before inference, through model serving, to performance monitoring. One of its latest features is the integration of Hugging Face’s Model Hub, so that LLMs can be deployed with no code and a few clicks.
How to run a Hugging Face Model in Radicalbit’s MLOps Platform
Here’s what you need to run a Hugging Face model:
- An active platform subscription (if you don’t have one or you want to start a free trial, click here);
- 2 minutes on your hands.
First of all, you need to go to the MLOps section of the platform and create a New Model, choosing a Name and providing a Description.
Our platform gives you two options for uploading models: MLflow and Hugging Face. We’re going to use the Hugging Face option (we will write a dedicated blog post about MLflow deployment).
Our MLOps Platform connects directly to the Hugging Face APIs. To get your model, choose the Task and the Model Repository name, and that’s it. For this demonstration, we will import a fill-mask model called bert-base-uncased, a variant of BERT (Bidirectional Encoder Representations from Transformers) developed by Google (here’s the paper: https://arxiv.org/abs/1810.04805), which fills the masked words in a sentence with the most probable ones.
The previous steps take about 30 seconds of your time, and the remaining 90 go into deploying the model and making it available for inference. Press the Serve button and enjoy your Hugging Face model!
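To get a feel for what bert-base-uncased does before serving it, the model can also be tried locally. The following is a minimal sketch using the Hugging Face transformers library, not the platform's own import mechanism; the helper names mask_word, fill_mask and main are ours and purely illustrative.

```python
from typing import Dict, List


def mask_word(sentence: str, word: str, mask_token: str = "[MASK]") -> str:
    """Replace the first occurrence of `word` with BERT's mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} not found in sentence")
    return sentence.replace(word, mask_token, 1)


def fill_mask(sentence: str, model_name: str = "bert-base-uncased") -> List[Dict]:
    """Ask the model for the most probable words at the masked position."""
    # Imported lazily so mask_word() works even without transformers installed.
    from transformers import pipeline

    unmasker = pipeline("fill-mask", model=model_name)
    # For a single [MASK], each result is a dict with
    # "score", "token", "token_str" and "sequence" keys.
    return unmasker(sentence)


def main() -> None:
    masked = mask_word("Paris is the capital of France.", "capital")
    for candidate in fill_mask(masked):
        print(f"{candidate['token_str']:>12}  {candidate['score']:.3f}")
```

Calling main() downloads the model weights from the Hub on first use, so it needs the transformers package, a backend such as PyTorch, and network access.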
Just some considerations before concluding.
There are plenty of tools and solutions to deploy a Hugging Face model, so why should you choose our MLOps Platform?
Radicalbit’s platform provides an entire ecosystem of features for your machine-learning models, such as versioning, performance monitoring in production, drift and data integrity detection, and much more. In addition, it uses Pipelines to pre-process and post-process your data just before and after inference, an essential step to make raw data ready for the model and the predictions suitable for your custom use cases.
One last mention goes to Applications, which let you build and expose pre-processing, inference, and post-processing together as a single service, accessible via an HTTP call.
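The idea of wrapping inference with pre- and post-processing can be illustrated in plain Python. This is only a conceptual sketch, not the platform's Pipelines API: preprocess, postprocess, run and fake_model are hypothetical names, and fake_model returns fixed scores in place of a served model.

```python
from typing import Any, Callable, Dict, List

Prediction = Dict[str, Any]


def preprocess(raw_text: str) -> str:
    """Collapse whitespace and turn a '___' placeholder into BERT's mask token."""
    cleaned = " ".join(raw_text.split())
    return cleaned.replace("___", "[MASK]")


def postprocess(predictions: List[Prediction], min_score: float = 0.1) -> List[str]:
    """Keep only the predicted words whose score reaches the threshold."""
    return [p["token_str"] for p in predictions if p["score"] >= min_score]


def run(raw_text: str, model: Callable[[str], List[Prediction]]) -> List[str]:
    """Chain pre-processing, inference and post-processing."""
    return postprocess(model(preprocess(raw_text)))


def fake_model(sentence: str) -> List[Prediction]:
    """Stand-in for the served model; the scores are made up for illustration."""
    return [
        {"token_str": "capital", "score": 0.92},
        {"token_str": "heart", "score": 0.03},
    ]
```

With these definitions, run("Paris  is the ___ of France.", fake_model) cleans the input, calls the stand-in model, and keeps only the predictions above the threshold.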
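As a rough idea of what calling such a service might look like from client code, here is a sketch using only the Python standard library. Everything specific in it is an assumption: the URL, the bearer-token header and the JSON payload shape are placeholders, not the platform's documented interface.

```python
import json
import urllib.request

# Hypothetical values: replace with the URL and credentials of your own Application.
APP_URL = "https://example.com/my-application/predict"
API_TOKEN = "YOUR_TOKEN"


def build_request(text: str, url: str = APP_URL, token: str = API_TOKEN) -> urllib.request.Request:
    """Build a JSON POST request; the payload schema is an assumption."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def call_application(text: str) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(text), timeout=10) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it easy to inspect or log the exact payload before anything goes over the network.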
Do you want to know more about our MLOps Platform? Visit our website and book your free demonstration!