
A Conversation with Hugging Face and AWS

A Q&A with Jeff Boudier, Hugging Face's head of product and growth, and Bratin Saha, VP of machine learning at Amazon AI

Yesterday, Hugging Face, the open-source natural language processing startup, selected Amazon Web Services (AWS) as its “preferred cloud provider.” AWS users will be able to quickly train and deploy models on Amazon’s cloud machine-learning platform.

Hugging Face wants to meet more users where they naturally spend their time (AWS). Plus, Hugging Face's language models are getting hungrier and hungrier for storage and compute power. AWS, for its part, can offer cutting-edge AI language models to large customers and turnkey services to smaller users.

To get more detail on how and why the partnership came together, we spoke with Jeff Boudier, who leads product and growth at Hugging Face, and Bratin Saha, VP of Machine Learning at Amazon AI. Read on for the full conversation.

NB: This conversation has been edited for brevity and clarity.

How did this partnership come to be?

Jeff Boudier: At Hugging Face, our mission is to democratize machine learning. We’ve built an open-source community of researchers, data scientists, and machine learning engineers who are all working together to bring forward the latest state-of-the-art science in an accessible way. Transformers became the reference library and toolkit for the machine learning community to do their work, with the goal of reducing the time it takes for the latest science to make it into production.

When we talked with customers, one of the main things they wanted to do is use our latest libraries and technology to do machine learning work within Amazon SageMaker. [Editor’s note: SageMaker is Amazon’s cloud machine-learning platform.] It’s really exciting to make all of our models and technology really easy to use and leverage within SageMaker.

Bratin Saha: We share this mission of democratizing machine learning for all our customers, and Hugging Face is really the leader in NLP. It seemed like a very natural partnership that Hugging Face would use AWS as their preferred cloud provider.

The vast majority of machine learning in the cloud today happens on AWS. In fact, SageMaker is one of the fastest growing services in AWS history. We’ve built a lot of tooling that is purpose-built for machine learning training.

That’s a helpful starting point. I don't know when these conversations started, but Jeff, did you know that if Hugging Face kept growing, it would need to strike a partnership with a big public cloud provider?

JB: Everything we do is community-driven, so we've found ourselves at the core of the whole ecosystem around NLP, and now, machine learning. From chip to cloud, we’re able to see and help companies. Our customers were already using our solutions within cloud environments; AWS and Amazon SageMaker were clearly their preferred tools.

There were clearly opportunities to collaborate and make that work easier, more seamless, and optimized for better performance. With Amazon, we’ve made training new state-of-the-art models as easy as a few lines of code.
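
Editor’s note: for readers curious what “a few lines of code” looks like in practice, here is a rough sketch using the Hugging Face estimator in the SageMaker Python SDK. The training script, instance type, framework versions, and S3 path are illustrative assumptions, not the exact setup Boudier describes.

```python
# Sketch only: launch a Transformers training job on SageMaker.
# Script name, instance type, versions, and S3 path are assumptions.
import sagemaker
from sagemaker.huggingface import HuggingFace

role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

estimator = HuggingFace(
    entry_point="train.py",           # hypothetical Transformers training script
    source_dir="./scripts",           # hypothetical local directory holding it
    instance_type="ml.p3.2xlarge",    # single-GPU instance, chosen for illustration
    instance_count=1,
    role=role,
    transformers_version="4.6",
    pytorch_version="1.7",
    py_version="py36",
    hyperparameters={"epochs": 3, "model_name": "distilbert-base-uncased"},
)

# Train against data already uploaded to S3 (path is a placeholder).
estimator.fit({"train": "s3://my-bucket/train"})
```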

Quick off-hand note: It sounds like you’re both quote-unquote customer-obsessed. Bratin, anything to add?

BS: It was natural for us to partner with Hugging Face, because they have the most popular NLP models. These models benefit from being in the cloud because they often have significant compute and storage requirements.

That’s actually a perfect segue. The next thing I wanted to ask about are the trendlines in NLP, in terms of storage and compute.

JB: The exponential growth of model sizes is driving the incredible increase in performance on NLP tasks, whether it's generating text, summarizing an article, or classifying text. Over the last couple of years, performance on these tasks and benchmarks has improved really quickly, but it's happened through this exponential increase in model size.

That creates lots of challenges for ML engineers all over the world. We are very keen on mitigating this to stay true to our mission of accessibility. We’ve introduced many new techniques through our open-science contributions, starting with model distillation, which made BERT 40% smaller and 60% faster.
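
Editor’s note: the distillation work Boudier mentions is published as DistilBERT on the Hugging Face Hub. A minimal sketch of using one such public checkpoint with the transformers library (the specific model name is just one example):

```python
# Sketch only: run a distilled BERT variant through the transformers pipeline API.
from transformers import pipeline

# A publicly available DistilBERT checkpoint fine-tuned for sentiment analysis.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("Distilled models keep most of the accuracy at a fraction of the size."))
# Expected shape of output: [{'label': 'POSITIVE', 'score': ...}]
```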


BS: Two years back, when ResNet was the model that people were talking about, there were maybe 20 million parameters. In 2019, we moved to hundreds of millions of parameters. In 2020, we saw a model with three billion parameters. Now, we are moving towards more. The amount of compute being thrown at training these models has doubled every few months.

Machine learning is really built on a foundation of compute, storage, security, and other capabilities. We now have Inferentia, custom silicon on AWS, that provides a more than 40% improvement in throughput and cost per inference. Later this year, we have Trainium coming.

As Jeff was saying, the trend in machine learning is significant growth in the amount of compute and storage you need. The cloud is the best place to do this, because you can get services on demand, use them only for as long as you need, and avoid a huge upfront capital expenditure.

Recapping Stanford’s AI Index report, I recently wrote that NLP is improving so fast, it’s outpacing the measures we use to assess performance. What are your thoughts on that?

BS: First, there was BERT, and then lots of cousins: ALBERT, RoBERTa, and many others. Then, you know, you have the GPT family of models.

JB: When BERT was first introduced in 2018, it overcame the SQuAD 1.1 question-answering benchmark within about six months, so Stanford had to come up with SQuAD 2.0 to give us something to weigh models against. Within three or four months of SQuAD 2.0 being introduced, models again overcame its human baseline. We've seen that trend for a couple of years already. How to benchmark models is a very interesting and active topic within the scientific community.
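
Editor’s note: SQuAD-style extractive question answering is itself only a few lines with transformers today. A small sketch using a publicly available distilled checkpoint; the model name and example text are illustrative and not part of the benchmark itself.

```python
# Sketch only: extractive question answering with a SQuAD-fine-tuned checkpoint.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

result = qa(
    question="What did Stanford introduce after SQuAD 1.1 was surpassed?",
    context=(
        "After models overcame SQuAD 1.1, Stanford introduced SQuAD 2.0, "
        "which adds unanswerable questions to the benchmark."
    ),
)
print(result["answer"])  # expected: "SQuAD 2.0"
```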

I’m sure both of you have a lot of visibility into the most popular use cases for NLP right now. Can you extrapolate a bit and say what you see the dominant use cases of tomorrow being? Or will they not change?

JB: We see industries that are very mature in their use of NLP and have already transitioned to modern NLP. That transition is accelerating, especially in financial services, healthcare, and consumer technology.

Our vision is that every single company in the world will benefit from NLP, because every single company in the world takes text as input. You can take whole business departments and describe them as NLP functions, right?

In terms of where the puck is going, I think the main contribution from NLP is transfer learning, which comes from the Transformer-based architecture. We’ve seen great or promising results applying it to modalities beyond NLP. We’ve started introducing new models and features around speech recently, and in industry we see promising results in vision and multi-modal domains.

BS: Broadly, we think document processing is going to be an important domain. Every company has reams of documents, whether it’s financial documents or legal documents. You can automate a lot of that with use cases like expense processing, loan processing, mortgage processing, and so on. Pre-training and then fine-tuning with some domain-specific data makes the whole paradigm a lot more accessible, and reduces the amount of data you need.
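
Editor’s note: a minimal sketch of the pre-train-then-fine-tune pattern Saha describes, using the transformers Trainer. The base checkpoint and the public imdb dataset stand in for whatever domain-specific documents a company actually holds.

```python
# Sketch only: fine-tune a pretrained checkpoint on a small labeled text dataset.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "distilbert-base-uncased"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# The public imdb dataset stands in for domain documents (loans, expenses, etc.).
dataset = load_dataset("imdb")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

encoded = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1, per_device_train_batch_size=8),
    train_dataset=encoded["train"].shuffle(seed=42).select(range(2000)),  # small slice for the sketch
    eval_dataset=encoded["test"].select(range(500)),
)
trainer.train()
```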

Any closing thoughts?

BS: Machine learning is the here and now, not the future. It’s happening now. We have tens of thousands of customers transforming their businesses through machine learning. It's really important for people to get started.
