Text style transfer in a spreadsheet using Hugging Face Inference Endpoints


We shift our conversational style from informal to formal speech without thinking, depending on whether we are talking to friends or addressing a judge. Computers now have this capability too! In this post, I use text style transfer to convert informal text to formal text, and to make it easy to use, I do it in a spreadsheet.

The first step is identifying an informal-to-formal text style transfer model. Next, we deploy the model using Hugging Face Inference Endpoints, a production-grade solution for model deployment.
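
Once the endpoint is running, any client that can make an HTTP request can use it. Here is a minimal sketch in Python (the endpoint URL and token are placeholders from your endpoint dashboard, and the response format assumes a text2text-generation model):

```python
import requests

# Placeholders: copy these values from your Inference Endpoint dashboard.
ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"
HF_TOKEN = "hf_..."  # a token with access to the endpoint

def informal_to_formal(text: str) -> str:
    """Send informal text to the endpoint and return the formal rewrite."""
    response = requests.post(
        ENDPOINT_URL,
        headers={"Authorization": f"Bearer {HF_TOKEN}"},
        json={"inputs": text},
    )
    response.raise_for_status()
    # A text2text-generation endpoint typically returns [{"generated_text": ...}]
    return response.json()[0]["generated_text"]

print(informal_to_formal("hey, gonna be late, my bad"))
```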

Let's incorporate the endpoint into a Google Sheets custom function to make the model easy to use.

I added the code to Google Sheets through the Apps Script extension. Grab it here as a gist. Once that is saved, you can use the new function as a formula. Now I can do text style transfer with one simple formula!


I created a YouTube 🎥 video for a more detailed walkthrough.

Go try this out with your favorite model! For another example, check out the positive-style text model in a TikTok video.

Few-shot text classification with SetFit


Data scientists often do not have large amounts of labeled data. The issue is even more acute for problems with tens or hundreds of classes. The reality is that very few text classification problems reach the point where adding more labeled data no longer improves performance.

SetFit offers a few-shot learning approach for text classification. The paper's results show that, across many datasets, it is possible to get strong performance with far less labeled data. The technique uses contrastive learning on pairs of labeled examples to build a much larger training set for fine-tuning a sentence transformer, which then feeds a simple classification head. This approach was new to me, which is why I made a video explaining how contrastive learning helps with text classification.

I have created a Colab 📓 companion notebook at https://bit.ly/raj_setfit and a YouTube 🎥 video that provides a detailed explanation. I walk through a simple churn example to give the intuition behind SetFit. The notebook trains on the CR (customer review) dataset highlighted in the SetFit paper.
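
For reference, here is a minimal sketch of a SetFit training loop, assuming the SetFitTrainer API from the early setfit releases (the SST-2 dataset and checkpoint below are just examples, not the notebook's exact setup):

```python
from datasets import load_dataset
from sentence_transformers.losses import CosineSimilarityLoss
from setfit import SetFitModel, SetFitTrainer, sample_dataset

# Simulate few-shot learning: keep only 8 labeled examples per class.
dataset = load_dataset("sst2")
train_dataset = sample_dataset(dataset["train"], label_column="label", num_samples=8)
eval_dataset = dataset["validation"]

# Start from a pretrained sentence transformer checkpoint.
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss_class=CosineSimilarityLoss,
    num_iterations=20,  # contrastive text pairs generated per labeled example
    column_mapping={"sentence": "text", "label": "label"},
)
trainer.train()
print(trainer.evaluate())

# Predict on new text.
preds = model(["terrible customer service", "the product exceeded my expectations"])
```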

The SetFit GitHub repository contains the code, and a great deep dive on text classification can be found on Philipp's blog. For those looking to productionize a SetFit model, Philipp has also documented how to create a Hugging Face Inference Endpoint for a SetFit model.

So grab your favorite text classification dataset and give it a try!

Getting prediction intervals with conformal inference


Data scientists often overstate the certainty of their predictions. I have had engineers laugh at my point predictions and point out several types of errors in my model that create uncertainty. Prediction intervals are an excellent counterbalance for communicating the uncertainty of predictions.

Conformal inference offers a model-agnostic technique for constructing prediction intervals. It's well known within statistics but not as well established in machine learning. This post focuses on a straightforward conformal inference technique, but there are more sophisticated techniques that provide more adaptive prediction intervals.

I have created a Colab 📓 companion notebook at https://bit.ly/raj_conf and a YouTube 🎥 video that provides a detailed explanation. The explanation uses a toy example to show how conformal inference works; typical applications will use a more sophisticated methodology along with the implementations found in the resources below.
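
To make the idea concrete, here is a minimal sketch of split conformal prediction on a synthetic regression problem (not the notebook's exact example): fit a model, compute absolute residuals on a held-out calibration set, and use a quantile of those residuals as the interval half-width.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic regression data.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=2000)

# Hold out a calibration set that the model never trains on.
X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Conformity scores: absolute residuals on the calibration set.
scores = np.abs(y_cal - model.predict(X_cal))

# The (1 - alpha) quantile (with a finite-sample correction) gives the half-width.
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)

# ~90% prediction intervals for new points.
X_new = np.array([[0.5], [2.0]])
preds = model.predict(X_new)
print(list(zip(preds - q, preds + q)))
```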

For Python folks, a great package for getting started with conformal inference is MAPIE (Model Agnostic Prediction Interval Estimator). It works for tabular and time series problems.
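
Here is a minimal sketch of what that looks like with MAPIE's regression API (as of the 0.x releases; newer versions may differ, and the estimator and data are arbitrary):

```python
from mapie.regression import MapieRegressor
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Wrap any scikit-learn style regressor; MAPIE handles the conformal calibration.
mapie = MapieRegressor(RandomForestRegressor(random_state=0), method="plus", cv=5)
mapie.fit(X_train, y_train)

# y_pis has shape (n_samples, 2, n_alpha): lower and upper bounds for each alpha.
y_pred, y_pis = mapie.predict(X_test, alpha=0.1)
print(y_pred[:3])
print(y_pis[:3, :, 0])
```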

Further Resources:

Quick intro to conformal prediction using MAPIE on Medium

A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification, paper link

Awesome Conformal Prediction (lots of resources)

Explaining predictions from 🤗 transformer models


This post covers three easy-to-use 📦 packages to get started. You can also check out the Colab 📓 companion notebook at https://bit.ly/raj_explain and the YouTube 🎥 video for a deeper treatment.

Explanations help you understand a model's predictions. In the case of text, they highlight how parts of the input influenced the prediction. They are helpful for 🩺 diagnosing model issues, 👀 helping stakeholders understand how a model is working, and 🧑‍⚖️ meeting regulatory requirements. Here is an explanation 👇 using shap. For more on explanations, check out the explanations in machine learning video.

[Image: SHAP explanation of a text prediction, highlighting each token's contribution]

Let's review three packages you can use to get explanations. All of them work with transformers, provide visualizations, and only require a few lines of code.


  1. SHAP is a well-known, well-regarded, and robust package for explanations. For text, SHAP typically defaults to the Partition SHAP explainer, which makes the SHAP computation tractable by using hierarchical clustering and Owen values. The image below shows the clustering for a simple phrase. If you want to learn more about Shapley values, I have a video on Shapley values, and a deep dive on the Partition SHAP explainer is here.

[Image: hierarchical clustering of the tokens in a simple phrase]
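
A minimal sketch of generating a SHAP text explanation for a transformers pipeline (the checkpoint and input text are arbitrary examples):

```python
import shap
from transformers import pipeline

# Any text classification pipeline works; this checkpoint is just an example.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    return_all_scores=True,  # SHAP needs scores for every class
)

# shap.Explainer selects the Partition explainer for text pipelines.
explainer = shap.Explainer(classifier)
shap_values = explainer(["I was not impressed with the battery life."])

# Renders an interactive highlight of each token's contribution (in a notebook).
shap.plots.text(shap_values)
```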

  2. Transformers Interpret uses Integrated Gradients from Captum to calculate the explanations. This approach is 🐇 quicker than SHAP! Check out this Space to see a demo.

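A minimal sketch with transformers-interpret, assuming a standard sequence classification checkpoint (again an arbitrary example):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import SequenceClassificationExplainer

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Integrated Gradients attributions for each token.
cls_explainer = SequenceClassificationExplainer(model, tokenizer)
word_attributions = cls_explainer("I was not impressed with the battery life.")
print(cls_explainer.predicted_class_name)
print(word_attributions)

# Renders a highlighted visualization in a notebook.
cls_explainer.visualize()
```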

  3. Ferret is built for benchmarking interpretability techniques and includes multiple explanation methods (including Partition SHAP and Integrated Gradients). A Spaces demo for ferret is here, along with a paper that explains the various metrics incorporated in ferret.
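
A minimal sketch with ferret's Benchmark interface, which runs several explainers and scores them (the checkpoint and text are arbitrary, and method names may differ slightly between ferret versions):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from ferret import Benchmark

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Benchmark bundles several explainers (Partition SHAP, Integrated Gradients, ...).
bench = Benchmark(model, tokenizer)

# One explanation per explainer for the chosen target class.
explanations = bench.explain("I was not impressed with the battery life.", target=0)
bench.show_table(explanations)

# Faithfulness and plausibility metrics for each explanation.
evaluations = bench.evaluate_explanations(explanations, target=0)
bench.show_evaluation_table(evaluations)
```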

You can see below how explanations can differ when using different explanation methods. A great reminder that explanations for text are complicated and need to be appropriately caveated.

[Image: token attributions from different explanation methods for the same prediction]

Ready to dive in? 🟢

For a longer walkthrough of all the 📦 packages with code snippets, web-based demos, and links to documentation/papers, check out:

👉 Colab notebook: https://bit.ly/raj_explain

🎥 YouTube video: https://youtu.be/j6WbCS0GLuY

Dynamic Adversarial Data Collection

Are you looking for better training data for your models? Let me tell you about dynamic adversarial data collection!


I had a large enterprise customer ask me to incorporate this workflow into a Hugging Face private hub demo. Here are some resources I found useful: Chris Emezue put together a blog post, "How to train your model dynamically using adversarial data", and a real-life MNIST example using Spaces.

If you want an academic paper that details this process, check out Analyzing Dynamic Adversarial Training Data in the Limit. Using this approach, the paper found that models made 26% fewer errors on an expert-curated test set.

And if you prefer a video, check out my TikTok:

https://www.tiktok.com/@rajistics/video/7123667796453592366?is_from_webapp=1&sender_device=pc&web_id=7106277315414181422