New Artificial Intelligence Approach Called PromptPG Learns to Select in-Context Examples From Small Amount of Training Data via Policy Gradient When Interacting With GPT-3 API


The field of Natural Language Processing is constantly evolving, producing systems with an ever-better grasp of language. Large Language Models (LLMs) such as ChatGPT and PaLM have improved rapidly and can now generate content, summarize long pieces of text, answer questions, and complete code. But while they have succeeded in many domains, there is still one area where they lag: mathematical reasoning over tabular data.

This is why researchers from the University of California, Los Angeles, the Georgia Institute of Technology, and the Allen Institute for AI have developed a new approach called PromptPG. In this blog post, we take a closer look at how PromptPG works, its implications for LLMs, and how it can help solve complex mathematical problems that require reasoning. So if you are interested in finding out more about this approach, read on!

What is PromptPG?

PromptPG is a new approach for handling grade-level mathematical reasoning problems that combine tabular and textual data. It is based on policy gradient, a family of reinforcement learning methods that optimize a policy directly. A policy gradient method repeats three steps: sampling actions from the policy, observing the rewards, and adjusting the policy so that rewarded actions become more likely.
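The three policy gradient steps can be illustrated with a minimal sketch. This is a generic REINFORCE-style update on a toy multi-armed bandit, not the PromptPG training code; the function names, the reward probabilities, and the learning rate are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(logits, action, reward, lr):
    """Step 3: adjust the policy.

    For a softmax policy, d log pi(action) / d logit_k = 1[k == action] - p_k,
    so each logit moves by lr * reward * that gradient.
    """
    probs = softmax(logits)
    return [
        logit + lr * reward * ((1.0 if k == action else 0.0) - probs[k])
        for k, logit in enumerate(logits)
    ]


def reinforce_bandit(reward_probs, steps=2000, lr=0.1, seed=0):
    """Train a softmax policy on a toy bandit (hypothetical environment)."""
    rng = random.Random(seed)
    logits = [0.0] * len(reward_probs)
    for _ in range(steps):
        probs = softmax(logits)
        # Step 1: sample an action from the current policy.
        action = rng.choices(range(len(probs)), weights=probs)[0]
        # Step 2: observe a reward (1 with the arm's probability, else 0).
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        # Step 3: adjust the policy toward rewarded actions.
        logits = policy_gradient_step(logits, action, reward, lr)
    return softmax(logits)


if __name__ == "__main__":
    final_policy = reinforce_bandit([0.2, 0.5, 0.9])
    print([round(p, 3) for p in final_policy])
```

In this toy setting an action is simply an arm index. In a prompt-selection setting, the action would instead be the choice of in-context examples.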

PromptPG applies policy gradient to prompt construction: the policy learns to select in-context examples from the training data, and the selected examples are then used to build the prompt for each test problem. It learns this while interacting with the GPT-3 API.
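As a rough sketch of what selecting examples and building a prompt might look like, the code below samples a few candidate problems and concatenates them ahead of the test problem. Everything here is an assumption for illustration: the word-overlap score is a toy stand-in for a learned policy, the prompt layout is hypothetical, the sample problems are invented, and no GPT-3 call is made.

```python
import math
import random


def overlap_score(candidate_question, test_question):
    """Toy stand-in for a learned policy score: number of shared words."""
    a = set(candidate_question.lower().split())
    b = set(test_question.lower().split())
    return float(len(a & b))


def select_examples(candidates, test_question, k, rng):
    """Sample up to k distinct candidates, weighted by softmax of their scores."""
    remaining = list(candidates)
    chosen = []
    for _ in range(min(k, len(remaining))):
        scores = [overlap_score(c["question"], test_question) for c in remaining]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        idx = rng.choices(range(len(remaining)), weights=weights)[0]
        chosen.append(remaining.pop(idx))  # pop so no example repeats
    return chosen


def build_prompt(examples, test_table, test_question):
    """Place the solved examples first, then the unsolved test problem."""
    parts = [
        f"Table:\n{ex['table']}\nQuestion: {ex['question']}\nAnswer: {ex['answer']}"
        for ex in examples
    ]
    parts.append(f"Table:\n{test_table}\nQuestion: {test_question}\nAnswer:")
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Invented toy problems, not taken from any dataset.
    pool = [
        {"table": "pen | $2\nbook | $5", "question": "How much do a pen and a book cost?", "answer": "7"},
        {"table": "Mon | 3\nTue | 4", "question": "How many visitors came in total?", "answer": "7"},
        {"table": "apple | $1\npear | $3", "question": "How much do two pears cost?", "answer": "6"},
    ]
    question = "How much do a cap and a scarf cost?"
    shots = select_examples(pool, question, k=2, rng=random.Random(0))
    print(build_prompt(shots, "cap | $4\nscarf | $6", question))
```

The resulting string is what would be sent to the language model. Whether the model's answer is right or wrong can then serve as the reward signal for updating the selection policy.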

For training the model, the researchers behind PromptPG have developed a new dataset, Tabular Math Word Problems (TabMWP), consisting of 38,431 open-domain mathematical reasoning problems that combine text and tables. Across the dataset, there are 28,876 distinct questions, 6,153 distinct answers, and 35,442 distinct solutions.

How Does PromptPG Work?

The PromptPG interface is easy to use and offers a few simple filters. The user can choose the type of question to look up, either free text or multiple choice, and then the type of answer, such as integer, Boolean, or decimal. The user can also specify the grade, the number of table rows and columns, and the table title.

The researchers have shown that PromptPG achieves an average accuracy of 68.23% on the TabMWP dataset, a state-of-the-art result and a 5.31% gain over random selection. Several pre-trained models have been evaluated on TabMWP, including GPT-3, whose performance was previously unstable because it depends heavily on which in-context examples are selected. By learning to select the in-context examples, PromptPG reduces this variance and improves the model's performance without relying on hand-crafted heuristics.


PromptPG is a notable advance given current LLMs' limitations in solving complex mathematical problems that require reasoning. The approach can boost the performance of GPT-3 on such problems by choosing better in-context examples.

If you would like to find out more about PromptPG, check out the Paper, GitHub, and Project Page. All credit for this research goes to the researchers on this project.
