Access to educational content can be limited in certain regions of the world, and parents can feel overwhelmed when helping their children, especially with mathematics. Private tutors are not always available or can be too expensive.
We developed Mather, a Large Language Model (LLM) that serves as a mathematics tutor. Users can ask Mather questions and it will provide individualized answers and guidance.
Mather is a fine-tune of Mistral-7B-Instruct-v0.3, trained on several mathematics datasets (see below). It was trained on 8x AMD MI210 GPUs on an AMD Accelerator Cloud node, leveraging ROCm 6.1.1. The model can be run locally, without an internet connection, via a dedicated dialog user interface.
Example
- Our code, including our scripts to train, deploy and interact with the model, is hosted on GitHub: https://github.com/AmandineFlachs/Mather
- Our model Mather-v1 is hosted on Hugging Face: https://huggingface.co/AmandineFlachs/Mather-v1-gguf
- Install LM Studio and Streamlit.
- Run our Mather-v1 LLM locally with LM Studio, like any other model hosted on Hugging Face.
- Clone our repository on GitHub and go to the deploy folder.
- Create a virtual environment and install our Python dependencies:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
- Run our dialog interface for Mather-v1, which relies on Streamlit:
streamlit run streamlit_app.py
This will open a new tab in your default browser, which should look like the screenshot below. You can easily interact with the model we trained. On the left panel you can specify whether you prefer concise or detailed answers.
Also, note that our implementation is easy to customize, so the community can adapt it as they wish.
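As an illustration of such a customization, here is a minimal sketch of a chat interface in the same spirit. It is hypothetical and is not the repository's streamlit_app.py: it assumes Mather-v1 is served by LM Studio's local OpenAI-compatible server on its default port (1234), and the model name "mather-v1" is a placeholder for whatever identifier LM Studio shows for the loaded model.

import requests
import streamlit as st

# Default endpoint of LM Studio's local OpenAI-compatible server (assumed port 1234).
LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions"

st.title("Mather - mathematics tutor")
style = st.sidebar.radio("Answer style", ["Concise", "Detailed"])

if "history" not in st.session_state:
    st.session_state.history = []

# Replay the conversation so far.
for role, text in st.session_state.history:
    st.chat_message(role).write(text)

if question := st.chat_input("Ask a math question"):
    st.session_state.history.append(("user", question))
    st.chat_message("user").write(question)

    # Build an OpenAI-compatible chat payload; the system prompt carries the style choice.
    messages = [{"role": "system",
                 "content": f"You are Mather, a mathematics tutor. Give {style.lower()} answers."}]
    messages += [{"role": r, "content": t} for r, t in st.session_state.history]

    reply = requests.post(
        LM_STUDIO_URL,
        json={"model": "mather-v1", "messages": messages, "temperature": 0.2},
        timeout=120,
    ).json()["choices"][0]["message"]["content"]

    st.session_state.history.append(("assistant", reply))
    st.chat_message("assistant").write(reply)

Such a variant can be launched the same way as the provided interface, with streamlit run pointed at the file (any file name works) while LM Studio's server is running.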
How we trained our model
Using Hugging Face Transformers, we fine-tuned Mistral-7B-Instruct-v0.3 for 1 epoch on 3 datasets: MathInstruct, MetaMathQA and GSM8K (train split only, following best practices in LLM training). Training took 5 hours on 8x AMD MI210 64GB GPUs.
To replicate our model, please install the dependencies following the instructions provided by AMD. In the train folder, there are 2 main scripts to generate a model: finetune.py trains a low-rank adapter (LoRA), which can then be merged with the original model using merge.py. Follow these steps to use the 2 scripts (a minimal sketch of both scripts follows the list below):
- To fine-tune the model on 1x AMD MI210:
python3 finetune.py
- To fine-tune the model on 8x AMD MI210:
OMP_NUM_THREADS=8 python3 -m torch.distributed.launch --nproc_per_node=8 finetune.py
You can use rocm-smi to check that the expected number of MI210 GPUs is being used.
- To merge the LoRA adapter that the fine-tuning script generated with the base model, run:
python3 merge.py
- Additionally, the tokenizer configuration files from Mistral-7B-Instruct-v0.3 need to be copied to the merged model folder.
- Optionally, to deploy the model you can then generate a quantized GGUF model for efficient inference by installing llama.cpp and running:
python llama.cpp/convert_hf_to_gguf.py <path-to-merged-model> --outfile mather-v1.gguf --outtype q8_0
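For context, the core logic of the two training scripts could look like the following minimal sketch built on Hugging Face Transformers and PEFT. It is an illustration under assumptions, not the repository's actual finetune.py and merge.py: the LoRA hyperparameters, prompt format, dataset preprocessing and the output folders "lora-out" and "mather-v1-merged" are all placeholders.

import torch
from datasets import load_dataset
from peft import LoraConfig, PeftModel, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

BASE = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token

# finetune.py equivalent: wrap the base model with a LoRA adapter and train it.
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                                         target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
                                         task_type="CAUSAL_LM"))

# Shown for GSM8K only; MathInstruct and MetaMathQA would be prepared in a similar way.
def to_text(example):
    return {"text": f"[INST] {example['question']} [/INST] {example['answer']}"}

dataset = load_dataset("gsm8k", "main", split="train").map(to_text)
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
                      remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="lora-out", num_train_epochs=1,
                           per_device_train_batch_size=2, bf16=True),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("lora-out")  # saves only the LoRA adapter

# merge.py equivalent: load the adapter on top of a fresh base model and merge the weights.
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base, "lora-out").merge_and_unload()
merged.save_pretrained("mather-v1-merged")
# As noted above, the tokenizer configuration files from the base model
# still need to be copied into the merged model folder.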
Limitations
- The limitations of the original model that Mather-v1 was fine-tuned from still apply: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3
- Like all LLMs, Mather can make mistakes or answer a slightly different question than the one asked. The user can mention that there is an error, which gives Mather the opportunity to correct its answer.
- Mather tends to repeat answers in slightly different forms, such as:
Number of pears = 2
The answer is 2.
#### 2
The answer is: 2
- Mather's main language is English but it also supports other languages such as French. We've tested it and the overall quality is higher in English. Moreover, answers in different languages sometimes include sentences in English.
The original Mistral model is often more verbose than Mather. It also tends to make more mistakes than Mather on simple problems and shows higher variance in its results. Below you can see how each model answers one of our evaluation questions:
Cecilia has 5 apples and 8 pears. She gives half of her pears to Thomas. Then she gives half of her remaining pears to Lea. How many pears does Cecilia have in the end? Explain your reasoning and provide a formula.
- Mather-v1:
- Mistral-7B-Instruct-v0.3:
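For reference, the expected answer is 2: Cecilia gives 8 / 2 = 4 pears to Thomas, keeping 4, then gives 4 / 2 = 2 pears to Lea, so she ends with 8 / 2 / 2 = 2 pears.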
As previously mentioned, we've identified a number of limitations (see the Limitations section above) that we would like to address in Mather-v2. Besides improving overall performance and addressing these specific limitations, we would like to support more languages to reach an even larger audience.
Mather-v1 is a relatively small model by LLM standards and has been designed so that it does not require expensive hardware, or an internet connection after the initial download, in order to reach a wide community. We would like to go further with an even smaller model, possibly via distillation or by selecting a different base model.