PowerSheet aims to provide a new interactive approach for spreadsheet workflows, offering powerful AI tools tailored for both novice and experienced users. This solution enhances spreadsheet design and data processing while ensuring privacy through a locally deployed model running on Ryzen NPUs. PowerSheet features four core functionalities: data generation, comprehension, checking, and batch processing. Leveraging advanced language models, PowerSheet can:
- Automatically deduce the intended formula from input-output examples or natural language descriptions. ⭐
- Generate JavaScript code from user descriptions, which is used to highlight cells or apply the intended operations directly to them, without requiring any programming expertise from the user. ⭐
- Explain the meaning of the formulas in selected cells, or provide a summary of the information presented in the entire document.
- Assist users in creating the most suitable chart for the worksheet to help with data comprehension and presentation.
Built upon the advanced AMD Ryzen AI platform, PowerSheet integrates support for the latest generation of open-source large language models (LLMs): Llama3-8B. Furthermore, PowerSheet employs state-of-the-art LLM inference techniques, optimizing performance and efficiency. These advancements bring significant benefits, including faster processing times, enhanced accuracy, and an improved user experience.
2. Background
Spreadsheet applications, like Microsoft Excel or LibreOffice Calc, are widely used by people from all walks of life to organize and analyze data. Thanks to modern graphical user interfaces (GUIs), most individuals can handle basic spreadsheet operations through experience and interface prompts.
Most spreadsheet applications provide two programming models: an equation-based model for cells and a script-based model for procedural logic. Equations provide users with a set of predefined high-level functions (such as SUM, IF, VLOOKUP, FIND, etc.), allowing concise expression of data manipulations. Script interfaces, on the other hand, such as Office Visual Basic for Applications (VBA), offer a Turing-complete procedural programming paradigm typically tailored for advanced users.
However, for users unfamiliar with writing equations, conducting non-trivial operations, like "calculating the average age of users with the telephone area code (+025)," can be cumbersome. The GUIs in Excel and LibreOffice Calc only provide a collection of functions for each category and some simple templates, leaving users confused about how to implement their equations unless they specifically consult the programming guides.
For spreadsheet users, automating the process of writing data processing equations and scripts can boost productivity. Ideally, users could describe tasks in natural language, supplemented by input-output examples (if available), then launch a synthesizer to generate equations and scripts. The synthesizer would be integrated with existing spreadsheet applications, providing real-time previews and iteratively refining programs according to user needs, assisting users in completing data processing tasks while ensuring data privacy.
3. PowerSheet Features
The figure shows the commands provided by the PowerSheet plugin. Commands are divided into four groups according to the following categories, making it easy for users to navigate to the functions they need.
In the demo video, we provide a live demonstration of the AI Batch Processing, AI Generation and AI Comprehension features offered by PowerSheet. Our live demonstration runs on an AMD Ryzen AI-based Minisforum UM790 Mini PC, and the video is not accelerated. You can see how the state-of-the-art LLM inference techniques employed by PowerSheet accelerate the inference process.
3.1. Overview
3.1.1. AI Generation
To utilize PowerSheet's capabilities for content and formula generation, users need no additional knowledge. They can simply use a PC equipped with Ryzen AI, fire up their familiar spreadsheet application, and work as usual. When they encounter scenarios that require writing a non-trivial formula or cell content, all they need to do is click on the PowerSheet tab in the toolbar and follow these steps to complete the synthesis:
1. Select the cells to be analyzed (i.e., the data source, potentially spanning multiple rows and columns).
2. Select input-output examples calculated manually, or type your intent in natural language to PowerSheet (either one is optional). PowerSheet also accepts both examples and an intent together if necessary.
3. PowerSheet launches the built-in synthesizer. The synthesizer first determines whether the requirement should be fulfilled using an equation or plain data (users can also express their preference in step 2). If PowerSheet finds potential solutions, it provides real-time previews of the execution results of the equations or data on the user's worksheet.
4. If users are dissatisfied with the current results, they can instruct PowerSheet with natural language directives to amend the generated formulas, guiding PowerSheet toward the desired functionality. PowerSheet iteratively generates equations and scripts until they meet the user's requirements.
The above figure shows the task pane of the AI Generation commands. In the demo video, we show the process of a user using PowerSheet's AI Context Filling feature to complete a spreadsheet task. In this demo, the user expects to fill the selected worksheet area with the sum of its data. By selecting the original data and giving appropriate instructions to PowerSheet in natural language, the user easily completes the desired task.
3.1.2. AI Batch Processing
PowerSheet employs a large language model to free users from repetitive tasks. The AI Batch Processing feature allows users to express their intent in natural language and apply the intended operations to the selected range of cells. This feature has two main modes: selective highlighting and cell value processing. In the selective highlighting mode, PowerSheet highlights all cells that meet the requirements the user provides in natural language. In the value processing mode, PowerSheet applies the intended operation to a range of cells selected by the user. Unlike the built-in conditional formatting feature in spreadsheet applications, PowerSheet does not use spreadsheet formulas to interact with the data model. Instead, it directly utilizes the software-provided JavaScript API to manipulate data, hiding the complex details of data operations from the user while offering more flexibility.
When using traditional spreadsheet formulas, users can only filter or process their data within the constraints of the restricted programming model. However, PowerSheet bypasses the formula abstraction provided by spreadsheet applications. Upon request, it retrieves the cell addresses and data from the API, and the LLM-based code synthesizer generates a JavaScript function based on user intent. The PowerSheet framework then validates the generated function and applies it to all cells in the selected range.
The above figure shows the task pane of the AI Batch Processing commands. In the demo video, we show examples of a user using PowerSheet to complete tasks such as highlighting even-numbered data and performing addition on all selected cells.
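To make the mechanism concrete, the sketch below illustrates, in Python rather than the JavaScript that PowerSheet actually generates, the two kinds of per-cell functions the synthesizer produces for these demo tasks and how such a function might be mapped over a selected range. All function and parameter names here are hypothetical stand-ins, not PowerSheet's actual code.

# Stand-ins (in Python, for readability) for the per-cell functions the synthesizer
# emits in JavaScript; names and the two-mode split follow the description above.

def synthesized_predicate(value) -> bool:
    """Selective highlighting mode: 'highlight all even-numbered data'."""
    return isinstance(value, (int, float)) and value % 2 == 0

def synthesized_transform(value):
    """Value processing mode: 'add 10 to every selected cell'."""
    return value + 10 if isinstance(value, (int, float)) else value

def apply_to_range(cells: dict[str, object], fn, highlight: bool) -> dict[str, object]:
    """Apply the generated function to every cell in the selected range."""
    results = {}
    for address, value in cells.items():
        out = fn(value)
        results[address] = ("HIGHLIGHT" if out else "SKIP") if highlight else out
    return results

print(apply_to_range({"A1": 2, "A2": 3}, synthesized_predicate, highlight=True))   # {'A1': 'HIGHLIGHT', 'A2': 'SKIP'}
print(apply_to_range({"A1": 2, "A2": 3}, synthesized_transform, highlight=False))  # {'A1': 12, 'A2': 13}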
3.1.3. AI Comprehension and AI Checking
Based on the natural language output capability powered by the large language model, PowerSheet can help users understand their spreadsheet documents, provide insights, and look for compatibility issues between popular spreadsheet applications like Microsoft Excel, LibreOffice Calc and iWork.
For the AI Comprehension feature, users can select a range they are interested in, and PowerSheet will inspect the data within that range and provide an explanation of the relevant details. Users can provide additional instructions to guide the model in generating explanations that better align with their intent. Additionally, users can toggle the "Formula Explain" mode. In this mode, PowerSheet focuses more on the spreadsheet formulas rather than the evaluated results, helping users understand how the current spreadsheet is designed and works, especially for those unfamiliar with spreadsheet formulas.
For the AI Checking feature, PowerSheet examines all formulas in the spreadsheet. With compatibility information preset into the large language model's input, PowerSheet can inform users of potentially incompatible formulas and suggest modifications to compatible versions.
In the demo video, we show scenarios where a user uses PowerSheet to obtain interpretations of an existing spreadsheet or a specific formula within the spreadsheet.
The above figure shows the task pane of the AI Comprehension and AI Checking commands.
4. PowerSheet Design
4.1. System Structure
The architecture of PowerSheet is divided into the Spreadsheet Plugin Frontend, the Large Language Model Proxy and the Model Runtime Backend (AMD Ryzen AI Runtime).
The interaction between the three components is shown in the figure.
Spreadsheet Plugin Frontend: The frontend is the topmost layer of PowerSheet, which closely integrates with Microsoft Excel to provide a GUI for users. The ReactJS plugin in the frontend interacts with the user and communicates with the second layer to receive cell values or make modifications to the spreadsheet data.
Large Language Model Proxy: The LLM Proxy acts as a bridge between the user interface and the underlying model, containing both an analyzer and proxy components. The analyzer processes the raw user requests, converting between structured data (cell values and formulas) and unstructured data (natural language). The proxy and the context manager maintain the state of PowerSheet, construct queries to the LLM, and handle reply processing.
Model Runtime Backend: The backend utilizes the AMD Ryzen AI capabilities to serve the large language model inference process. Based on PyTorch's dynamic graph mechanism, the Model Runtime Backend categorizes operators into two types, those supported only by the CPU and those supported by both the NPU and CPU, and assigns each to the optimal device.
4.2. Plugin Frontend Design
The frontend of the plugin is based on ReactJS and Fluent UI. The frontend registers a command tab “PowerSheet” in the spreadsheet application. In the tab, the four main features are listed in separate groups, called “AI Generation”, “AI Comprehension”, “AI Checking” and “AI Batch Processing” respectively. Clicking one of the commands opens a task pane, which guides the user through further interaction with PowerSheet.
The task pane is implemented using ReactJS. To enable PowerSheet's functionality, a cell selector allows users to easily specify the range of cells PowerSheet should operate on. Additionally, the frontend uses a connector with the spreadsheet application to get and set cell data. For Microsoft Excel, we use the Office JS connector. We anticipate that similar connectors are available for other platforms, making PowerSheet easily portable.
Once the frontend has collected enough information relevant to the current task, it serializes the data and sends it to the LLM Proxy. After the Ryzen AI processes the data, the LLM Proxy returns the results to the frontend, which then parses and displays the information to the user, making corresponding modifications to the specified spreadsheet data.
The Large Language Model Proxy (LLM Proxy) serves as an intermediary between the frontend and the NPU backend, providing data structuring and unstructuring functions. This component connects the world of structured and unstructured data, enabling the LLM to access spreadsheet data, formulas, and user-selected ranges. At the same time, spreadsheet applications can utilize the LLM's output for filling data, highlighting ranges, and executing JavaScript code.
The LLM Proxy interacts with the frontend via WebSocket. Upon receiving a user request, it fills predefined templates with values from the request body and caches the request in memory. Once a query is successfully constructed, it sends the request to the Ryzen AI backend and waits for a reply.
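As a rough illustration (not the actual proxy code), the snippet below shows how a Content Fill query might be assembled from a predefined template and cached for later matching. The template wording mirrors the conversation example shown later in the KV-Cache Management section; the request field names (source_range, target_range, instruction) are assumptions.

import json
import uuid

# Hypothetical template for the Content Fill mode; the real templates live in the LLM Proxy.
CONTENT_FILL_TEMPLATE = (
    "I have an Excel sheet, and a section from {source_range}. "
    "Now I want you to fill {target_range} with data or formula. "
    'I want to fill in the way that "{instruction}".'
)

pending_requests: dict[str, dict] = {}  # request id -> original frontend request

def build_query(request_json: str) -> tuple[str, str]:
    """Fill the predefined template with values from the frontend request and cache the
    request in memory so the backend's reply can later be matched back to it."""
    request = json.loads(request_json)
    request_id = str(uuid.uuid4())
    pending_requests[request_id] = request
    query = CONTENT_FILL_TEMPLATE.format(
        source_range=request["source_range"],  # e.g. "A4 to A9"
        target_range=request["target_range"],  # e.g. "A10"
        instruction=request["instruction"],    # e.g. "sum them up"
    )
    return request_id, query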
When the LLM Proxy receives a reply from the Ryzen AI backend, it first matches the reply with the previously cached request. Depending on the PowerSheet function selected by the user, it may process the input differently. For example, if the user selects the Content Fill function, it matches the <CELL></CELL> content from the reply and maps it to the corresponding cells in the original worksheet. Similarly, if the user selects compatibility checking, it matches <WARN></WARN> and <PASS></PASS>, displaying different categories of information accordingly. Each function's query generation corresponds to different templates, and the parsing process for model replies also varies. Detailed explanations of how the LLM Proxy handles structured and unstructured information from the frontend and the LLM are provided in the “Interaction Design” and “Parsing Model Replies” sections.
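A minimal sketch of the <CELL></CELL> parsing step might look as follows; it assumes the target cell addresses are already known in column-major order, and the helper name is ours, not PowerSheet's. The same pattern applies to <WARN></WARN> and <PASS></PASS> for compatibility checking.

import re

CELL_RE = re.compile(r"<CELL>(.*?)</CELL>", re.DOTALL)

def parse_cell_reply(reply: str, target_cells: list[str]) -> dict[str, str]:
    """Map each <CELL>...</CELL> block in the model reply to a target cell address.

    target_cells is the list of addresses (column-major order) the user selected,
    e.g. ["A10", "B10"]. A mismatch in count is treated as a parsing error."""
    values = [match.strip() for match in CELL_RE.findall(reply)]
    if len(values) != len(target_cells):
        raise ValueError(f"expected {len(target_cells)} cells, got {len(values)}")
    return dict(zip(target_cells, values))

# Example: a reply filling a single cell
print(parse_cell_reply("<CELL> =SUM(A4:A9) </CELL>", ["A10"]))  # {'A10': '=SUM(A4:A9)'}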
The Model Runtime Backend is built on PyTorch's dynamic graph. It replaces operators supported by the NPU (such as nn.Linear) with operators provided by RyzenAI, offloading computation to the NPU. Operators that are not supported continue to use those provided by PyTorch (such as Embedding and RMSNorm), and are computed on the CPU. Specifically, for LlamaAttention, we have improved upon RyzenAI's LlamaFlashAttention to support the Group Query Attention mechanism of Llama3.
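As a rough sketch of this operator split (not the actual backend code), the helper below walks the module tree and swaps every nn.Linear for an NPU-backed replacement while leaving everything else on the CPU. Here make_npu_linear stands in for the RyzenAI-provided wrapper, whose exact constructor we do not reproduce.

import torch.nn as nn

def offload_linear_layers(model: nn.Module, make_npu_linear) -> nn.Module:
    """Replace every nn.Linear with an NPU-backed module; Embedding, RMSNorm, etc.
    are left untouched and keep running on the CPU via stock PyTorch."""
    for _, module in list(model.named_modules()):
        for child_name, child in list(module.named_children()):
            if isinstance(child, nn.Linear):
                setattr(module, child_name, make_npu_linear(child))
    return model

# Usage sketch: model = offload_linear_layers(model, make_npu_linear=my_ryzenai_wrapper)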
During quantization, supported nn.Module instances are replaced with their RyzenAI counterparts and then saved using torch.save. At runtime, since ryzenAI.QLinearPerGrp is a C++ object and does not support PyTorch's serialization, it is dynamically created. This ensures that the specialized components are correctly instantiated and operational when the model is deployed. This approach maintains the integrity and performance of the model by leveraging RyzenAI's optimizations while addressing the serialization limitations of certain components.
4.5. Interaction Design
PowerSheet directly parses the output of the LLM without imposing formal constraints during the model's inference process. This design allows PowerSheet to more easily adapt to different inference backends (such as PyTorch and ONNX Runtime). Additionally, as the performance of AMD Ryzen AI NPU models improves, we anticipate running models with larger parameter counts on the NPU of AMD AI PCs. These models will be better able to control the format of their output through natural language instructions, further eliminating the need for additional constraints during output processing.
To better transform the LLM's output into a format recognizable by the frontend, the LLM Proxy predefines some instructions when inputting queries to the LLM to control the model's output. The following figure shows the predefined query templates for the Content Fill mode and the Formula By Example mode.
4.6. Glitches Auto Correction
Due to the size limitations of the Llama 2 7B model, there may be instances where the model output obtained by PowerSheet is not compliant. Such situations typically include: (1) the model's output formulas do not include a leading "=" symbol, causing spreadsheet applications to treat them as plain text; (2) the model's output does not satisfy the format required for parsing, such as the use of <CELL></CELL>.
PowerSheet will attempt to correct such issues to ensure that expected results can still be obtained in a single interaction even with smaller parameter models. For the Formula By Example mode, PowerSheet expects the model's output to always be a formula, so it checks all cell outputs and adds the "=" symbol to cells with incorrect formats. For the Content Fill mode, PowerSheet uses a simple keyword matching scheme to check whether the LLM-filled cells resemble a formula. If a cell's content looks like a formula without the leading "=" sign, PowerSheet invokes a postprocessing method to correct the output. If PowerSheet does not find any predefined rules to fix the output, the frontend will directly display an error message to the user, suggesting they narrow the cell selection range or provide clearer instructions to improve the success rate of parsing the LLM output.
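A simplified version of this postprocessing might look like the following; the keyword list and function names are illustrative, not PowerSheet's actual rules.

import re

# Heuristic: content that starts with a known function name followed by "(" looks like a formula.
FORMULA_LIKE = re.compile(r"^\s*(SUM|AVERAGE|IF|VLOOKUP|COUNT|MIN|MAX)\s*\(", re.IGNORECASE)

def fix_missing_equals(cell_content: str, expect_formula: bool) -> str:
    """Add the leading '=' that small models sometimes drop.

    In Formula By Example mode (expect_formula=True) every output should be a formula;
    in Content Fill mode '=' is only prepended when the content looks like a formula."""
    content = cell_content.strip()
    if content.startswith("="):
        return content
    if expect_formula or FORMULA_LIKE.match(content):
        return "=" + content
    return content

print(fix_missing_equals("AVERAGE(A4:A9)", expect_formula=True))   # =AVERAGE(A4:A9)
print(fix_missing_equals("SUM(A4:A9)", expect_formula=False))      # =SUM(A4:A9)
print(fix_missing_equals("hello", expect_formula=False))           # hello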
5. LLM Inference Optimization on Ryzen AI
5.1. KV-Cache Management
In LLM inference, the KV-Cache is an important optimization technique. For autoregressive models, each iteration generates a new token based on the input tokens. In the attention calculation, the input is transformed into three matrix representations: Query (Q), Key (K), and Value (V), from which the attention scores are then calculated.
We typically use a causal mask to ensure only the tokens preceding the current token are visible when calculating the attention scores. As shown in the figure below, some intermediate results (i.e., the KV-Cache) can be reused in the next iteration, thereby reducing the amount of computation.
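The sketch below shows the idea in a toy, single-head form: during decoding, only the new token's Q/K/V are computed, and the cached K/V from earlier steps are concatenated back in, so earlier projections are never recomputed.

import torch

def attend(q, k, v):
    """Scaled dot-product attention for one new query token; no causal mask is needed
    here because the cache only ever contains tokens that precede it."""
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ v

def decode_step(x_new, w_q, w_k, w_v, kv_cache=None):
    """One autoregressive step: project only the new token, reuse cached K and V."""
    q, k, v = x_new @ w_q, x_new @ w_k, x_new @ w_v
    if kv_cache is not None:
        k = torch.cat([kv_cache[0], k], dim=-2)   # past keys + new key
        v = torch.cat([kv_cache[1], v], dim=-2)   # past values + new value
    return attend(q, k, v), (k, v)                # (k, v) is the updated KV-Cache

# Toy usage: hidden size 16, three decoding steps sharing one growing cache
w_q = w_k = w_v = torch.eye(16)
cache = None
for _ in range(3):
    x_new = torch.randn(1, 1, 16)
    out, cache = decode_step(x_new, w_q, w_k, w_v, cache)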
In the PowerSheet system, a complete conversation might be as follows:
system: You will be working with Excel Sheets. You should output the content of each cell, in column-major order, one line for a single cell. WRAP THE CELL CONTENT in <CELL></CELL> and only wrap them ONCE. You are expected to output at least 1 lines. You are encouraged to use formula if it is appliable. You should only generate one possible solution, and only output ONCE for each cell in <CELL></CELL>. Don't output the additional evaluated result of formulas.
user: I have an Excel sheet, and a section from A4 to A9. Now I want you to fill A10 with data or formula. I want to fill in the way that "sum them up".
assistant: <CELL> =SUM(A4:A9) </CELL>
user: I want average.
assistant: <CELL> = AVERAGE (A4:A9) </CELL>
The “system” prompt is added by PowerSheet and remains constant across different conversations. Therefore, PowerSheet precomputes this part of the KV-Cache to speed up multiple conversations. For multi-turn conversations, PowerSheet adopts the common practice of LLM serving systems by caching the already generated KV-Cache.
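A hedged sketch of this idea with the Hugging Face transformers API is shown below; the function names are ours, chat-template formatting is omitted for brevity, and the patched transformers described in the setup section is assumed so that the cache can be exported and reused.

import torch

@torch.no_grad()
def precompute_prefix_cache(model, tokenizer, system_prompt: str):
    """Encode the constant system prompt once; its KV-Cache is reused for every conversation."""
    ids = tokenizer(system_prompt, return_tensors="pt").input_ids
    return model(ids, use_cache=True).past_key_values

@torch.no_grad()
def decode_on_cache(model, tokenizer, past, user_turn: str, max_new_tokens: int = 64):
    """Greedy decoding on top of a precomputed cache; returns the reply and the grown cache
    so multi-turn conversations can keep reusing it."""
    ids = tokenizer(user_turn, return_tensors="pt", add_special_tokens=False).input_ids
    generated = []
    for _ in range(max_new_tokens):
        out = model(ids, past_key_values=past, use_cache=True)
        past = out.past_key_values                           # cache now also covers `ids`
        next_id = out.logits[:, -1, :].argmax(-1, keepdim=True)
        if next_id.item() == tokenizer.eos_token_id:
            break
        generated.append(next_id.item())
        ids = next_id                                        # only the new token is fed next step
    return tokenizer.decode(generated), past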
5.2. Zero Irrelevant Search
PowerSheet uses constrained search and early stop techniques, ensuring that the LLM only outputs the structured text required by PowerSheet. We observe that in many cases only structured output such as <CELL></CELL> or <CODE></CODE> is needed. Systems like GENRE and LMQL ensure that the LLM produces results that meet the expected constraints through constrained search. In PowerSheet, our constraints are relatively simple; we require the LLM's output to start with a specific prefix (e.g., <CELL>, <CODE>, etc.). Therefore, PowerSheet loads the prefix into the prompt.
As shown in the figure below, PowerSheet uses the Constrained Search with Prefix method, adding the <CELL> token after the <Assistant> token to prevent the LLM from generating irrelevant content, thereby improving the accuracy and efficiency of the generated content.
The Constrained Search with Prefix technique prevents the LLM from generating irrelevant content before the <CELL> token, but it may still generate irrelevant content after the </CELL> token. Generally, PowerSheet can anticipate the expected length of the LLM output (i.e., how many CELLs of information it will produce). Therefore, PowerSheet checks whether all valid information has been output after each token is generated, rather than waiting until the generated token length reaches max_tokens or an <EOS> token is generated.
The figure below illustrates the principle of the Early Stop technique. In this example, PowerSheet expects the LLM to output the content of one <CELL>. Therefore, when a complete pair of <CELL></CELL> is detected, PowerSheet stops generating new tokens. This approach allows for faster retrieval of the LLM-generated results, reducing the user's waiting time.
By combining the Constrained Search with Prefix and Early Stop techniques, PowerSheet almost never generates any irrelevant content for some functions, significantly improving the system's response speed.
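Put together, the two techniques can be sketched with the standard transformers generation API as follows; the StoppingCriteria subclass, the function names and the exact tag handling are our illustration of the idea, not PowerSheet's code.

import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class CellCountStop(StoppingCriteria):
    """Early Stop: halt as soon as the expected number of </CELL> tags has been generated,
    instead of waiting for max_tokens or an <EOS> token."""
    def __init__(self, tokenizer, prompt_len: int, expected_cells: int):
        self.tokenizer = tokenizer
        self.prompt_len = prompt_len
        self.expected_cells = expected_cells

    def __call__(self, input_ids, scores, **kwargs) -> bool:
        new_text = self.tokenizer.decode(input_ids[0, self.prompt_len:])
        return new_text.count("</CELL>") >= self.expected_cells

@torch.no_grad()
def generate_cells(model, tokenizer, chat_prompt: str, expected_cells: int = 1) -> str:
    # Constrained Search with Prefix: seed the assistant turn with "<CELL>" so the model
    # cannot produce anything irrelevant before the structured block.
    inputs = tokenizer(chat_prompt + "<CELL>", return_tensors="pt")
    stop = StoppingCriteriaList([CellCountStop(tokenizer, inputs.input_ids.shape[1], expected_cells)])
    out = model.generate(**inputs, max_new_tokens=256, stopping_criteria=stop)
    # Re-attach the seeded prefix so downstream parsing sees complete <CELL>...</CELL> pairs.
    return "<CELL>" + tokenizer.decode(out[0, inputs.input_ids.shape[1]:], skip_special_tokens=True)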
5.3. Llama 3 Group Query Attention Support for Ryzen AI
Llama3 adopted grouped query attention (GQA) [1] to enhance inference performance. Despite Llama3-8B having 1 billion more parameters than Llama-7B, their inference performance is similar. However, even the latest version of `modeling_llama.py` provided by RyzenAI-SW, which is based on Transformers 4.37.2, does not support Llama3's GQA (Transformers 4.41.0 officially supports Llama3). Upon analyzing the changes from Llama2 to Llama3, we found that the main compatibility issue lies in the older version of `modeling_llama.py`, where the attention implementation only supports multi-head attention. As shown in the figure below, multi-head attention assumes that the number of Key-Value heads is equal to the number of Query heads, but this assumption does not hold in GQA. Therefore, we modified the implementation of `llama_flash_attention.py` to handle this situation.
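The core of the fix is to expand the Key/Value heads so that each group of Query heads can share them, mirroring the repeat_kv helper found in newer versions of modeling_llama.py; a standalone sketch:

import torch

def repeat_kv(hidden: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Expand K/V heads for GQA: (batch, num_kv_heads, seq, head_dim) ->
    (batch, num_kv_heads * n_rep, seq, head_dim), so an existing multi-head
    attention kernel can be reused unchanged."""
    if n_rep == 1:
        return hidden
    b, kv_heads, seq, dim = hidden.shape
    hidden = hidden[:, :, None, :, :].expand(b, kv_heads, n_rep, seq, dim)
    return hidden.reshape(b, kv_heads * n_rep, seq, dim)

# Llama3-8B: 32 query heads share 8 key/value heads, so n_rep = 32 // 8 = 4
k = torch.randn(1, 8, 10, 128)
print(repeat_kv(k, 32 // 8).shape)   # torch.Size([1, 32, 10, 128])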
Activation-aware Weight Quantization (AWQ) is a popular LLM quantization scheme. AWQ is based on the observation that "some weights in a large model are more important than others." The idea is to determine the importance of weights based on activations' magnitude and protect 0.1-1% of the weights from quantization. Additionally, the scaling ratio is chosen based on the magnitude of the activation values. AWQ quantization ensures accuracy while reducing storage and computational requirements.
We use the precomputed scales from awq-model-zoo to perform int4 weight quantization on the llama3-instruct-8b model, with activations in bfloat16 and a group size of 128.
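For intuition, the following toy sketch shows what per-group int4 weight quantization with group size 128 looks like. Real AWQ first rescales the salient channels using the activation-derived scales from the awq-model-zoo file, and the RyzenAI kernels consume a packed int4 format rather than this naive layout; the function name is ours.

import torch

def quantize_w4_per_group(w: torch.Tensor, group_size: int = 128):
    """Toy asymmetric int4 per-group weight quantization (group size 128, as used above)."""
    out_features, in_features = w.shape
    wg = w.reshape(out_features, in_features // group_size, group_size)
    w_min = wg.amin(dim=-1, keepdim=True)
    w_max = wg.amax(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-5) / 15.0            # 4 bits -> 16 levels
    zero = (-w_min / scale).round()
    q = (wg / scale + zero).round().clamp(0, 15)              # values stored as int4
    deq = (q - zero) * scale                                  # dequantized for compute
    return q.to(torch.uint8), scale, zero, deq.reshape(out_features, in_features)

w = torch.randn(128, 256)
q, scale, zero, w_deq = quantize_w4_per_group(w)
print((w - w_deq).abs().max())   # quantization error stays small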
5.5. Evaluation
We evaluated the effects of various optimizations using the same prompt on different hardware configurations. Tests were conducted on an AI PC with a Ryzen 7 7840HS and on servers with V100 16G and 32G GPUs. On the AI PC, inference was performed using AWQ W4ABF16, while on the V100 servers, inference used float16.
We found that the performance of the 7840HS was comparable to that of the V100 16G. This is because Llama3-8B in float16 cannot be fully loaded into the 16 GB of GPU memory.
Additionally, by applying various optimizations to reduce the token length required for generation, we achieved a generation speed of 2 tokens per second. This speed is considered acceptable for users in many scenarios (as demonstrated in our demo video).
1. Microsoft Office >= 2021
To use PowerSheet, ensure that you have Microsoft Office 2021 or later, as this is the minimum version capable of running ReactJS in the task pane.
To get started, you need the latest stable version of Node.js and the Yeoman generator, which is required to create an Office plugin project. Detailed instructions are available in the "Set up your development environment" section.
2. RyzenAI NPU Driver
To utilize the NPU functionality, you need to download the RyzenAI NPU Driver.
- Version 1.1: This older version was released about six months before this writing (as of August 1, 2024).
- Version 1.2: This newer version was released two days before this writing.
Both versions of the NPU Driver have been thoroughly tested. We recommend using the RyzenAI NPU Driver 1.2. In our tests, it showed a 20% performance improvement compared to version 1.1.
As an example with version 1.2, after installing Visual Studio 2022's Desktop development with C++ workload according to the official tutorial, download the NPU Driver installation package and extract it. Then, start CMD as an administrator (Win + X, select Terminal (Admin), then type cmd), navigate to the extracted path using `cd`, and execute the installer by typing `npu_sw_installer.exe`. If everything is working correctly, the output will be as shown in the figure below.
3. (Optional) Remove Windows Path Length Limit
If you encounter path length limit issues in subsequent steps, you can remove the limit by opening PowerShell as an administrator (Win + X, then select Terminal (admin)). Then, enter the following command:
Set-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem' -Name LongPathsEnabled -Type DWord -Value 1
For reference, see the Stack Overflow discussion: Why does the 260 character path length limit exist in Windows?
6.2. Obtain PowerSheet Code
You can create your plugin by following the instructions in the "Build an Excel task pane add-in" guide. For convenience, you can directly download the code for the PowerSheet Plugin, which serves as the frontend of the entire project.
To obtain the PowerSheet code, simply use the git clone command.
git clone https://github.com/PowerSheet-Team/PowerSheet
6.3. Install Dependencies
1. Build and install RyzenAI OPs.
To utilize RyzenAI, you need to install the RyzenAI-SW environment. Although RyzenAI has released version 1.2 of RyzenAI-SW, our tests showed almost no improvement in the performance of Llama3-8B, so we are still using version 1.1 of RyzenAI-SW. (Yes, RyzenAI-SW 1.1 can run under RyzenAI NPU Driver 1.2. In our tests, RyzenAI-SW's support for Llama3 had some issues.)
# execute these commands in cmd, not PowerShell
# we don't need Git LFS
set GIT_LFS_SKIP_SMUDGE=1
git clone https://github.com/amd/RyzenAI-SW -b 1.1
cd RyzenAI-SW/example/transformers
conda env create --file=env.yaml
.\setup.bat
pip install ops\cpp --force-reinstall
2. Install patched transformers.
RyzenAI-SW 1.1 is tightly integrated with transformers 4.34.0. However, this version lacks the functionality to export the KV Cache generated during the generation process. To address this issue, we need to manually cherry-pick the relevant commits and install a patched version of transformers.
# execute these commands in cmd, not PowerShell
conda activate ryzenai-transformers
wget https://patch-diff.githubusercontent.com/raw/huggingface/transformers/pull/25086.diff -O 25086.diff
git clone https://github.com/huggingface/transformers.git
cd transformers
git checkout v4.34.0
git apply ../25086.diff
pip install --no-dependencies .
# resolves: cannot import name 'split_torch_state_dict_into_shards' from huggingface_hub
pip install huggingface_hub==0.24.0
Or simply install our fork:
conda activate ryzenai-transformers
git clone https://github.com/PowerSheet-Team/transformers
cd transformers
pip install --no-dependencies .
# resolves: cannot import name 'split_torch_state_dict_into_shards' from huggingface_hub
pip install huggingface_hub==0.24.0
6.4. Model Quantization
1. Activate the conda environment and set the environment variables.
# execute these commands in cmd, not PowerShell
conda activate ryzenai-transformers
"RyzenAI-SW/example/transformers/setup.bat"
2. Download Llama into the models folder and perform the quantization process.
The Llama3 model requires permission from Meta, so you need a Hugging Face account to apply for access and obtain your HF token. We use Activation-aware Weight Quantization (AWQ) to quantize the model. To accelerate the process, we download the precomputed scales.
cd PowerSheet/models
huggingface-cli download --token hf_*** --resume-download meta-llama/Meta-Llama-3-8B-Instruct --local-dir ./Meta-Llama-3-8B-Instruct
wget https://huggingface.co/datasets/mit-han-lab/awq-model-zoo/resolve/main/llama3-instruct-8b-w4-g128.pt -O ./llama3-instruct-8b-w4-g128.pt
3. Perform the quantization operations. To ensure a smooth completion of the process, please make sure that your RyzenAI PC has at least 24GB of RAM.
cd PowerSheet/platform
python quantize.py
4. After the quantization is completed, we can conduct multi-turn chat tests on our platform.
cd PowerSheet/platform
python test_llm_npu.py
If the test succeeds, you will see the output “Respond: <CELL> AVERAGE(A4:A9)</CELL>”.
6.5. Build and Run Plugin Frontend
1. Navigate to the frontend directory and use the npm command to launch the frontend.
cd PowerSheet/frontend
npm start
2. The office-addin-debugging package will ask you whether to allow localhost loopback for Microsoft Edge WebView. Since we do not need this feature, type “n” and press Enter.
3. An Excel window with a greeting message will appear. Click the “PowerSheet” tab and ensure that all features are shown in the command list.
4. When clicking on a command for the first time, you may receive a “WebView Stop On Load” pop-up. Click “Cancel”, since we do not need to debug the plugin this time.
Follow the above instructions to activate the ryzenai-transformers environment and run setup.bat, then:
cd PowerSheet/backend
python main.py
You will see the loading process of Ryzen AI. When the loading is complete, PowerSheet is ready for you. 🌈
7. Differences from Existing Solutions
The spreadsheet automation provided by PowerSheet integrates JavaScript code generation with the power of LLMs, which differs from existing methods based on Version Space Algebra (VSA), such as FlashFill, and from direct natural-language-to-program mapping using LLMs.
1. Differences from FlashFill: FlashFill is an automated solution for generating Excel formulas proposed by Microsoft Research. It requires users to specify a segment of input-output examples in the table, guesses user intent through complex strategies, and then generates formulas accordingly. Although this method runs fast, it is only suitable for simple data processing (such as inferring appropriate mathematical expressions and string operations). In contrast, PowerSheet not only allows users to specify input-output examples but also enables users to describe their intent in simple natural language, guiding the synthesizer to generate more accurate results. Besides, PowerSheet supports iterative refinement based on user feedback, allowing users to preview results in real time and instruct the AI to modify equations or scripts until the results match their expectations.
2. Differences from existing LLM-based methods: With online LLM services like GPT Excel or Office 365 Copilot, users can convert their data-processing intents into equations and scripts using natural language. However, these services have limitations:
a) Equations and scripts are often tied to the structure of the spreadsheet (for example, the data to be processed may be located in cells A1 and A2), and including this metadata in the natural language description is cumbersome;
b) Describing intent in natural language is often more complex than providing a few input-output examples, and online LLM services usually lack the ability to read examples unless users design prompts and manually input data;
c) Descriptions of user intent may be inaccurate, and existing online LLM services are unable to correct results based on user feedback or newly provided examples, requiring users to modify their initial prompt and try repeatedly;
d) Users might not be clear whether their intent should be addressed using simple equations or scripts, and existing solutions may fail to create equations if the user's requirement is hard to satisfy with equations alone;
e) Online LLM services may invoke ChatGPT or Bing AI backends, requiring additional fees from users and exposing them to the risk of data leaks;
f) Existing solutions usually have poor integration with spreadsheet applications or are developed for specific software only, lacking a universal interface. This limits users' ability to utilize AI assistance in their familiar spreadsheet applications.
Leveraging the capabilities of LLMs and formal methods, PowerSheet can assist users in creating complex spreadsheet automation in a short time, enhancing work efficiency and reducing the likelihood of programming errors. For ordinary users, it becomes far easier to create non-trivial equations to simplify their workflows. For advanced users, complex equations and scripts can be generated, evaluated and fine-tuned according to their needs, saving them from frustrating programming work.
References
[1] J. Ainslie, J. Lee-Thorp, M. de Jong, Y. Zemlyanskiy, F. Lebrón, and S. Sanghai, “GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints,” Dec. 23, 2023, arXiv: arXiv:2305.13245. Accessed: Aug. 01, 2024. [Online]. Available: http://arxiv.org/abs/2305.13245
[2] L. Beurer-Kellner, M. Fischer, and M. Vechev, “Large Language Models are Zero-Shot Multi-Tool Users”.
[3] T. Dao, “FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning,” Jul. 17, 2023, arXiv: arXiv:2307.08691. Accessed: Aug. 01, 2024. [Online]. Available: http://arxiv.org/abs/2307.08691
[4] N. De Cao, G. Izacard, S. Riedel, and F. Petroni, “Autoregressive Entity Retrieval,” Mar. 24, 2021, arXiv: arXiv:2010.00904. Accessed: Jul. 31, 2024. [Online]. Available: http://arxiv.org/abs/2010.00904
* The icons of PowerSheet come from Yannick.