Samuel Alexander
Published © GPL3+

HacksterMate: Offline AI Assistant for Makers

HacksterMate is an offline AI assistant utilizing the Llama 2 model on an AMD IPU to aid makers with building projects.

Intermediate · Full instructions provided · 10 hours · 230 views

Things used in this project

Hardware components

Minisforum Venus UM790 Pro with AMD Ryzen™ 9
×1

Software apps and online services

Microsoft Visual Studio 2019
Python 3.9
PyTorch
Anaconda/Miniconda
AMD Ryzen AI
Hugging Face
Llama 2 7B

Story


Code

run_awq.py

Python
#
# Copyright © 2023 Advanced Micro Devices, Inc. All rights reserved.
#

import torch
import logging
import time
import argparse
import os
import psutil
from transformers import set_seed
from transformers import LlamaTokenizer

import qlinear
from utils import Utils
from model_utils import (
    warmup, 
    decode_prompt,
    decode_prompts,
    get_wikitext2,
    perplexity,
)
from profiler import ProfileAIE
import gc

from modeling_llama_amd import LlamaForCausalLM, LlamaAttention

from pre_quant import run_awq, apply_awq
from quantizer import real_quantize_model_weight
from qmodule import WQLinear

set_seed(123)


def load_model(args):
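    """Load the Llama 2 7B chat tokenizer and model.

    With --awq none the bf16 Hugging Face checkpoint is used as-is; otherwise
    the AWQ-quantized checkpoint is either created (--task quantize) or loaded
    from disk for the decode/benchmark/perplexity tasks.
    """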
    tokenizer = LlamaTokenizer.from_pretrained("./llama-2-wts-hf/7B_chat")
    if args.awq == "none":
        model = LlamaForCausalLM.from_pretrained("./llama-2-wts-hf/7B_chat", torch_dtype=torch.bfloat16) 
    
    else:
        ckpt = "pytorch_llama27b_w_bit_{}_awq{}_{}amd.pt".format(args.w_bit, "_fa" if args.flash_attention else "", "lm_" if args.lm_head else "")
        if args.task == "quantize":
            model = LlamaForCausalLM.from_pretrained("./llama-2-wts-hf/7B_chat", torch_dtype=torch.bfloat16)
            print(model)
            
            Utils.print_model_size(model)

            q_config = {
                "zero_point": True,    # asymmetric quantization with zero points
                "q_group_size": 128,   # group size for per-group weight quantization
            }

            if args.awq == 'load':
                print("Loading pre-computed AWQ results from", os.getenv("AWQ_CACHE"))
                awq_results = torch.load(os.getenv("AWQ_CACHE")  + "/llama-2-7b-chat-w%d-g128.pt"%args.w_bit, map_location="cpu")
                apply_awq(model, awq_results)
                print("Quantization config:", q_config)
                real_quantize_model_weight(
                            model, w_bit=args.w_bit, q_config=q_config
                        )

                Utils.print_model_size(model)

                #for n, m in model.named_modules():
                #    if isinstance(m, WQLinear):
                #        print(f"AWQ Model load : {n} : {m.qweight.data.min()}  {m.qweight.data.max()}  {m.qweight.data.shape} {m.scales.shape} qzeros: {m.qzeros.shape} {m.qzeros.min()} {m.qzeros.max()}")

            elif args.awq == 'run':
                awq_results = run_awq(
                        model, tokenizer,
                        w_bit=args.w_bit, q_config=q_config,
                        n_samples=128, seqlen=512,
                    )
                torch.save(awq_results, "./llama-2-7b-chat-w%d-g128-generated.pt"%args.w_bit)
                print(model)
                print("Saved AWQ results in ./llama-2-7b-chat-w%d-g128-generated.pt"%args.w_bit)
                raise SystemExit
            
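            # Optionally swap the stock LlamaAttention modules for the flash
            # attention implementation before replacing the quantized linears.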
            if args.flash_attention:
                from llama_flash_attention import LlamaFlashAttention
                node_args = ()
                node_kwargs = {
                    'config': model.config,
                    'llama_name': "llama-2-wts-hf/7B_chat",
                    'flash_config_path': "../../ops/python/llama_flash_attention_config.json",
                    'device': "cpu", # args.target
                    'max_new_tokens': 11,
                    'quant_mode': "awq"
                }
                Utils.replace_node( model,
                                    LlamaAttention,
                                    LlamaFlashAttention,
                                    node_args, node_kwargs)
            
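            # Replace AWQ's WQLinear layers with QLinearPerGrp so the grouped
            # int weights can later be offloaded to the AIE (IPU) at runtime.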
            Utils.replace_node( model, 
                                WQLinear, 
                                qlinear.QLinearPerGrp, 
                                (), {'device':'cpu', 'w_bit':args.w_bit, 'group_size':128} )
            print(model)
            gc.collect()

            Utils.print_model_size(model)
            if args.lm_head: # Quantize lm_head
                Utils.replace_node( model, 
                                    torch.nn.Linear, 
                                    qlinear.QLinearPerGrp, 
                                    (), {'device':'cpu', 'w_bit':args.w_bit, 'group_size':32} )
                print(model)
                gc.collect()

            torch.save(model, ckpt)
            print(f"Quantized and saved model: {ckpt}")
            raise SystemExit
        else:
            print(f"Loading from ckpt: {ckpt}")
            if not os.path.exists(ckpt):
                print(f"\n\n ***** Run --task quantize (with/without lm_head) first to save quantized model ...!!! \n\n")
                raise SystemExit 
            model = torch.load(ckpt)

    Utils.print_model_size(model)
    _ = gc.collect()
    model.eval()
    model = model.to(torch.bfloat16)
    print(model)
    return model, tokenizer 


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--dataset', help="Dataset - wikitext2-raw-v1, wikitext2-v1", type=str, default="raw", choices=["non-raw", "raw"])
    parser.add_argument('--w_bit', help="weight bit size", type=int, default=3, choices=[3, 4])
    parser.add_argument('--awq', help="load awq scales, clips from pt or run awq", type=str, default="load", choices=["load", "run", "none"]) 
    parser.add_argument("--target", help="cpu, aie, aie_emu", type=str, default="cpu", choices=["cpu", "aie_emu", "aie"])
    parser.add_argument('--task', help="quantize: Apply AWQ and save ckpt; perplexity: Measure perplexity on wikitext2 dataset; benchmark: Benchmark latency w.r.t prompt length; benchmark_long: Benchmark long sequences (compare with flash attn); decode: Decode set of prompts;", type=str, default="decode", choices=["quantize", "decode", "benchmark", "benchmark_long", "perplexity"] )
    parser.add_argument('--flash_attention', help="Enable flash attention", action='store_true')
    parser.add_argument('--lm_head', help="Enable PerGrp quantization of lm_head layer", action='store_true')
    parser.add_argument('--num_torch_threads', help="Number of torch threads", type=int, default=8, choices=[1, 2, 3, 4, 5, 6, 7, 8])
    args = parser.parse_args()
    print(f"{args}")
    dev = os.getenv("DEVICE")
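    # Pin the process to the first four CPU cores when DEVICE is set to "stx".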

    if dev == "stx":
        p = psutil.Process()
        p.cpu_affinity([0, 1, 2, 3])
    torch.set_num_threads(args.num_torch_threads)
    
    log_dir = "./logs_awq_7B_chat"
    if not os.path.exists(log_dir):
        os.makedirs(log_dir)
    log_file = log_dir + "/log_awq_7B_chat.log"

    logging.basicConfig(filename=log_file,
                        filemode='w',
                        format='%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s',
                        datefmt='%H:%M:%S',
                        level=logging.CRITICAL)

    model, tokenizer = load_model(args)

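    # Move every quantized linear layer onto the AIE (IPU) and pack its weights.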
    if args.awq != "none":
        for n, m in model.named_modules():
            if isinstance(m, qlinear.QLinearPerGrp):
                print(f"Preparing weights of layer : {n}")
                m.device = "aie"
                m.quantize_weights()

    print(model)
    Utils.print_model_size(model)
    
    warmup(model, tokenizer)

    if (args.task == "decode"):
        decode_prompts(model, tokenizer, max_new_tokens=3000)
        logging.shutdown()
        out_file = log_file.replace(".log", "_profile.csv")
        out_file = open(out_file, "w")
        ProfileAIE.analyze_profiling(False, True, log_file, out_file)
        out_file.close()

    elif (args.task == "benchmark") or (args.task == "benchmark_long"):
        #print(model.config.max_position_embeddings) # 2048
        trainloader, testenc = get_wikitext2(tokenizer, nsamples=2, seqlen=4096)
        if (args.task == "benchmark"):
            seqlens =  [4, 8, 16, 32, 64, 128, 256]
        else:
            seqlens =  [512, 1024, 1536, 2048, 3000, 4096] 
        input_ids = next(iter(trainloader))[0][:, :4096]
        for seqlen in seqlens:
            logging.critical("*"*40)
            print("*"*40)
            print(f"Benchmarking for {seqlen} tokens ...")
            input_ids_test = input_ids[:, :seqlen]
            decode_prompt(model, tokenizer, prompt=None, input_ids = input_ids_test, max_new_tokens=11)
            
        logging.shutdown()
        out_file = log_file.replace(".log", "_profile.csv")
        out_file = open(out_file, "w")
        ProfileAIE.analyze_profiling(False, True, log_file, out_file)
        out_file.close()

    elif (args.task == "perplexity"):
        start = time.time()
        perplexity(model, tokenizer, dataset=args.dataset)
        print(f"Time taken to measure ppl on RyzenAI: {time.time() - start}s")

model_utils.py

Python
#
# Copyright © 2023 Advanced Micro Devices, Inc. All rights reserved.
#

import torch
import logging 
import time 
import random
import numpy as np 

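# Maker-focused questions that HacksterMate answers during the decode task.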
prompts = [ "What are the pros and cons of using a CNC mill vs. a CNC router for metalwork?",             
            "How do I design and implement a PID controller for a homemade CNC machine?",        
            "What are the considerations for selecting the right microcontroller for an IoT project with multiple sensors?",                     
            "How can I optimize a 3D model for printing to reduce material use and printing time?",                
            "What are the best practices for designing PCBs for high-frequency applications?.",                          
            "How do I integrate a LiDAR sensor with an Arduino for a real-time mapping project?",                        
            "What are the benefits and challenges of using FPGAs in complex robotics projects?",                  
            "How do I implement a machine learning algorithm on an edge device for predictive maintenance?",                         
            "What are the key factors in selecting the right type of motor for a precision robotic arm?",  
            "How can I create a secure communication protocol for a DIY home automation system?"                     
            ]

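# Run one short generation up front so later profiled runs exclude first-call setup cost.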
def warmup(model, tokenizer, max_new_tokens=30):
    print(f"Warming up ... ")
    for prompt in prompts[0:1]:
        inputs = tokenizer(prompt, return_tensors="pt") 
        generate_ids = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask, max_new_tokens=max_new_tokens)
        _ = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
    print(f"Warm up DONE!! ")


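# Generate a response for a single prompt (or pre-tokenized input_ids) and log
# tokenizer, generation, and per-token timings for the profiler.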
def decode_prompt(model, tokenizer, prompt, input_ids=None, max_new_tokens=30):
    if input_ids is None:
        print(f"prompt: {prompt}")
        start = time.time()
        inputs = tokenizer(prompt, return_tensors="pt") 
        end = time.time()
        logging.critical(f"[PROFILE][WARMUP] tokenizer: {end-start}")
    else:
        logging.critical(f"[PROFILE][WARMUP] tokenizer: na") # for logging consistency

    start, end = 0, 0
    prompt_tokens = 0
    input_ids_ = input_ids if prompt is None else inputs.input_ids
    attention_mask = torch.ones((1, input_ids.numel())) if prompt is None else inputs.attention_mask
    start = time.time()
    generate_ids = model.generate(input_ids_, attention_mask=attention_mask, max_new_tokens=max_new_tokens)
    end = time.time()
    prompt_tokens = input_ids_.shape[1]
    num_tokens_out = generate_ids.shape[1]
    new_tokens_generated = num_tokens_out - prompt_tokens
    generate_time = (end - start)
    time_per_token = (generate_time/new_tokens_generated)*1e3
    logging.critical(f"[PROFILE][AIE] generate: {generate_time} for {num_tokens_out} tokens; prompt-tokens: {prompt_tokens}; time per generated token: {time_per_token}")

    start = time.time()
    response = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
    end = time.time()
    logging.critical(f"[PROFILE][WARMUP] tokenizer decode: {end-start}")
    
    print(f"response: {response}")
    logging.critical(f"response: {response}")


def decode_prompts(model, tokenizer, max_new_tokens=30):
    for prompt in prompts:
        logging.critical("*"*40)
        print("*"*40)
        decode_prompt(model, tokenizer, prompt, max_new_tokens=max_new_tokens)


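# Build (input, target) samples and the full test encoding from wikitext-2,
# used by the benchmark and perplexity tasks.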
def get_wikitext2(tokenizer, dataset="non-raw", nsamples=128, seqlen=2048):
    """ gptq """
    from datasets import load_dataset
    if dataset == "non-raw":
        traindata = load_dataset('wikitext', 'wikitext-2-v1', split='train')
        testdata = load_dataset('wikitext', 'wikitext-2-v1', split='test')
    elif dataset == "raw":
        traindata = load_dataset('wikitext', 'wikitext-2-raw-v1', split='train')
        testdata = load_dataset('wikitext', 'wikitext-2-raw-v1', split='test')
    else:
        raise ValueError(
                "You are using an unsupported dataset, only support wikitext2-raw-v1 and wikitext2-v1."
                "Using wikitext2-raw-v1 with --dataset=raw and wikitext2-v1 with --dataset=non-raw."
            )

    trainenc = tokenizer("\n\n".join(traindata['text']), return_tensors='pt')
    testenc = tokenizer("\n\n".join(testdata['text']), return_tensors='pt')
    dataloader = []
    for _ in range(nsamples):
        i = random.randint(0, testenc.input_ids.shape[1] - seqlen - 1)
        j = i + seqlen
        inp = testenc.input_ids[:, i:j]
        tar = inp.clone()
        tar[:, :-1] = -100
        dataloader.append((inp, tar))
    return dataloader, testenc


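# Measure perplexity over the first nsamples 2048-token windows of the
# wikitext-2 test set.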
def perplexity(model, tokenizer, dataset, framework="pytorch"):
    random.seed(0)
    np.random.seed(0)
    torch.random.manual_seed(0)
    print(f"Calculating Perplexity on wikitext2 test set ...")
    model = model#.cuda()
    dataloader, testenc = get_wikitext2(tokenizer, dataset=dataset)
    
    model.seqlen = 2048
    test_enc = testenc.input_ids
    nsamples = 2 #test_enc.numel() // model.seqlen
    if framework == "pytorch":
        dtype = next(iter(model.parameters())).dtype

    loss = torch.nn.CrossEntropyLoss()
    nlls = []

    with torch.no_grad():
        attention_mask = torch.ones((1, test_enc.numel()))#.cuda()
        for i in range(nsamples):
            batch = test_enc[:, (i * model.seqlen):((i + 1) * model.seqlen)]#.cuda()
            if framework == "pytorch":
                out = model(
                    batch,
                    attention_mask=attention_mask[:, (i * model.seqlen):((i + 1) * model.seqlen)].reshape((1, -1))
                )
            else :
                out = model(
                    batch,
                    attention_mask=batch.new_ones(batch.shape)
                )
            shift_labels = test_enc[
                :, (i * model.seqlen):((i + 1) * model.seqlen)
            ][:, 1:]#.cuda()
            loss_fct = torch.nn.CrossEntropyLoss()
            loss = loss_fct(out.logits[0][:-1, :], shift_labels.view(-1))
            neg_log_likelihood = loss.float() * model.seqlen
            nlls.append(neg_log_likelihood)

        ppl = torch.exp(torch.stack(nlls).sum() / (nsamples * model.seqlen))
        print('Perplexity:', ppl.item())

Credits

Samuel Alexander
5 projects • 29 followers
