Achieving healthy plant growth is not straightforward, since every plant has unique care requirements. These needs vary with factors such as species, temperature, soil moisture, and humidity. For example, Adiantum (Maidenhair fern) prefers low light, occasional fertilizer, and little water, while Aglaonema (Chinese Evergreen) prefers more frequent watering and a warmer, tropical climate. With countless plant species available, determining the appropriate care can be challenging. Large Language Models (LLMs), however, can provide personalized recommendations based on a specific plant type and its environmental conditions, and real-time inputs from sensors monitoring the plant can further improve the accuracy of these recommendations.
Most Large Language Models (LLMs) are trained on extensive textual datasets that cover a vast amount of the information available on the internet. This training enables the models to pick out the information most relevant to a given text and to generate accurate, contextually appropriate responses. The ability to sift through large volumes of data and extract what matters is one of the key advantages of LLMs over other types of machine learning models, and it allows them to provide insights and recommendations tailored to the needs and preferences of individual users.
For the proposed use case, the goal is to obtain precise answers grounded in a specific body of plant care knowledge. This can be achieved through two types of solutions:
- Fine-tuning an existing model: This involves adjusting the parameters and weights of an existing LLM to better suit the specific needs of the current use case.
- Retrieval-augmented generation (RAG): With this approach, there is no need to retrain the model to obtain the desired results. Additionally, RAG allows for customization of the context from which information is retrieved. For instance, one can choose to retrieve plant information and specifications directly related to plant care.
The main advantage of RAG is that it eliminates the need for model retraining while still allowing the retrieval context to be customized, which makes it an attractive option for applications where real-time data inputs and dynamic contexts matter. For instance, the context can be taken directly from a document specializing in plant information and care instructions.
The following code demonstrates how to use Mistral, a Large Language Model (LLM), together with a PDF document containing plant-related information, so that the model can generate more accurate and contextually relevant responses about plant care and specifications.
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_experimental.text_splitter import SemanticChunker
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama
from langchain.prompts import PromptTemplate
from langchain.chains.llm import LLMChain
from langchain.chains.combine_documents.stuff import StuffDocumentsChain
from langchain.chains import RetrievalQA
import time
# Load the PDF
loader = PDFPlumberLoader("PlantCareDocument.pdf")
docs = loader.load()
# Split into chunks
text_splitter = SemanticChunker(HuggingFaceEmbeddings())
documents = text_splitter.split_documents(docs)
# Instantiate the embedding model
embedder = HuggingFaceEmbeddings()
# Create the vector store and fill it with embeddings
vector = FAISS.from_documents(documents, embedder)
retriever = vector.as_retriever(search_type="similarity", search_kwargs={"k": 3})
# Define llm
llm = Ollama(model="mistral")
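# Prompt template: answer only from the retrieved context, in a few sentences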
prompt = """
1. Use the following pieces of context to answer the question at the end.
2. If you don't know the answer, just say that "I don't know" but don't make up an answer on your own.\n
3. Keep the answer crisp and limited to 3-4 sentences.
Context: {context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate.from_template(prompt)
llm_chain = LLMChain(
llm=llm,
prompt=QA_CHAIN_PROMPT,
callbacks=None,
verbose=True)
document_prompt = PromptTemplate(
input_variables=["page_content", "source"],
template="Context:\ncontent:{page_content}\nsource:{source}",
)
combine_documents_chain = StuffDocumentsChain(
llm_chain=llm_chain,
document_variable_name="context",
document_prompt=document_prompt,
callbacks=None,
)
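# Assemble the RetrievalQA chain: retrieve the most similar chunks, then answer from them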
qa = RetrievalQA(
combine_documents_chain=combine_documents_chain,
verbose=True,
retriever=retriever,
return_source_documents=True,
)
# Input
start_time = time.time()
#print(qa("How does plant respond to disease?")["result"])
print(qa("How virus spreads between plants?")["result"])
end_time = time.time()
total_time = end_time - start_time
print(f"The time taken to ask the question is: {total_time} seconds.")
Under the hood, the retrieval step compares the vector representation of the question with the vector representations of the document chunks stored in FAISS; the most similar chunks are then passed to the LLM as context for the answer.
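A minimal sketch of this comparison, using the vector store and embedder created above, is shown below; LangChain's FAISS wrapper embeds the query with the same model used for the chunks and returns the closest chunks with their distance scores.
# Sketch: inspect which chunks the retriever would hand to the LLM
# (uses the `vector` store built in the code above)
query = "How does a virus spread between plants?"
# The query is embedded with the same model as the chunks and compared
# against the stored embeddings; lower scores mean closer vectors
results = vector.similarity_search_with_score(query, k=3)
for chunk, score in results:
    print(f"score={score:.4f} source={chunk.metadata.get('source')}")
    print(chunk.page_content[:200])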
AMD console
Accessing the AMD Instinct MI210 was quite straightforward, since the GPUs are available through the AMD Acceleration Cloud (AAC) and can easily be obtained via a web console. Comprehensive documentation on how to use the MI210 is available in the AMD blog post for further reference.
The code was deployed on both the AMD Instinct MI210 GPU, which has 64GB of memory, and the NVIDIA Tesla V100S, which has 32GB.
The Mistral model was served through the Ollama API, which is available for both AMD and NVIDIA GPUs. Ollama can be installed with a single command:
curl -fsSL https://ollama.com/install.sh | sh
Here are the commands to start the Ollama server and run the Mistral model:
ollama serve
ollama pull mistral
ollama run mistral
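Before running the full RAG script, it can be useful to confirm that the local Ollama endpoint responds. The sketch below assumes Ollama's default address (http://localhost:11434) and uses the requests library; the prompt is only an example.
import requests
# Sanity check against the local Ollama endpoint (default port 11434);
# assumes `ollama serve` is running and the mistral model has been pulled
response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral",
        "prompt": "In one sentence, how often should a Maidenhair fern be watered?",
        "stream": False,  # return the full answer as a single JSON object
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])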
After deploying the Ollama API, one can query the Large Language Model (LLM) by executing the Python code shown previously in a separate terminal window.
By issuing multiple requests using different GPUs, it was observed that LLM queries executed on NVIDIA hardware resulted in lower latency compared to those processed on AMD hardware.
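To make this comparison less sensitive to run-to-run variation, the same question can be issued several times and the latency averaged; the helper below is a sketch built around the qa chain defined earlier, with an arbitrary number of repetitions.
import time
def average_latency(question, runs=5):
    # Issue the same RetrievalQA query several times and return the mean latency
    timings = []
    for _ in range(runs):
        start = time.time()
        qa(question)["result"]
        timings.append(time.time() - start)
    return sum(timings) / len(timings)
print(f"Mean latency over 5 runs: {average_latency('How does a virus spread between plants?'):.2f} seconds")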
Results
For instance, if you have a plant and are unsure how to care for it or what to do when it exhibits certain symptoms, you can simply ask Mistral and receive accurate, reliable information in response. The model's extensive training on large datasets enables it to provide detailed and nuanced answers that take into account the various factors affecting plant health and growth.
Moreover, the Mistral model is not limited to specific plant species or symptoms, making it a versatile resource for all types of plant enthusiasts. Whether you are a seasoned gardener or a novice, Mistral can help you diagnose issues with your plants, recommend appropriate care strategies, and provide general advice on how to keep your plants healthy and thriving.
In summary, the Mistral model's extensive knowledge base and versatility make it an invaluable tool for anyone looking to learn more about plant care or troubleshoot issues with their plants. By leveraging the power of large datasets, Mistral can provide accurate and reliable information on a wide range of plant-related topics, helping users make informed decisions when it comes to their plant care regimens.
Combined with RAG, the results can be focused on a specific context; I tried several documents, such as a farmers' handbook published by the Indian Government, as sketched below.
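Since the retrieval context is defined entirely by the ingested document, pointing the loader at a different PDF is enough to specialize the assistant; the filename below is a placeholder, and the rest of the pipeline from the earlier code is reused unchanged.
# Rebuild the vector store from a different source document (placeholder filename)
loader = PDFPlumberLoader("FarmersHandbook.pdf")  # hypothetical file name
docs = loader.load()
documents = text_splitter.split_documents(docs)
vector = FAISS.from_documents(documents, embedder)
retriever = vector.as_retriever(search_type="similarity", search_kwargs={"k": 3})
# Re-create the RetrievalQA chain with the new retriever before querying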
Discussion
I would like to express my gratitude to the AMD team for providing access to their cloud compute resources, which allowed me to test the capabilities of the AMD Acceleration Cloud. However, some limitations during the contest affected the user experience:
- Lack of Access to AMD Console API: This limitation forced users to rely solely on the web interface to launch instances, making it difficult to automate instance launches.
- Instance Launch Time Limitations: The maximum instance launch time was limited to one hour, which may not be sufficient for long-term or computationally intensive tasks.
To improve the user experience and enable more efficient long-term automation services, it would be helpful if AMD provided infrastructure as a service (IaaS) options and allowed users to leverage ephemeral resources for running lengthy calculations. This approach would provide greater flexibility and convenience for users who require extended compute times and automated launches for their workloads.
Nevertheless, within the scope of the competition, this was sufficient to test a use case on the new AMD hardware resources.