Top 6 Things I Learned While Building a Domain-Specific Custom AI ChatBot

10.19.2023 | Testing AI chatbot Llama Index Langchain vector embeddings | Kaylynn Watson

My journey with AI began by exploring how I could leverage ChatGPT to assist with my coding. You can read all about my initial learnings here. Stepping into the world of AI, I quickly grasped the enormous potential of domain-specific chatbots. These advanced tools are not just about automating conversations; they open doors to a myriad of applications, including revolutionizing customer service, increasing the productivity of developers, and personalizing user experiences to a remarkable degree.

Background

In this project, my team and I crafted a smart AI chatbot, specifically tailored for the Focused Labs website. Our aim extended beyond just devising a simple chat interface; we envisioned a comprehensive “Knowledge Hub” that fully leverages the potential of natural-language models. Such an AI-powered Enterprise Knowledge Hub uniquely integrates diverse information sources into a unified, readily accessible platform. Our goal was to synergize data from our Notion wiki and information from our website, thereby establishing a comprehensive Focused Labs virtual assistant.

So, here, I share the top six insights I uncovered during the creation of a domain-specific custom AI chatbot.

1. Interplay of Multiple AI Models

My key insight is that building a smart AI chatbot is not about relying on a single, all-knowing AI model. Instead, it's like conducting an orchestra - different models collaborate, each playing their unique part to create a symphony of precise responses and meaningful interactions. Our setup comprises three distinct models, each with a specific role. An embedding model processes our proprietary data, a completion model handles the text we retrieve from databases, and a chat model directly engages with the raw input provided by users. Together, these models form a more effective and comprehensive system. For a deeper understanding of this architecture, refer to our blog, Basic Architecture of a Domain Specific Custom AI, that covers this topic.
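To make the "orchestra" concrete, here is a minimal sketch of how the three roles might be wired together. The `embed`, `complete`, and `chat` functions are stubs standing in for real API calls (for example, to an embedding model, a completion model, and a chat model from a provider like OpenAI); the names and the exact wiring are illustrative assumptions, not our production code.

```python
# Sketch of a three-model pipeline: embedding, completion, and chat
# models each play a distinct role. All three functions are stubs.

def embed(text: str) -> list[float]:
    """Embedding model: turns text into a vector (stubbed)."""
    return [float(ord(c) % 7) for c in text[:8]]

def complete(context: str) -> str:
    """Completion model: condenses raw text retrieved from the database (stubbed)."""
    return f"Summary of: {context}"

def chat(question: str, context: str) -> str:
    """Chat model: answers the user's raw question given prepared context (stubbed)."""
    return f"Q: {question} | Using: {context}"

def answer(question: str, documents: list[str]) -> str:
    # 1. Embed the question so a vector store could be searched with it.
    _query_vector = embed(question)
    # 2. Pretend retrieval found the first document relevant.
    retrieved = documents[0]
    # 3. The completion model condenses the retrieved text.
    context = complete(retrieved)
    # 4. The chat model produces the final answer from question + context.
    return chat(question, context)
```

Each model only sees the input it is suited for: the embedding model sees proprietary text, the completion model sees database results, and the chat model sees the user's raw question.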

2. Vector Embeddings: The Heart of Custom Data

Vector embeddings are critical when dealing with custom data. In fact, these embeddings essentially form the backbone for AI models to comprehend proprietary information.

Traditional "training" or fine-tuning of AI models involves a complex and expensive process of exposing the models to vast quantities of domain-specific data. However, this approach doesn't reliably deliver the specific, accurate responses we're seeking.

This is where “prompt engineering” proves its worth. Unlike the hefty requirements of model training, prompt engineering simply involves specific crafting of the input prompts provided to the AI. By providing a well-defined context or prompt, we can guide the model's responses effectively.

Unfortunately, we are limited in the number of words we can ask an AI to ingest at one time. Thus, we are not able to include all of the potential domain-specific context needed for a good quality answer directly in the prompt. So, we pre-organize and store the data in a way that helps the AI model to grasp the meaning and context of words in human language. We are then able to retrieve smaller pieces of relevant information from the data stores and add those specific pieces to a prompt for a language model to process.
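The retrieve-then-prompt step described above can be sketched as a simple template. The prompt wording and the character budget below are illustrative assumptions; a real implementation would budget in tokens, not characters.

```python
# Stitch ranked, retrieved snippets into a prompt without exceeding
# the model's context limit (approximated here by a character budget).

def build_prompt(question: str, retrieved_chunks: list[str], budget: int = 1000) -> str:
    context_parts: list[str] = []
    used = 0
    for chunk in retrieved_chunks:       # chunks arrive ranked by relevance
        if used + len(chunk) > budget:   # stop before blowing the budget
            break
        context_parts.append(chunk)
        used += len(chunk)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

Because only the most relevant snippets make it into the template, the language model gets the domain-specific context it needs without us ever retraining it.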

These pre-organized data stores are called vector databases. Think of these databases to be like a library, but instead of organizing books by their exact titles or authors, the books are organized based on their style, themes, or the emotions they evoke. The “books” are the raw text together with a computer-readable representation of the data called a vector embedding. Vector embeddings are like detailed book summaries that encapsulate complex descriptions, making it possible for the library to categorize and recommend based on similarities beyond just title or author.

In more technical terms, vector embeddings are lists of numbers that represent real-world concepts. These numbers quantify various characteristics of the data. We can then measure the distance between vectors in the vector space to evaluate similarity. I recommend reading this Pinecone article for a deeper technical definition. (Pinecone is a vector database.)
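A toy example makes "distance in vector space" tangible. Real embeddings have hundreds or thousands of dimensions; the three-number vectors below are invented for illustration, and cosine similarity is one common distance measure.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

cat = [0.9, 0.1, 0.0]        # imagined embedding for "cat"
kitten = [0.85, 0.15, 0.05]  # imagined embedding for "kitten"
airplane = [0.0, 0.2, 0.95]  # imagined embedding for "airplane"

# "cat" lands nearer "kitten" than "airplane" in this toy vector space.
assert cosine_similarity(cat, kitten) > cosine_similarity(cat, airplane)
```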

Vector embeddings with prompt engineering are both simpler and more effective for our use case, helping us to generate accurate responses without the complications and costs associated with traditional model training. To see practical examples of prompt engineering, check out this blog post about Small Steps Toward Effective Prompt Engineering.

3. The Unique Challenge of Testing

Shifting from traditional software testing, the evaluation of an AI system demands a new perspective. Testing involves a more holistic approach rather than segmenting the system into smaller, separated parts. Classic software testing methods, such as unit testing, aren't up to the task. Instead, the testing process more closely resembles baking a cake: the impact of altering a single ingredient can't truly be evaluated until the entire process is complete and the final product is assessed.

Following the time-honored scientific method is a better testing pattern. Create a rubric that records a hypothesis, details the data input, and documents the methods, algorithms, libraries, and tools used. Subsequently, evaluate the responses from the AI system, draw conclusions, and encapsulate the findings.

For our use case, we created 2 sets of 6 questions based on information available on the Focused Labs website and the Focused Labs Notion wiki. We included both specific and broad questions and created an answer key. We ranked each answer from our chatbot on a scale of 1-5 for correctness.

When evaluating your AI system, be prepared for it to sometimes give partially correct answers. For instance, we asked, “What are Focused Labs’ values?” The chatbot correctly answered “Listen First,” “Learn Why,” and “Love Your Craft.” However, the model also returned two additional concepts as values. We’d rate an answer like this a 4: the LLM had the right answer, but it gave additional information.

We would then make small iterations in our implementation. For example, we removed all emojis from our text, and then asked our chatbot these same 6 questions and compared the results.
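The rubric loop above can be sketched in code. `ask_chatbot` is a stub standing in for the real chatbot, and the keyword-coverage score is a crude automated proxy for the manual 1-5 correctness rating we actually assigned; both are illustrative assumptions.

```python
# A sketch of the evaluation loop: fixed questions, an answer key,
# and a repeatable score so runs can be compared across iterations.

def ask_chatbot(question: str) -> str:
    # Stubbed response; a real harness would call the chatbot here.
    return "Our values are Listen First, Learn Why, and Love Your Craft."

def score(answer: str, expected_keywords: list[str]) -> int:
    """Map keyword coverage onto a 1-5 scale (a rough proxy for manual rating)."""
    hits = sum(1 for kw in expected_keywords if kw.lower() in answer.lower())
    return max(1, round(5 * hits / len(expected_keywords)))

answer_key = {
    "What are Focused Labs' values?": ["Listen First", "Learn Why", "Love Your Craft"],
}

# Re-run the same questions after every change and compare scores.
results = {q: score(ask_chatbot(q), kws) for q, kws in answer_key.items()}
```

Because the questions and scoring stay fixed, any change in the results can be attributed to the one "ingredient" we altered between runs.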

While this method is repetitive and monotonous at times, it clearly tests the effectiveness of leveraging an LLM in bespoke code.

4. Think Less Like a Programmer and More Like a Linguist

To improve accuracy, efficiency, reliability, and quality of insights, we explored various data cleansing techniques. One specific technique involved normalizing the data by converting all words to lowercase. After presenting our routine set of questions to the AI virtual assistant and comparing results with previous experiments, we were surprised by the outcome. This seemingly insignificant change led to a substantial decrease in the AI's performance. This unexpected result pushed us deeper into the nuanced complexities of data interpretation, prompting a shift in perspective.
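For illustration, here is the kind of cleansing step that backfired for us, sketched as a tiny function. The example text is invented, but it shows what is lost: lowercasing collapses distinctions the embedding model actually uses, such as proper-noun casing that marks phrases as named company values.

```python
# A "harmless" normalization step that destroys signal: after
# lowercasing, "Listen First" the named company value is
# indistinguishable from the ordinary phrase "listen first".

def normalize(text: str) -> str:
    return text.lower().strip()

original = "Our values: Listen First, Learn Why, Love Your Craft."
cleansed = normalize(original)
# cleansed == "our values: listen first, learn why, love your craft."
```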

Instead of adhering to a traditional programmer’s perspective that perceives data as mere strings of characters or simply 0s and 1s, learn to value the significance of elements like casing, punctuation, and even emojis. These might previously have been dismissed as trivial, but they play crucial roles in communication. Embrace a linguist's mindset and critically evaluate which parts of the data contribute to the conveyance of ideas and thoughts.

Pivot towards techniques that promote a stronger connection between ideas. Remember, data cleaning is not a mundane process of managing uniform bytes. Rather, it's an opportunity to enrich the ideas and the meaning inherent in the information, ultimately enhancing the accuracy and effectiveness of your AI chatbot. Treat data not merely as raw material, but as a rich medium for nuanced expression and connection.

5. Harnessing the Power of Langchain/Llama Index

Langchain and Llama Index are two of the leading tools for leveraging Large Language Models (LLMs). They are frameworks that allow language models to link with diverse data sources and engage with a multitude of tools and resources. Langchain and Llama Index offer modular, easy-to-use components with off-the-shelf configurations.

To dive deeper into one of these technical components, Langchain agents-with-tools are instrumental for grappling with a variety of question types. These agents provide the flexibility to handle a diverse range of queries, thereby augmenting the chatbot's versatility and enriching user experience.

During configuration of these agents, a key piece of advice is to pay close attention to the descriptions of your tools. These descriptions significantly impact your chatbot's behavior and, consequently, its effectiveness. Include details about each tool's intended use.

For instance, in our initial setup, the descriptions for our two data sources were too brief, which led to frequent misapplications of the tools and consequently, incorrect answers from the chatbot.

Inadequate Tool Descriptions

Notion Data Source Description: "Focused Labs internal knowledge from Notion."
Website Data Source Description: "Focused Labs knowledge scraped from website."

Detailed Tool Descriptions

Notion Data Source Description: 
"Focused Labs internal knowledge from Notion. It contains information about the Denver office, the Chicago office, lightning talks, IRLs, pairing, company strategy, tech leads, company purpose, stand up, pair retros, project rotations, the TPI, anchors, contact information, software development, the 2023 strategy."
Website Data Source Description:
"Focused Labs public knowledge scraped from the website. It contains information about case studies, agile methodologies, company employees, contact information, company values, and general information about the company."
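To see why richer descriptions change routing, here is a toy model of tool selection. It mirrors the shape of a LangChain tool (a name plus a description) without depending on the library, and the word-overlap "router" is far cruder than a real agent's LLM-based selection; both are illustrative assumptions. Still, it exhibits the same failure mode: sparse descriptions give the router almost nothing to match on.

```python
import re
from dataclasses import dataclass

@dataclass
class Tool:
    name: str
    description: str

def words(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def pick_tool(question: str, tools: list[Tool]) -> Tool:
    """Choose the tool whose description shares the most words with the question."""
    q = words(question)
    return max(tools, key=lambda t: len(q & words(t.description)))

sparse = [
    Tool("notion", "Focused Labs internal knowledge from Notion."),
    Tool("website", "Focused Labs knowledge scraped from website."),
]
detailed = [
    Tool("notion", "Internal knowledge: offices, lightning talks, pairing, "
                   "company strategy, stand up, project rotations."),
    Tool("website", "Public knowledge: case studies, agile methodologies, "
                    "employees, contact information, company values."),
]

question = "What are the Focused Labs company values?"
# With sparse descriptions the two tools overlap the question equally,
# so the router effectively guesses; with detailed descriptions,
# "company values" clearly matches the website tool.
```

A real agent reasons over the descriptions with an LLM rather than counting words, but the lesson carries over: the description is the only signal the agent has about when to reach for each tool.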

6. AI: A Lightning-Fast Landscape

Last but certainly not least, we've been awed by the staggering pace at which the AI field evolves. If a tool you seek isn't available today, chances are it'll appear in a week. Keep in mind that many libraries are only months, if not weeks, old and are rapidly evolving themselves. The lightning-fast pace can be overwhelming, yet simultaneously thrilling. Sourcing reliable information can be daunting, especially when Stack Overflow questions go unanswered, but we found Discord servers (such as the Langchain Discord and Llama Index Discord) and GitHub repositories to be excellent platforms for staying updated and actively involved in the AI community.

Embarking on the journey to build a domain-specific AI chatbot is a path of continuous learning. From grasping the nuances of vector embeddings to mastering the unique challenge of testing, each step offers valuable lessons. Whether you're initiating or continuing your journey in AI development, these insights should serve as a guide and ignite your motivation. Keep in mind, the AI field is evolving at breakneck speed, so stay curious, keep experimenting, and remain connected with the vibrant AI community.

What’s Next

If you have an AI development project and would like some expert help from our Focused Labs consultants, complete our Contact Us form and we will have a human chat.
