Training your chatbot - basics & limits

Last updated: February 12, 2026

Training is how your chatbot learns about your business. You provide content -- from your website, documents, or custom text -- and ChatLab processes it into a knowledge base the chatbot uses to answer questions.


Where to find training

Select your chatbot from the main dashboard, then click the Training tab in the horizontal navigation bar.

Training tab highlighted in bot navigation


Training sources

The left sidebar in the Training tab lists all available training source categories. Each source type adds content to the same knowledge base for your chatbot.

Training source categories in the left sidebar

Website URLs (Links)

Point your chatbot to your website and it will crawl pages, extract text content, and train on what it finds. You can scan an entire website, a single page, or use your sitemap for precise control. Advanced settings let you filter URLs, exclude page elements, and configure scraping behavior.

Learn more about adding website sources

Files

Upload documents directly to your chatbot's knowledge base. Supported formats: PDF, TXT, DOC, DOCX, CSV, XLS, XLSX. You can upload up to 20 files at once, with a maximum of 20 MB per file. ChatLab extracts the text content and processes it for training.
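
The upload limits above can be checked locally before you start an upload. The following is a minimal sketch, not part of ChatLab: the function name `check_upload_batch` is hypothetical, and it assumes that 1 MB means 1,048,576 bytes.

```python
from pathlib import Path

# Limits taken from the text above; all names here are hypothetical.
ALLOWED_EXTENSIONS = {".pdf", ".txt", ".doc", ".docx", ".csv", ".xls", ".xlsx"}
MAX_FILES_PER_UPLOAD = 20
MAX_FILE_BYTES = 20 * 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def check_upload_batch(files):
    """Return a list of problems for a batch of (filename, size_in_bytes) pairs."""
    problems = []
    if len(files) > MAX_FILES_PER_UPLOAD:
        problems.append(f"too many files: {len(files)} > {MAX_FILES_PER_UPLOAD}")
    for name, size in files:
        # Compare extensions case-insensitively, so "REPORT.PDF" is accepted.
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported format")
        if size > MAX_FILE_BYTES:
            problems.append(f"{name}: larger than 20 MB")
    return problems


print(check_upload_batch([("pricing.pdf", 1_200_000), ("logo.png", 50_000)]))
```

An empty list means the batch fits the stated limits. ChatLab may apply further checks that this sketch does not model.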

Learn more about adding files

Questions & Answers

Create Q&A pairs to give your chatbot precise answers to specific questions. This is useful for FAQs, company policies, or any topic where you want a consistent response. You can group related Q&A pairs under labels for easier management.

Learn more about adding questions and answers

Text

Add plain text directly to your chatbot's knowledge base. Use this for quick additions, internal knowledge, or content that does not exist in a file or on your website. Each text entry can be up to 100,000 characters.
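
If a piece of text is longer than the per-entry limit, it has to be divided across several entries. The sketch below shows one possible way to do that, cutting at paragraph breaks where it can. The helper `split_into_entries` is hypothetical and is not a ChatLab feature.

```python
MAX_ENTRY_CHARS = 100_000  # per-entry limit stated above


def split_into_entries(text, limit=MAX_ENTRY_CHARS):
    """Split text into pieces of at most `limit` characters.

    Prefers to cut at a blank line (paragraph break); falls back to a
    hard cut when no paragraph break is available within the limit.
    """
    entries = []
    remaining = text
    while len(remaining) > limit:
        cut = remaining.rfind("\n\n", 0, limit)
        if cut <= 0:
            cut = limit  # no usable paragraph break, cut at the limit
        entries.append(remaining[:cut])
        # Drop the newlines that separated the two pieces.
        remaining = remaining[cut:].lstrip("\n")
    if remaining:
        entries.append(remaining)
    return entries


long_text = ("First paragraph. " * 4000) + "\n\n" + ("Second paragraph. " * 4000)
for number, entry in enumerate(split_into_entries(long_text), start=1):
    print(f"entry {number}: {len(entry):,} characters")
```

Each resulting piece can then be pasted in as its own text entry.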

Learn more about adding plain text

Corrections

Fine-tune your chatbot by correcting its previous mistakes. You can add corrections from the Chatlogs tab when you spot an incorrect response, or create them manually in the Training section.

Learn more about adding and editing corrections


Training limits

The Limits panel at the bottom of the left sidebar shows your current usage for the selected chatbot.

Limits panel showing training size, links, re-trainings, and credits

What counts as a training character

A training character is one character of visible text extracted from your sources. ChatLab does not count HTML tags, CSS, JavaScript, images, or other media. Only the actual readable content counts toward your limit.

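For example, readable text typically makes up only about 8% of a web page's total size, so a given number of training characters corresponds to a much larger volume of raw pages:

  • 400,000 characters ≈ 5 MB of web pages
  • 11,000,000 characters ≈ 140 MB of web pages
  • 15,000,000 characters ≈ 190 MB of web pages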
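
The figures above follow from dividing the character count by the 8% text share. Here is a small sketch of that arithmetic, assuming one character is roughly one byte and 1 MB is 1,000,000 bytes. The function name is hypothetical and the 8% ratio is only the rough average quoted above, so real pages will vary.

```python
TEXT_SHARE = 0.08  # rough share of readable text in a page, from the text above


def approx_page_bytes(visible_chars, text_share=TEXT_SHARE):
    """Estimate the raw page size that yields `visible_chars` of readable text.

    Assumption: one character is roughly one byte.
    """
    return visible_chars / text_share


for chars in (400_000, 11_000_000, 15_000_000):
    megabytes = approx_page_bytes(chars) / 1_000_000
    print(f"{chars:,} characters ~ {megabytes:.1f} MB of raw page data")
```

The results (about 5, 137.5, and 187.5 MB) match the rounded values in the list.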

If your plan's training limit is not sufficient, you can purchase an add-on package: "+10 MB of training characters for all your chatbots". The add-on raises the limit of every chatbot on your account (for example, from 15 MB to 25 MB).

Per-bot limits

Each chatbot has its own independent training data limit. One chatbot reaching its limit does not affect your other chatbots. All training sources within a single chatbot (website content, files, Q&A, text, corrections) count toward the same cumulative character total.
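
The cumulative total described above is the sum of characters across every source type in one chatbot. The sketch below illustrates this under stated assumptions: the per-source breakdown is made up for illustration (it is chosen to add up to the 30,133 example used in this article), and `training_usage` is a hypothetical helper, not a ChatLab API.

```python
def training_usage(chars_by_source, limit):
    """Sum characters across all sources of one chatbot and compare to its limit."""
    used = sum(chars_by_source.values())
    return {
        "used": used,
        "remaining": max(limit - used, 0),
        "percent": round(100 * used / limit, 2),
    }


# Hypothetical breakdown for a single chatbot.
bot_sources = {
    "website": 25_000,
    "files": 4_000,
    "questions_and_answers": 633,
    "text": 500,
    "corrections": 0,
}

report = training_usage(bot_sources, limit=15_000_000)
print(f"{report['used']:,} / 15,000,000 characters used ({report['percent']}%)")
```

Because limits are per bot, a second chatbot would be calculated separately with its own dictionary of sources and its own limit.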

Monitoring your usage

The Limits panel displays:

  • Training size -- Characters used vs. your limit (e.g. 30,133 / 15,000,000)
  • Links -- Number of trained URLs vs. your link limit
  • Trainings -- Number of training operations performed
  • Re-trainings -- Automatic re-training operations used vs. monthly limit
  • Credits remaining -- Message credits available for the billing period

For a deeper understanding of how limits work, including per-plan quotas, smart re-training, and optimization tips, see Understanding training data limits.


How training works

When you add a training source, ChatLab processes it in the background:

  1. Content extraction -- Text is extracted from your source (web page, file, or manual input)
  2. Chunking -- The text is split into smaller pieces optimized for retrieval
  3. Embedding generation -- Each chunk is converted into a numerical representation (embedding)
  4. Indexing -- Embeddings are stored in the knowledge base for fast similarity search

When a user asks your chatbot a question, the system searches the knowledge base for the chunks most relevant to that question and passes them to the chatbot as context for generating its answer.
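
To make the four steps and the retrieval step more concrete, here is a toy sketch using only the Python standard library. It is an illustration of the general technique, not ChatLab's implementation: the "embedding" is a simple word-count vector standing in for the learned dense vectors a production system would use, the fixed-size word chunking is an assumption, and every function name is hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Chunking: split text into pieces of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(documents, max_words=40):
    """Indexing: store every chunk together with its embedding."""
    return [(chunk, embed(chunk)) for doc in documents for chunk in chunk_text(doc, max_words)]


def search(index, question, top_k=2):
    """Retrieval: return the `top_k` chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


# Content extraction is assumed to have already produced these plain-text documents.
docs = [
    "Our store ships orders within two business days. Shipping is free over 50 dollars.",
    "Refunds are accepted within 30 days of purchase with a receipt.",
]
index = build_index(docs)
print(search(index, "How long does shipping take?", top_k=1))
```

Here the shipping question retrieves the shipping chunk because the two share the word "shipping". Real embeddings can also match text that shares meaning without sharing exact words.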

You can leave the page while training is in progress. Training continues in the background, and you can optionally receive an email notification when it completes. Learn more about training in background.


Re-training

Website content changes over time. You can re-train individual pages manually or set up automatic scheduled re-training (daily, weekly, or monthly) to keep your chatbot's knowledge current.

When re-training existing content, ChatLab uses net quota calculation -- only the difference between old and new content counts toward your limit, not the entire page again.
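
The net quota calculation can be pictured as simple arithmetic on character counts. The following is a minimal sketch with hypothetical function names and numbers. It assumes that a page that shrinks frees up quota by the same logic, which the text above does not state explicitly.

```python
def net_quota_change(old_chars, new_chars):
    """Characters added to (or, by assumption, freed from) the quota by a re-training."""
    return new_chars - old_chars


def usage_after_retrain(current_usage, old_chars, new_chars):
    """Usage after re-training one page: only the difference is applied."""
    return current_usage + net_quota_change(old_chars, new_chars)


# Hypothetical example: a page grows from 4,000 to 4,600 readable characters.
before = 30_133
after = usage_after_retrain(before, old_chars=4_000, new_chars=4_600)
print(f"usage goes from {before:,} to {after:,} characters")
```

In this example only the 600 new characters are added, rather than the full 4,600 characters of the re-trained page.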

Learn more about re-training


Tips for effective training

  • Quality over quantity -- Focus on content that directly helps answer customer questions (FAQs, product details, support articles, policies)
  • Avoid noise -- Exclude irrelevant pages like login screens, image galleries, or news archives. Use advanced URL filtering when scanning websites
  • Combine sources -- Use website scanning for broad coverage, then supplement with Q&A pairs for topics that need precise answers
  • Review and correct -- Check your chatbot's responses in Chatlogs and add corrections when needed
  • Use the Knowledge Base Optimizer -- This tool analyzes which sources are referenced most, helping you refine your training data

For more strategies, see How to improve chatbot responses.

