Troubleshooting#

Common issues and solutions for euclid_rag installation, configuration, and usage.

Installation Issues#

Package Installation Failures#

Problem: Import errors after installation.

Solutions:

# Check if package is properly installed
import sys

print(sys.path)

# Verify installation location
import euclid

print(euclid.__file__)

# Reinstall in development mode
pip install -e .

Docker Issues#

Problem: docker compose up fails to start services.

Solutions:

# Check Docker daemon is running
docker --version
docker-compose --version

# Rebuild containers completely
docker compose down --volumes
docker compose build --no-cache
docker compose up

# Check container logs
docker compose logs euclid_rag
docker compose logs ollama

Problem: Cannot pull Ollama models.

Solutions:

# Check if Ollama container is running
docker ps | grep ollama

# Try different model names
docker exec -it euclid_rag-ollama-1 ollama pull mistral:7b
docker exec -it euclid_rag-ollama-1 ollama pull llama2:7b-chat

# Check available space
docker system df

Configuration Issues#

Vector Store Missing Error#

Problem: .. code-block:: text

RuntimeError: Vector store missing. Please run ingestion before launching the app.

Solutions:

Run ingestion first:

python python/euclid/rag/ingestion/ingest_publications.py
python python/euclid/rag/ingestion/ingest_dpdd.py --config python/euclid/rag/app_config.yaml

Check vector store paths:

import os
import yaml

with open("python/euclid/rag/app_config.yaml", "r") as f:
    config = yaml.safe_load(f)

for key, path in config["vector_store"].items():
    if "index_dir" in key:
        print(f"{key}: {path} - Exists: {os.path.exists(path)}")

Create missing directories:

mkdir -p redmine_vector_store public_data_vector_store

YAML Configuration Errors#

Problem: yaml.parser.ParserError or similar YAML parsing errors.

Solutions:

# Validate YAML syntax
python -c "
import yaml
with open('python/euclid/rag/app_config.yaml', 'r') as f:
    try:
        config = yaml.safe_load(f)
        print('✓ YAML is valid')
    except yaml.YAMLError as e:
        print(f'✗ YAML error: {e}')
"

# Check indentation (use spaces, not tabs)
cat -A python/euclid/rag/app_config.yaml

Path Resolution Issues#

Problem: Configuration files or vector stores not found.

Solutions:

# Use absolute paths in configuration
pwd  # Note current directory

# Update config with full paths
/full/path/to/vector_store

# Or ensure you're running from correct directory
cd /path/to/euclid_rag
python python/euclid/rag/app.py

Ingestion Issues#

Network Connection Problems#

Problem: DPDD ingestion fails with connection timeouts or HTTP errors.

Solutions:

# Test network connectivity
curl -I https://euclid.esac.esa.int/dr/q1/dpdd/

# Use VPN if behind corporate firewall
# Check proxy settings
export HTTP_PROXY=http://your-proxy:port
export HTTPS_PROXY=http://your-proxy:port

# Retry with longer timeout
python python/euclid/rag/ingestion/ingest_dpdd.py --config config.yaml --timeout 60

Memory Issues During Ingestion#

Problem: Out of memory errors during document processing.

Solutions:

# Monitor memory usage
top -p $(pgrep -f ingest_dpdd)

# Reduce batch size in configuration
# Process documents in smaller chunks
# Close other applications to free memory

# For very large datasets, use streaming processing
ulimit -v 4000000  # Limit virtual memory

Problem: Ingestion takes extremely long time.

Solutions:

# Limit topics in dpdd_ingest_config.yaml
topics_number_limit: 5  # Start small

# Or process specific topics only
scrape_all: false
topics:
  - name: Purpose and Scope
    link: purpose.html

Permission and Storage Issues#

Problem: Permission denied when writing vector stores.

Solutions:

# Check directory permissions
ls -la vector_store_directory/

# Fix permissions
chmod 755 vector_store_directory/
chown $USER:$USER vector_store_directory/

# Use a directory you own
mkdir ~/euclid_vector_stores
# Update config to point to this directory

Problem: Insufficient disk space for vector stores.

Solutions:

# Check available space
df -h

# Clean up old vector stores
rm -rf old_vector_store_*

# Use external storage
ln -s /external/storage/path vector_store_directory

Runtime Issues#

Slow Query Responses#

Problem: Chatbot responses take very long time.

Solutions:

Check system resources:

# Monitor CPU and memory
htop

# Check disk I/O
iotop

Optimize vector store:

# Re-index with better parameters
# Reduce chunk size in ingestion
# Use faster embedding models

Use lighter LLM models:

# Switch to smaller model
docker exec -it euclid_rag-ollama-1 ollama pull mistral:7b

Inaccurate or Irrelevant Responses#

Problem: Chatbot provides poor quality answers.

Solutions:

Check ingested content quality:

# Inspect vector store contents
from euclid.rag import chatbot

retriever = chatbot.configure_retriever()

# Test with known queries
results = retriever.get_relevant_documents("test query")
for doc in results[:3]:
    print(f"Content: {doc.page_content[:200]}...")
    print(f"Source: {doc.metadata}")

Adjust similarity thresholds:

# In configuration, adjust retrieval parameters
retrieval:
  similarity_threshold: 0.7  # Higher = more strict
  max_results: 5

Re-run ingestion with better filtering:

# Add more banned sections
banned_sections:
  names:
    - Header
    - Navigation
    - Footer
    - Table of Contents

Streamlit Interface Issues#

Problem: Web interface not accessible or loads incorrectly.

Solutions:

# Check if Streamlit is running
ps aux | grep streamlit

# Check port availability
netstat -tlnp | grep 8501

# Try different port
streamlit run rag/app.py --server.port 8080

# Clear browser cache and cookies
# Try in incognito/private mode

Problem: Session state issues or interface behaves unexpectedly.

Solutions:

# Restart Streamlit
pkill -f streamlit
streamlit run rag/app.py

# Clear Streamlit cache
rm -rf ~/.streamlit/

# Check for conflicting browser extensions

LLM Model Issues#

Model Loading Failures#

Problem: Ollama fails to load or run models.

Solutions:

# Check available models
docker exec -it euclid_rag-ollama-1 ollama list

# Check model file integrity
docker exec -it euclid_rag-ollama-1 ollama show mistral:latest

# Re-pull corrupted models
docker exec -it euclid_rag-ollama-1 ollama rm mistral:latest
docker exec -it euclid_rag-ollama-1 ollama pull mistral:latest

Problem: Model responses are gibberish or inappropriate.

Solutions:

# Try different model versions
docker exec -it euclid_rag-ollama-1 ollama pull mistral:7b-instruct-v0.2

# Check model temperature settings
# Verify prompt templates are correct

Memory Issues with LLM#

Problem: Out of memory errors when running LLM inference.

Solutions:

# Use smaller models
docker exec -it euclid_rag-ollama-1 ollama pull mistral:7b  # Instead of larger variants

# Increase Docker memory limits
# In Docker Desktop: Settings > Resources > Memory

# Monitor memory usage
docker stats euclid_rag-ollama-1

Development and Debug Issues#

Import Errors in Development#

Problem: Cannot import euclid modules during development.

Solutions:

# Ensure you're in the right directory
pwd
ls python/euclid/

# Install in development mode
pip install -e .

# Add to PYTHONPATH
export PYTHONPATH="${PYTHONPATH}:$(pwd)/python"

Problem: Changes not reflected when testing.

Solutions:

# For Python code changes
pip install -e .  # Ensure editable install

# For Streamlit, restart the server
# Streamlit should auto-reload, but may need manual restart

Debug Mode and Logging#

Enable detailed logging for troubleshooting:

# Set debug environment variables
export EUCLID_RAG_DEBUG=true
export STREAMLIT_LOGGER_LEVEL=debug

# Run with verbose output
streamlit run rag/app.py --logger.level debug

# Add debug prints in Python code
import logging

logging.basicConfig(level=logging.DEBUG)

# In your code
logging.debug(f"Variable value: {variable}")

Getting Help#

Log Collection#

When reporting issues, collect relevant logs:

# Application logs
tail -n 100 ~/.streamlit/logs/streamlit.log

# Docker logs
docker compose logs --tail=100 euclid_rag
docker compose logs --tail=100 ollama

# System information
python --version
pip list | grep -E "(streamlit|langchain|faiss)"

Documentation and Support#

API Documentation: Python API Reference
Developer Guide: Contributing
GitHub Issues: jeipollack/euclid_rag#issues
Configuration Examples: Check the repository’s examples/ directory

Performance Benchmarking#

Test System Performance#

# Simple performance test
import time
from euclid.rag import chatbot

start_time = time.time()
retriever = chatbot.configure_retriever()
setup_time = time.time() - start_time

start_time = time.time()
results = retriever.get_relevant_documents("test query")
query_time = time.time() - start_time

print(f"Setup time: {setup_time:.2f}s")
print(f"Query time: {query_time:.2f}s")
print(f"Results found: {len(results)}")

Expected Performance#

Typical performance benchmarks:

Vector store loading: < 5 seconds
Simple queries: < 2 seconds
Complex queries: < 10 seconds
Memory usage: 500MB - 2GB depending on data size

If performance is significantly worse, check:

Available system memory
Storage type (SSD vs HDD)
Network connectivity
Model size and complexity

Recovery Procedures#

Complete Reset#

If all else fails, perform a complete reset:

# Stop all services
docker compose down --volumes
pkill -f streamlit

# Remove vector stores
rm -rf *_vector_store/

# Clean Python cache
find . -name "*.pyc" -delete
find . -name "__pycache__" -delete

# Reinstall dependencies
pip uninstall euclid_rag -y
pip install euclid_rag

# Re-run ingestion
python python/euclid/rag/ingestion/ingest_publications.py
python python/euclid/rag/ingestion/ingest_dpdd.py --config python/euclid/rag/app_config.yaml

Backup and Restore#

Backup your vector stores:

# Backup vector stores
tar -czf euclid_vector_stores_backup.tar.gz *_vector_store/

# Backup configuration
cp python/euclid/rag/app_config.yaml app_config_backup.yaml

Restore from backup:

# Restore vector stores
tar -xzf euclid_vector_stores_backup.tar.gz

# Restore configuration
cp app_config_backup.yaml python/euclid/rag/app_config.yaml

Still Having Issues?#

If you’re still experiencing problems after trying these solutions:

Check the GitHub issues: jeipollack/euclid_rag#issues
Create a new issue with: - Your operating system and Python version - Complete error messages - Steps to reproduce the problem - Relevant log files

Include system information:

# System info script
echo "OS: $(uname -a)"
echo "Python: $(python --version)"
echo "Pip packages:"
pip list | grep -E "(euclid|streamlit|langchain|faiss)"
echo "Docker: $(docker --version)"

Troubleshooting#

Installation Issues#

Package Installation Failures#

Docker Issues#

Configuration Issues#

Vector Store Missing Error#

YAML Configuration Errors#

Path Resolution Issues#

Ingestion Issues#

Network Connection Problems#

Memory Issues During Ingestion#

Permission and Storage Issues#

Runtime Issues#

Slow Query Responses#

Inaccurate or Irrelevant Responses#

Streamlit Interface Issues#

LLM Model Issues#

Model Loading Failures#

Memory Issues with LLM#

Development and Debug Issues#

Import Errors in Development#

Debug Mode and Logging#

Getting Help#

Log Collection#

Documentation and Support#

Performance Benchmarking#

Test System Performance#

Expected Performance#

Recovery Procedures#

Complete Reset#

Backup and Restore#

Still Having Issues?#

This Page