FastAPI Implementation Plan for Sim-Search (COMPLETED)

Overview

This document describes the plan for a FastAPI backend for the sim-search project. The backend replaces the current Gradio interface while preserving all existing functionality, and it serves as the backend for a new React frontend that is intended to give users a more flexible interface.

✅ Implementation Status: COMPLETED on March 20, 2025

Architecture

Core Components

API Layer ✅
  - FastAPI application with RESTful endpoints
  - OpenAPI documentation
  - Authentication middleware
  - CORS configuration
Service Layer ✅
  - Bridge between API and existing sim-search functionality
  - Handles async/sync coordination
  - Implements caching and optimization strategies
Data Layer ✅
  - SQLAlchemy ORM models
  - Database session management
  - Migration scripts using Alembic
Authentication System ✅
  - JWT-based authentication
  - User management
  - Role-based access control
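
The "async/sync coordination" and "caching" items of the service layer can be illustrated with a small standard-library sketch. It is not the project's implementation: `CachedSyncBridge` and `blocking_search` are hypothetical names, and the in-memory dictionary is only one possible caching strategy. The idea is that a blocking sim-search call runs in a worker thread so the event loop stays free, and repeated requests for the same key skip the call entirely.

```python
import asyncio
import time
from typing import Any, Callable, Dict


class CachedSyncBridge:
    """Run a blocking callable in a worker thread and cache results by key.

    Hypothetical helper for illustration; not part of sim-search.
    """

    def __init__(self, func: Callable[[str], Any]) -> None:
        self._func = func
        self._cache: Dict[str, Any] = {}
        self.calls = 0  # number of times the blocking callable actually ran

    async def call(self, key: str) -> Any:
        # Serve repeated requests from the in-memory cache.
        if key in self._cache:
            return self._cache[key]
        self.calls += 1
        # asyncio.to_thread (Python 3.9+) keeps the event loop responsive
        # while the blocking function runs in a thread.
        result = await asyncio.to_thread(self._func, key)
        self._cache[key] = result
        return result


def blocking_search(query: str) -> Dict[str, Any]:
    """Stand-in for a synchronous sim-search call (invented for this sketch)."""
    time.sleep(0.01)
    return {"query": query, "results": [f"result for {query}"]}


async def main() -> None:
    bridge = CachedSyncBridge(blocking_search)
    first = await bridge.call("fastapi")
    second = await bridge.call("fastapi")  # served from the cache
    print(first == second, bridge.calls)


if __name__ == "__main__":
    asyncio.run(main())
```

A production cache would likely need expiry and a size bound; both are left out here to keep the sketch short.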

Directory Structure

sim-search-api/
├── app/
│   ├── api/
│   │   ├── routes/
│   │   │   ├── __init__.py
│   │   │   ├── query.py           # Query processing endpoints
│   │   │   ├── search.py          # Search execution endpoints
│   │   │   ├── report.py          # Report generation endpoints
│   │   │   └── auth.py            # Authentication endpoints
│   │   ├── __init__.py
│   │   └── dependencies.py         # API dependencies (auth, rate limiting)
│   ├── core/
│   │   ├── __init__.py
│   │   ├── config.py              # API configuration
│   │   └── security.py            # Security utilities
│   ├── db/
│   │   ├── __init__.py
│   │   ├── session.py             # Database session
│   │   └── models.py              # Database models for reports, searches
│   ├── schemas/
│   │   ├── __init__.py
│   │   ├── token.py               # Token schemas
│   │   ├── user.py                # User schemas
│   │   ├── query.py               # Query schemas
│   │   ├── search.py              # Search result schemas
│   │   └── report.py              # Report schemas
│   ├── services/
│   │   ├── __init__.py
│   │   ├── query_service.py       # Query processing service
│   │   ├── search_service.py      # Search execution service
│   │   └── report_service.py      # Report generation service
│   └── main.py                    # FastAPI application
├── alembic/                       # Database migrations
│   ├── versions/
│   │   └── 001_initial_migration.py  # Initial migration
│   ├── env.py                     # Alembic environment
│   └── script.py.mako             # Alembic script template
├── .env.example                   # Environment variables template
├── alembic.ini                    # Alembic configuration
├── requirements.txt               # API dependencies
├── run.py                         # Script to run the API
└── README.md                      # API documentation
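
The tree lists `app/core/config.py` and `.env.example` without showing their contents. As a rough sketch of how such configuration might be read from environment variables, the example below uses only the standard library. The setting names `SECRET_KEY`, `ALGORITHM`, `API_V1_STR`, and `SIM_SEARCH_PATH` appear elsewhere in this document; the default values, the `load_settings` function, and the choice of which settings are required are assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    """Illustrative settings container; the real config.py may differ."""

    SECRET_KEY: str
    SIM_SEARCH_PATH: str
    ALGORITHM: str = "HS256"      # assumed default
    API_V1_STR: str = "/api/v1"   # matches the endpoint prefix in this document


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, failing fast on missing required values."""
    missing = [name for name in ("SECRET_KEY", "SIM_SEARCH_PATH") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required settings: {', '.join(missing)}")
    return Settings(
        SECRET_KEY=env["SECRET_KEY"],
        SIM_SEARCH_PATH=env["SIM_SEARCH_PATH"],
        ALGORITHM=env.get("ALGORITHM", "HS256"),
        API_V1_STR=env.get("API_V1_STR", "/api/v1"),
    )


if __name__ == "__main__":
    demo_env = {"SECRET_KEY": "change-me", "SIM_SEARCH_PATH": "/path/to/sim-search"}
    print(load_settings(demo_env))
```

Passing the environment as a mapping keeps the loader easy to test without touching the real process environment.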

API Endpoints

Authentication Endpoints ✅

POST /api/v1/auth/token: Get an authentication token
POST /api/v1/auth/register: Register a new user

Query Processing Endpoints ✅

POST /api/v1/query/process: Process and enhance a user query
POST /api/v1/query/classify: Classify a query by type and intent

Search Execution Endpoints ✅

POST /api/v1/search/execute: Execute a search with optional parameters
GET /api/v1/search/engines: Get available search engines
GET /api/v1/search/history: Get user's search history
GET /api/v1/search/{search_id}: Get results for a specific search
DELETE /api/v1/search/{search_id}: Delete a search from history

Report Generation Endpoints ✅

POST /api/v1/report/generate: Generate a report from search results
GET /api/v1/report/list: Get a list of user's reports
GET /api/v1/report/{report_id}: Get a specific report
DELETE /api/v1/report/{report_id}: Delete a report
GET /api/v1/report/{report_id}/download: Download a report in specified format
GET /api/v1/report/{report_id}/progress: Get the progress of a report generation
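
To make the endpoint list concrete, here is a hedged client-side sketch that builds (but does not send) requests with the standard library. Several details are assumptions rather than facts from this document: the base URL `http://localhost:8000`, the form fields `username` and `password` for the token endpoint (the usual OAuth2 password-flow convention), and the `{"query": ...}` request body for search execution. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:8000"  # hypothetical host and port
API_PREFIX = "/api/v1"


def build_token_request(email: str, password: str) -> urllib.request.Request:
    """Build the login request, assuming an OAuth2 password-flow form body."""
    body = urllib.parse.urlencode({"username": email, "password": password})
    return urllib.request.Request(
        BASE_URL + API_PREFIX + "/auth/token",
        data=body.encode("ascii"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def build_json_request(
    method: str, path: str, token: str, payload: Optional[Dict[str, Any]] = None
) -> urllib.request.Request:
    """Build an authenticated request with an optional JSON body."""
    headers = {"Authorization": f"Bearer {token}"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + API_PREFIX + path, data=data, method=method, headers=headers
    )


if __name__ == "__main__":
    login = build_token_request("user@example.com", "secret")
    # The request body shape below is an assumption for illustration.
    search = build_json_request("POST", "/search/execute", "TOKEN", {"query": "fastapi"})
    history = build_json_request("GET", "/search/history", "TOKEN")
    for req in (login, search, history):
        print(req.get_method(), req.full_url)
```

Sending any of these with `urllib.request.urlopen(req)` requires a running server; the actual request and response schemas are defined in `app/schemas/`.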

Database Models

User Model ✅

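import datetime

from sqlalchemy import JSON, Boolean, Column, DateTime, ForeignKey, Integer, String, Text

# Base is the project's declarative base class; its definition is not shown here.

class User(Base):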
    __tablename__ = "users"
    
    id = Column(String, primary_key=True, index=True)
    email = Column(String, unique=True, index=True, nullable=False)
    hashed_password = Column(String, nullable=False)
    full_name = Column(String, nullable=True)
    is_active = Column(Boolean, default=True)
    is_superuser = Column(Boolean, default=False)
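
The `hashed_password` column implies that plaintext passwords are never stored. The dependency list names passlib and bcrypt for this; the sketch below deliberately swaps in the standard library's PBKDF2 so that it runs without extra packages. It only illustrates the hash-then-verify pattern. The function names, storage format, and iteration count are assumptions, not the project's code.

```python
import hashlib
import hmac
import os
from typing import Optional

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password: str, salt: Optional[bytes] = None) -> str:
    """Return 'salt_hex$digest_hex' using PBKDF2-HMAC-SHA256 (stand-in for bcrypt)."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)


if __name__ == "__main__":
    stored = hash_password("correct horse")
    print(verify_password("correct horse", stored))
    print(verify_password("wrong guess", stored))
```

With passlib, the equivalent calls would go through a `CryptContext` configured for bcrypt; the storage and comparison idea stays the same.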

Search Model ✅

class Search(Base):
    __tablename__ = "searches"
    
    id = Column(String, primary_key=True, index=True)
    user_id = Column(String, ForeignKey("users.id"))
    query = Column(String, nullable=False)
    enhanced_query = Column(String, nullable=True)
    query_type = Column(String, nullable=True)
    engines = Column(String, nullable=True)  # Comma-separated list
    results_count = Column(Integer, default=0)
    results = Column(JSON, nullable=True)
    created_at = Column(DateTime, default=datetime.datetime.utcnow)
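
Because `engines` is stored as a comma-separated string, something has to convert between that column value and a list of engine names. A minimal sketch of such helpers follows; the function names and the engine names in the usage example are hypothetical, and the approach assumes engine names never contain commas.

```python
from typing import List, Optional


def engines_to_column(engines: List[str]) -> Optional[str]:
    """Join engine names into the comma-separated form stored in the column."""
    cleaned = [name.strip() for name in engines if name.strip()]
    # dict.fromkeys removes duplicates while preserving the original order.
    unique = list(dict.fromkeys(cleaned))
    return ",".join(unique) if unique else None


def column_to_engines(value: Optional[str]) -> List[str]:
    """Split the stored string back into a list; NULL or empty means no engines."""
    if not value:
        return []
    return [name.strip() for name in value.split(",") if name.strip()]


if __name__ == "__main__":
    stored = engines_to_column(["engine_a", " engine_b ", "engine_a"])
    print(stored)
    print(column_to_engines(stored))
```

If engine names could ever contain commas, a JSON column (as already used for `results`) would be a safer choice than a delimited string.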

Report Model ✅

class Report(Base):
    __tablename__ = "reports"
    
    id = Column(String, primary_key=True, index=True)
    user_id = Column(String, ForeignKey("users.id"))
    search_id = Column(String, ForeignKey("searches.id"), nullable=True)
    title = Column(String, nullable=False)
    content = Column(Text, nullable=False)
    detail_level = Column(String, nullable=False, default="standard")
    query_type = Column(String, nullable=True)
    model_used = Column(String, nullable=True)
    created_at = Column(DateTime, default=datetime.datetime.utcnow)
    updated_at = Column(DateTime, default=datetime.datetime.utcnow, onupdate=datetime.datetime.utcnow)

Service Layer Integration

Integration Strategy ✅

The service layer acts as a bridge between the API endpoints and the existing sim-search functionality. Each service:

Imports the corresponding sim-search components
Adapts the API request to the format expected by sim-search
Calls the sim-search functionality
Transforms the result to the API response format

Example from the implemented QueryService:

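import sys
from pathlib import Path
from typing import Any, Dict

# Assumption: settings is exposed by app/core/config.py
from app.core.config import settings

# Add sim-search to the Python path
sim_search_path = Path(settings.SIM_SEARCH_PATH)
sys.path.append(str(sim_search_path))

# Import sim-search components (must come after the path is extended)
from query.query_processor import QueryProcessor
from query.llm_interface import LLMInterface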

class QueryService:
    def __init__(self):
        self.query_processor = QueryProcessor()
        self.llm_interface = LLMInterface()
    
    async def process_query(self, query: str) -> Dict[str, Any]:
        # Process the query using the sim-search query processor
        structured_query = await self.query_processor.process_query(query)
        
        # Format the response
        return {
            "original_query": query,
            "structured_query": structured_query
        }
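
The adapt, call, and transform pattern above can be exercised without sim-search on the path by substituting a stub. The sketch below differs from the implemented service in one stated way: the processor is passed into the constructor instead of being created inside it, purely so a stand-in can be injected. `StubQueryProcessor` and the fields it returns are invented for illustration.

```python
import asyncio
from typing import Any, Dict


class StubQueryProcessor:
    """Stand-in for sim-search's QueryProcessor; the returned fields are invented."""

    async def process_query(self, query: str) -> Dict[str, Any]:
        return {"enhanced_query": query.strip().lower(), "type": "factual"}


class QueryService:
    """Same shape as the service above, with the processor injected for testing."""

    def __init__(self, query_processor: Any) -> None:
        self.query_processor = query_processor

    async def process_query(self, query: str) -> Dict[str, Any]:
        # Call the underlying processor, then wrap the result in the API format.
        structured_query = await self.query_processor.process_query(query)
        return {"original_query": query, "structured_query": structured_query}


if __name__ == "__main__":
    service = QueryService(StubQueryProcessor())
    print(asyncio.run(service.process_query("  What is FastAPI?  ")))
```

Injecting the dependency this way also makes it straightforward to unit-test the response formatting separately from the real query processor.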

Authentication System

JWT-Based Authentication ✅

The authentication system uses JSON Web Tokens (JWT) to manage user sessions:

User logs in with email and password
Server validates credentials and generates a JWT token
Token is included in subsequent requests in the Authorization header
Server validates the token for each protected endpoint

Implementation using FastAPI's dependencies:

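from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError, jwt
from pydantic import ValidationError
from sqlalchemy.orm import Session

# settings, models, get_db, and TokenPayload are project-local; their imports are omitted here.

oauth2_scheme = OAuth2PasswordBearer(tokenUrl=f"{settings.API_V1_STR}/auth/token")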

def get_current_user(
    db: Session = Depends(get_db), token: str = Depends(oauth2_scheme)
) -> models.User:
    try:
        payload = jwt.decode(
            token, settings.SECRET_KEY, algorithms=[settings.ALGORITHM]
        )
        token_data = TokenPayload(**payload)
    except (JWTError, ValidationError):
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail="Could not validate credentials",
        )
    user = db.query(models.User).filter(models.User.id == token_data.sub).first()
    if not user:
        raise HTTPException(status_code=404, detail="User not found")
    if not user.is_active:
        raise HTTPException(status_code=400, detail="Inactive user")
    return user
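
To show what issuing and validating a token involves underneath `jwt.decode`, here is a standard-library sketch of HS256 signing and verification. It is a teaching aid, not a replacement for python-jose, and it is not the project's code: the function names are hypothetical, and only the `sub` and `exp` claims are handled.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Dict, Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()


def create_token(subject: str, secret: str, expires_in: int = 3600,
                 now: Optional[float] = None) -> str:
    """Create an HS256 token carrying 'sub' and 'exp' claims."""
    issued = int(time.time() if now is None else now)
    parts = ({"alg": "HS256", "typ": "JWT"}, {"sub": subject, "exp": issued + expires_in})
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8")) for part in parts
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> Dict[str, Any]:
    """Return the payload if the signature and expiry are valid, else raise ValueError."""
    pieces = token.split(".")
    if len(pieces) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = pieces
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking how many signature bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] < (time.time() if now is None else now):
        raise ValueError("token expired")
    return payload


if __name__ == "__main__":
    token = create_token("user-123", "change-me")
    print(verify_token(token, "change-me")["sub"])
```

In the dependency above, a failure at any of these checks corresponds to the `JWTError` branch that produces the "Could not validate credentials" response.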

Implementation Phases

Phase 1: Core Setup ✅

Set up project structure
Implement database models and migrations
Create authentication system
Implement configuration management

Phase 2: Service Layer ✅

Implement query service integration
Implement search service integration
Implement report service integration
Add error handling and logging

Phase 3: API Endpoints ✅

Implement authentication endpoints
Implement query processing endpoints
Implement search execution endpoints
Implement report generation endpoints

Phase 4: Testing and Documentation ✅

Generate API documentation
Create user documentation

Phase 5: Deployment and Integration ⏳

Set up deployment configuration
Configure environment variables
Integrate with React frontend
Perform end-to-end testing

Dependencies

# FastAPI and ASGI server
fastapi==0.103.1
uvicorn==0.23.2

# Database
sqlalchemy==2.0.21
alembic==1.12.0

# Authentication
python-jose==3.3.0
passlib==1.7.4
bcrypt==4.0.1
python-multipart==0.0.6

# Validation and serialization
pydantic==2.4.2
email-validator==2.0.0

# Testing
pytest==7.4.2
httpx==0.25.0

# Utilities
python-dotenv==1.0.0
aiofiles==23.2.1
jinja2==3.1.2

# Report generation
markdown==3.4.4
weasyprint==60.1  # Optional, for PDF generation

Next Steps

Test the FastAPI implementation to ensure it works correctly with the existing sim-search functionality
Create a React frontend to consume the FastAPI backend
Implement user management in the frontend
Add search history and report management to the frontend
Implement real-time progress tracking for report generation in the frontend
Add visualization components for reports in the frontend
Run comprehensive tests to ensure all functionality works with the new API
Update any remaining documentation to reflect the new API
Consider adding more API endpoints for additional functionality

Conclusion

The FastAPI backend for the sim-search project has been successfully implemented according to this plan. The implementation provides a modern, maintainable, and scalable API that preserves all the functionality of the existing system while enabling new features and improvements through the planned React frontend.

The service layer pattern ensures a clean separation between the API and the existing sim-search functionality, making it easier to maintain and extend both components independently. This architecture also allows for future enhancements such as caching, background processing, and additional integrations without requiring major changes to the existing code.

The next phase of the project will focus on creating a React frontend to consume this API, providing a more flexible and powerful user experience.
