Sim-Search API Specification

This document describes how frontend developers can integrate with the Sim-Search API. The API covers three areas: query processing, search execution across multiple engines, and report generation.

API Base URL

/api/v1

Authentication

The API uses OAuth2 Bearer token authentication. Every endpoint except the authentication endpoints (/api/v1/auth/register and /api/v1/auth/token) requires a valid Bearer token in the Authorization header.

Register a New User

POST /api/v1/auth/register

Register a new user account.

Request Body:

{
  "email": "user@example.com",
  "password": "password123",
  "full_name": "User Name",
  "is_active": true,
  "is_superuser": false
}

Response (200 OK):

{
  "id": "user-uuid",
  "email": "user@example.com",
  "full_name": "User Name",
  "is_active": true,
  "is_superuser": false
}

Login to Get Access Token

POST /api/v1/auth/token

Obtain an access token for API authentication.

Request Body (form data):

username=user@example.com
password=password123

Response (200 OK):

{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...",
  "token_type": "bearer"
}
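
The token endpoint takes form data rather than JSON, which is easy to get wrong in a client. The sketch below builds the login request and the Authorization header used by later calls. The names `PreparedRequest`, `buildTokenRequest`, and `authHeader` are illustrative and not part of the API, and the Content-Type value is an assumption based on the "form data" note above.

```typescript
// Plain description of an HTTP request; pass its fields to fetch() or any HTTP client.
interface PreparedRequest {
  url: string;
  method: "GET" | "POST" | "DELETE";
  headers: Record<string, string>;
  body?: string;
}

const API_BASE = "/api/v1";

// Build the login request. Credentials go in a form-encoded body, not JSON.
function buildTokenRequest(username: string, password: string): PreparedRequest {
  const form = [
    `username=${encodeURIComponent(username)}`,
    `password=${encodeURIComponent(password)}`,
  ].join("&");
  return {
    url: `${API_BASE}/auth/token`,
    method: "POST",
    // Assumed content type for "form data"; not stated in this specification.
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: form,
  };
}

// Build the Authorization header required by every non-authentication endpoint.
function authHeader(accessToken: string): Record<string, string> {
  return { Authorization: `Bearer ${accessToken}` };
}
```

Keeping request construction separate from the network call makes the encoding easy to unit test without a running server.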

Query Processing

Process a Query

POST /api/v1/query/process

Process a search query to enhance and structure it for better search results.

Headers:

  • Authorization: Bearer {access_token}

Request Body:

{
  "query": "What are the latest advancements in quantum computing?"
}

Response (200 OK):

{
  "original_query": "What are the latest advancements in quantum computing?",
  "structured_query": {
    "original_query": "What are the latest advancements in quantum computing?",
    "enhanced_query": "What are the recent breakthroughs and developments in quantum computing technology, algorithms, and applications in the past 2 years?",
    "type": "exploratory",
    "intent": "research",
    "domain": "academic",
    "confidence": 0.95,
    "reasoning": "This query is asking about recent developments in a scientific field, which is typical of academic research.",
    "entities": ["quantum computing", "advancements"],
    "sub_questions": [
      {
        "sub_question": "What are the latest hardware advancements in quantum computing?",
        "aspect": "hardware",
        "priority": 0.9
      },
      {
        "sub_question": "What are the recent algorithmic breakthroughs in quantum computing?",
        "aspect": "algorithms",
        "priority": 0.8
      }
    ],
    "search_queries": {
      "google": "latest advancements in quantum computing 2024",
      "scholar": "recent quantum computing breakthroughs",
      "arxiv": "quantum computing hardware algorithms"
    },
    "is_academic": true,
    "is_code": false,
    "is_current_events": false
  }
}
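
A client will usually want to show the enhanced query and the sub-questions in priority order. The following is a minimal sketch of that step; the interfaces cover only the fields it uses, and `rankSubQuestions` and `displayQuery` are hypothetical helper names. The optional fields reflect the fact that the classify endpoint's example response omits them.

```typescript
// Only the fields this sketch needs from the structured_query object.
interface SubQuestion {
  sub_question: string;
  aspect: string;
  priority: number;
}

interface StructuredQuery {
  original_query: string;
  enhanced_query?: string;
  sub_questions?: SubQuestion[];
}

// Return the sub-questions ordered from highest to lowest priority,
// without mutating the response object.
function rankSubQuestions(query: StructuredQuery): SubQuestion[] {
  return (query.sub_questions ?? []).slice().sort((a, b) => b.priority - a.priority);
}

// Prefer the enhanced query for display and fall back to the original.
function displayQuery(query: StructuredQuery): string {
  return query.enhanced_query ?? query.original_query;
}
```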

Classify a Query

POST /api/v1/query/classify

Classify a query by type and intent.

Headers:

  • Authorization: Bearer {access_token}

Request Body:

{
  "query": "What are the latest advancements in quantum computing?"
}

Response (200 OK):

{
  "original_query": "What are the latest advancements in quantum computing?",
  "structured_query": {
    "original_query": "What are the latest advancements in quantum computing?",
    "type": "exploratory",
    "domain": "academic",
    "confidence": 0.95
  }
}

Search Execution

Get Available Search Engines

GET /api/v1/search/engines

Get a list of available search engines.

Headers:

  • Authorization: Bearer {access_token}

Response (200 OK):

["google", "arxiv", "scholar", "news", "openalex", "core", "github", "stackexchange"]

Execute Search

POST /api/v1/search/execute

Execute a search with the given parameters.

Headers:

  • Authorization: Bearer {access_token}

Request Body:

{
  "structured_query": {
    "original_query": "What are the environmental impacts of electric vehicles?",
    "enhanced_query": "What are the environmental impacts of electric vehicles?",
    "type": "factual",
    "domain": "environmental"
  },
  "search_engines": ["google", "arxiv"],
  "num_results": 5,
  "timeout": 30
}

Response (200 OK):

{
  "search_id": "search-uuid",
  "query": "What are the environmental impacts of electric vehicles?",
  "enhanced_query": "What are the environmental impacts of electric vehicles?",
  "results": {
    "google": [
      {
        "title": "Environmental Impacts of Electric Vehicles",
        "url": "https://example.com/article1",
        "snippet": "Electric vehicles have several environmental impacts including...",
        "source": "google",
        "score": 0.95
      }
    ],
    "arxiv": [
      {
        "title": "Lifecycle Analysis of Electric Vehicle Environmental Impact",
        "url": "http://arxiv.org/abs/paper123",
        "pdf_url": "http://arxiv.org/pdf/paper123",
        "snippet": "This paper analyzes the complete lifecycle environmental impact of electric vehicles...",
        "source": "arxiv",
        "authors": ["Researcher Name1", "Researcher Name2"],
        "arxiv_id": "paper123",
        "categories": ["cs.CY", "eess.SY"],
        "published_date": "2023-01-15T10:30:00Z",
        "score": 0.92
      }
    ]
  },
  "total_results": 2,
  "execution_time": 1.25,
  "timestamp": "2024-03-20T14:25:30Z"
}
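
The request body for this endpoint can be assembled from a processed query and the user's engine selection. The sketch below filters the selection against the list returned by /api/v1/search/engines before building the body. `buildSearchBody` is a hypothetical helper, and the defaults of 5 results and a timeout of 30 are copied from the example request, not documented server defaults.

```typescript
interface SearchQueryInput {
  original_query: string;
  enhanced_query: string;
  type: string;
  domain: string;
}

interface SearchExecuteBody {
  structured_query: SearchQueryInput;
  search_engines: string[];
  num_results: number;
  timeout: number;
}

// Build the body for POST /api/v1/search/execute, keeping only engines
// that the server reported as available.
function buildSearchBody(
  structuredQuery: SearchQueryInput,
  selectedEngines: string[],
  availableEngines: string[],
  numResults: number = 5, // value taken from the example request
  timeout: number = 30 // value taken from the example request
): SearchExecuteBody {
  const engines = selectedEngines.filter((e) => availableEngines.indexOf(e) !== -1);
  if (engines.length === 0) {
    throw new Error("No valid search engine selected");
  }
  return {
    structured_query: structuredQuery,
    search_engines: engines,
    num_results: numResults,
    timeout: timeout,
  };
}
```

Serialize the result with `JSON.stringify` and send it with the Authorization header.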

Get Search History

GET /api/v1/search/history

Get the user's search history.

Headers:

  • Authorization: Bearer {access_token}

Query Parameters:

  • skip (optional, default: 0): Number of records to skip
  • limit (optional, default: 100): Maximum number of records to return

Response (200 OK):

{
  "searches": [
    {
      "id": "search-uuid",
      "query": "What are the environmental impacts of electric vehicles?",
      "enhanced_query": "What are the environmental impacts of electric vehicles?",
      "query_type": "factual",
      "engines": "google,arxiv",
      "results_count": 10,
      "created_at": "2024-03-20T14:25:30Z"
    }
  ],
  "total": 1
}
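
The skip and limit parameters map naturally onto page-based navigation. The sketch below shows one way to do that, plus a parser for the comma-separated `engines` field seen in the example response. All three function names are hypothetical, and `hasNextPage` assumes that `total` counts all of the user's searches rather than only those on the current page, which this specification does not state.

```typescript
// Build the history URL for a zero-based page number.
function historyUrl(page: number, pageSize: number = 100): string {
  const skip = page * pageSize;
  return `/api/v1/search/history?skip=${skip}&limit=${pageSize}`;
}

// The "engines" field is a comma-separated string in the example response.
function parseEngines(engines: string): string[] {
  return engines
    .split(",")
    .map((e) => e.trim())
    .filter((e) => e.length > 0);
}

// Whether another page exists. Assumes "total" is the count across all pages.
function hasNextPage(page: number, pageSize: number, total: number): boolean {
  return (page + 1) * pageSize < total;
}
```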

Get Search Results

GET /api/v1/search/{search_id}

Get results for a specific search.

Headers:

  • Authorization: Bearer {access_token}

Path Parameters:

  • search_id: ID of the search

Response (200 OK):

{
  "search_id": "search-uuid",
  "query": "What are the environmental impacts of electric vehicles?",
  "enhanced_query": "What are the environmental impacts of electric vehicles?",
  "results": {
    "google": [
      {
        "title": "Environmental Impacts of Electric Vehicles",
        "url": "https://example.com/article1",
        "snippet": "Electric vehicles have several environmental impacts including...",
        "source": "google",
        "score": 0.95
      }
    ],
    "arxiv": [
      {
        "title": "Lifecycle Analysis of Electric Vehicle Environmental Impact",
        "url": "http://arxiv.org/abs/paper123",
        "pdf_url": "http://arxiv.org/pdf/paper123",
        "snippet": "This paper analyzes the complete lifecycle environmental impact of electric vehicles...",
        "source": "arxiv",
        "authors": ["Researcher Name1", "Researcher Name2"],
        "arxiv_id": "paper123",
        "categories": ["cs.CY", "eess.SY"],
        "published_date": "2023-01-15T10:30:00Z",
        "score": 0.92
      }
    ]
  },
  "total_results": 2,
  "execution_time": 0.0
}
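
Results arrive grouped by engine. Where a single ranked list is wanted instead, the groups can be merged and sorted by `score`, as in the sketch below. `flattenResults` is a hypothetical helper, the interface lists only the fields common to both example results, and whether scores from different engines are comparable is not stated in this specification.

```typescript
// Fields shared by the google and arxiv example results; engine-specific
// fields such as "authors" or "pdf_url" are left out of this sketch.
interface SearchResultItem {
  title: string;
  url: string;
  snippet: string;
  source: string;
  score: number;
}

type ResultsByEngine = Record<string, SearchResultItem[]>;

// Merge the per-engine lists into one list, highest score first.
function flattenResults(results: ResultsByEngine): SearchResultItem[] {
  const merged: SearchResultItem[] = [];
  for (const engine of Object.keys(results)) {
    for (const item of results[engine] ?? []) {
      merged.push(item);
    }
  }
  return merged.sort((a, b) => b.score - a.score);
}
```
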

Delete Search

DELETE /api/v1/search/{search_id}

Delete a search from history.

Headers:

  • Authorization: Bearer {access_token}

Path Parameters:

  • search_id: ID of the search to delete

Response (204 No Content)

Report Generation

Generate a Report

POST /api/v1/report/generate

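Start generating a report from search results. As the example response shows, the call can return before generation finishes, with placeholder text in the content field; use the progress endpoint to track completion.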

Headers:

  • Authorization: Bearer {access_token}

Request Body:

{
  "search_id": "search-uuid",
  "query": "What are the environmental impacts of electric vehicles?",
  "detail_level": "standard",
  "query_type": "comparative",
  "model": "llama-3.1-8b-instant",
  "title": "Environmental Impacts of Electric Vehicles"
}

Response (200 OK):

{
  "id": "report-uuid",
  "user_id": "user-uuid",
  "search_id": "search-uuid",
  "title": "Environmental Impacts of Electric Vehicles",
  "content": "Report generation in progress...",
  "detail_level": "standard",
  "query_type": "comparative",
  "model_used": "llama-3.1-8b-instant",
  "created_at": "2024-03-20T14:30:00Z",
  "updated_at": "2024-03-20T14:30:00Z"
}

Get Report Generation Progress

GET /api/v1/report/{report_id}/progress

Get the progress of a report that is being generated.

Headers:

  • Authorization: Bearer {access_token}

Path Parameters:

  • report_id: ID of the report

Response (200 OK):

{
  "report_id": "report-uuid",
  "progress": 0.75,
  "status": "Processing chunk 3/4...",
  "current_chunk": 3,
  "total_chunks": 4,
  "current_report": "The environmental impacts of electric vehicles include..."
}
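
A progress bar needs a percentage and a stop condition for polling. The sketch below derives both from the response. It assumes that `progress` is a fraction between 0 and 1 (consistent with 0.75 at chunk 3 of 4 in the example) and that a value of 1 means the report is finished; the specification documents no explicit completion flag. The helper names are hypothetical.

```typescript
interface ReportProgress {
  report_id: string;
  progress: number;
  status: string;
  current_chunk: number;
  total_chunks: number;
  current_report: string;
}

// Convert the fractional progress into a whole percentage clamped to 0..100.
function progressPercent(p: ReportProgress): number {
  return Math.min(100, Math.max(0, Math.round(p.progress * 100)));
}

// Assumption: progress reaching 1 signals that generation has finished.
function isReportDone(p: ReportProgress): boolean {
  return p.progress >= 1;
}

// Short label for a progress indicator, e.g. "75% (chunk 3 of 4)".
function progressLabel(p: ReportProgress): string {
  return `${progressPercent(p)}% (chunk ${p.current_chunk} of ${p.total_chunks})`;
}
```

A client would call the progress endpoint on a timer, update the indicator from these helpers, and stop polling once `isReportDone` returns true.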

Get Report List

GET /api/v1/report/list

Get a list of the user's reports.

Headers:

  • Authorization: Bearer {access_token}

Query Parameters:

  • skip (optional, default: 0): Number of records to skip
  • limit (optional, default: 100): Maximum number of records to return

Response (200 OK):

{
  "reports": [
    {
      "id": "report-uuid",
      "user_id": "user-uuid",
      "search_id": "search-uuid",
      "title": "Environmental Impacts of Electric Vehicles",
      "content": "# Environmental Impacts of Electric Vehicles\n\n## Introduction\n\nElectric vehicles (EVs) have gained popularity...",
      "detail_level": "standard",
      "query_type": "comparative",
      "model_used": "llama-3.1-8b-instant",
      "created_at": "2024-03-20T14:30:00Z",
      "updated_at": "2024-03-20T14:35:00Z"
    }
  ],
  "total": 1
}

Get Report

GET /api/v1/report/{report_id}

Get a specific report.

Headers:

  • Authorization: Bearer {access_token}

Path Parameters:

  • report_id: ID of the report

Response (200 OK):

{
  "id": "report-uuid",
  "user_id": "user-uuid",
  "search_id": "search-uuid",
  "title": "Environmental Impacts of Electric Vehicles",
  "content": "# Environmental Impacts of Electric Vehicles\n\n## Introduction\n\nElectric vehicles (EVs) have gained popularity...",
  "detail_level": "standard",
  "query_type": "comparative",
  "model_used": "llama-3.1-8b-instant",
  "created_at": "2024-03-20T14:30:00Z",
  "updated_at": "2024-03-20T14:35:00Z"
}

Download Report

GET /api/v1/report/{report_id}/download

Download a report in the specified format.

Headers:

  • Authorization: Bearer {access_token}

Path Parameters:

  • report_id: ID of the report

Query Parameters:

  • format (optional, default: "markdown"): Format of the report (markdown, html, pdf)

Response (200 OK):

  • Content-Type: application/octet-stream
  • Content-Disposition: attachment; filename="report_{report_id}.{format}"
  • Binary file content
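
Validating the format on the client avoids sending a request the server would reject. The sketch below restricts the format to the three documented values and builds the download URL. The helper names are hypothetical, and `downloadFilename` simply mirrors the documented Content-Disposition pattern; the header actually returned by the server remains authoritative.

```typescript
type ReportFormat = "markdown" | "html" | "pdf";

const REPORT_FORMATS: ReportFormat[] = ["markdown", "html", "pdf"];

// Type guard for user-supplied format strings.
function isReportFormat(value: string): value is ReportFormat {
  return (REPORT_FORMATS as string[]).indexOf(value) !== -1;
}

// Build the download URL; "markdown" is the documented default format.
function downloadUrl(reportId: string, format: ReportFormat = "markdown"): string {
  return `/api/v1/report/${encodeURIComponent(reportId)}/download?format=${format}`;
}

// Fallback filename following the documented pattern report_{report_id}.{format}.
function downloadFilename(reportId: string, format: ReportFormat = "markdown"): string {
  return `report_${reportId}.${format}`;
}
```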

Delete Report

DELETE /api/v1/report/{report_id}

Delete a report.

Headers:

  • Authorization: Bearer {access_token}

Path Parameters:

  • report_id: ID of the report to delete

Response (204 No Content)

Error Handling

The API returns standard HTTP status codes to indicate the success or failure of a request.

Common Error Codes

  • 400 Bad Request: The request was invalid or cannot be served
  • 401 Unauthorized: Authentication is required or has failed
  • 403 Forbidden: The authenticated user doesn't have the necessary permissions
  • 404 Not Found: The requested resource was not found
  • 422 Unprocessable Entity: The request data failed validation
  • 500 Internal Server Error: An error occurred on the server

Error Response Format

{
  "detail": "Error message explaining what went wrong"
}
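
Since errors share one response shape, a single helper can turn any failed response into a displayable message. The following is a sketch under the assumption that the body has already been parsed as JSON (or is null when parsing failed); `errorMessage` is a hypothetical name. It also tolerates a non-string `detail`, which this specification does not describe but which costs little to handle.

```typescript
// Extract a user-facing message from a parsed error response body.
function errorMessage(statusCode: number, payload: unknown): string {
  if (typeof payload === "object" && payload !== null && "detail" in payload) {
    const detail = (payload as { detail: unknown }).detail;
    if (typeof detail === "string") {
      return detail;
    }
    // Non-string detail (not documented here): show it as JSON text.
    return JSON.stringify(detail);
  }
  // No usable body: fall back to a generic message with the status code.
  return `Request failed with status ${statusCode}`;
}
```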

Best Practices for Frontend Integration

  1. Authentication Flow:

    • Implement a login form that sends credentials to /api/v1/auth/token
    • Store the received token securely (HTTP-only cookies or secure storage)
    • Include the token in the Authorization header for all subsequent requests
    • Handle token expiration, for example by prompting the user to log in again after a 401 response; this specification does not define a token refresh endpoint
  2. Query Processing Workflow:

    • Allow users to enter natural language queries
    • Use the /api/v1/query/process endpoint to enhance the query
    • Display the enhanced query to the user for confirmation
  3. Search Execution:

    • Use the processed query for search execution
    • Allow users to select which search engines to use
    • Implement a loading state while waiting for search results
    • Display search results grouped by search engine
  4. Report Generation:

    • Allow users to generate reports from search results
    • Provide options for detail level (detail_level) and query type (query_type)
    • Implement progress tracking using the progress endpoint
    • Allow users to download reports in different formats
  5. Error Handling:

    • Implement proper error handling for API responses
    • Display meaningful error messages to users
    • Implement retry mechanisms for transient errors
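
The retry advice above can be sketched as two small helpers: one deciding whether a status code is worth retrying, and one computing an exponential backoff delay. Both names are hypothetical. Treating every 5xx status as transient is an assumption (the specification lists only 500), and the base and maximum delays are arbitrary starting points.

```typescript
// Assumption: server-side (5xx) errors are transient; 4xx errors are not,
// because repeating an invalid or unauthorized request will fail again.
function isRetryable(statusCode: number): boolean {
  return statusCode >= 500 && statusCode <= 599;
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
// "attempt" is zero-based.
function backoffDelayMs(attempt: number, baseMs: number = 500, maxMs: number = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}
```

A caller would wait `backoffDelayMs(attempt)` milliseconds before each retry and give up after a fixed number of attempts.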

Available Search Engines

  • google: General web search
  • arxiv: Academic papers from arXiv
  • scholar: Academic papers from various sources
  • news: News articles
  • openalex: Open access academic content
  • core: Open access research papers
  • github: Code repositories
  • stackexchange: Q&A from Stack Exchange network

Report Detail Levels

  • brief: Short summary (default model: llama-3.1-8b-instant)
  • standard: Comprehensive overview (default model: llama-3.1-8b-instant)
  • detailed: In-depth analysis (default model: llama-3.3-70b-versatile)
  • comprehensive: Extensive research report (default model: llama-3.3-70b-versatile)
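
A client can mirror this mapping to show users which model a report will use before they submit the request. The sketch below encodes the defaults listed above; `DEFAULT_MODEL` and `modelFor` are hypothetical names, and the table would need updating if the server configuration changes.

```typescript
type DetailLevel = "brief" | "standard" | "detailed" | "comprehensive";

// Default model per detail level, as listed in this specification.
const DEFAULT_MODEL: Record<DetailLevel, string> = {
  brief: "llama-3.1-8b-instant",
  standard: "llama-3.1-8b-instant",
  detailed: "llama-3.3-70b-versatile",
  comprehensive: "llama-3.3-70b-versatile",
};

// Use an explicit model choice when given, otherwise the level's default.
function modelFor(level: DetailLevel, override?: string): string {
  return override ?? DEFAULT_MODEL[level];
}
```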

Query Types

  • factual: Seeking facts or information
  • comparative: Comparing multiple items or concepts
  • exploratory: Open-ended exploration of a topic
  • procedural: How to do something
  • causal: Seeking cause-effect relationships

Models

  • llama-3.1-8b-instant: Fast, lightweight model
  • llama-3.3-70b-versatile: High-quality, comprehensive model
  • Other models may be available based on server configuration