# Sim-Search API Specification
This document describes how frontend developers can integrate with the Sim-Search API. The API provides research capabilities: query processing, search execution across multiple engines, and report generation.
## API Base URL
```
/api/v1
```
## Authentication
The API uses OAuth2 with Bearer token authentication. All endpoints except the authentication endpoints (`/auth/register` and `/auth/token`) require a valid Bearer token in the `Authorization` header.
### Register a New User
```
POST /api/v1/auth/register
```
Register a new user account.
**Request Body**:
```json
{
  "email": "user@example.com",
  "password": "password123",
  "full_name": "User Name",
  "is_active": true,
  "is_superuser": false
}
```
**Response** (200 OK):
```json
{
  "id": "user-uuid",
  "email": "user@example.com",
  "full_name": "User Name",
  "is_active": true,
  "is_superuser": false
}
```
### Login to Get Access Token
```
POST /api/v1/auth/token
```
Obtain an access token for API authentication.
**Request Body (form data)**:
```
username=user@example.com
password=password123
```
**Response** (200 OK):
```json
{
  "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9...",
  "token_type": "bearer"
}
```
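A minimal sketch of how a frontend might build the login request and the header for later calls. The helper names `buildLoginRequest` and `authHeader` are hypothetical, and the `application/x-www-form-urlencoded` content type is an assumption based on the form-data body shown above.

```typescript
// Hypothetical shape for a request that can be handed to fetch() or a similar client.
interface LoginRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the form-encoded login request. The API expects the email
// address in the "username" field.
function buildLoginRequest(email: string, password: string): LoginRequest {
  const body =
    "username=" + encodeURIComponent(email) +
    "&password=" + encodeURIComponent(password);
  return {
    url: "/api/v1/auth/token",
    method: "POST",
    // Assumption: the server accepts URL-encoded form data.
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body,
  };
}

// Turns the token response into the header required by authenticated endpoints.
function authHeader(token: { access_token: string; token_type: string }): Record<string, string> {
  return { Authorization: `Bearer ${token.access_token}` };
}
```

The returned object can be passed to `fetch(request.url, request)`; the result of `authHeader` is then merged into the headers of every subsequent call.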
## Query Processing
### Process a Query
```
POST /api/v1/query/process
```
Process a search query to enhance and structure it for better search results.
**Headers**:
- Authorization: Bearer {access_token}
**Request Body**:
```json
{
  "query": "What are the latest advancements in quantum computing?"
}
```
**Response** (200 OK):
```json
{
  "original_query": "What are the latest advancements in quantum computing?",
  "structured_query": {
    "original_query": "What are the latest advancements in quantum computing?",
    "enhanced_query": "What are the recent breakthroughs and developments in quantum computing technology, algorithms, and applications in the past 2 years?",
    "type": "exploratory",
    "intent": "research",
    "domain": "academic",
    "confidence": 0.95,
    "reasoning": "This query is asking about recent developments in a scientific field, which is typical of academic research.",
    "entities": ["quantum computing", "advancements"],
    "sub_questions": [
      {
        "sub_question": "What are the latest hardware advancements in quantum computing?",
        "aspect": "hardware",
        "priority": 0.9
      },
      {
        "sub_question": "What are the recent algorithmic breakthroughs in quantum computing?",
        "aspect": "algorithms",
        "priority": 0.8
      }
    ],
    "search_queries": {
      "google": "latest advancements in quantum computing 2024",
      "scholar": "recent quantum computing breakthroughs",
      "arxiv": "quantum computing hardware algorithms"
    },
    "is_academic": true,
    "is_code": false,
    "is_current_events": false
  }
}
```
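As a sketch of how a client might consume this response, the following types cover a subset of the `structured_query` fields shown above. The helper names are hypothetical, and two behaviours are assumptions rather than documented API semantics: that a higher `priority` means a more important sub-question, and that a client falls back to `enhanced_query` when no engine-specific query exists.

```typescript
interface SubQuestion {
  sub_question: string;
  aspect: string;
  priority: number;
}

// Subset of the structured_query object; most fields are treated as optional
// because /query/classify returns fewer of them than /query/process.
interface StructuredQuery {
  original_query: string;
  enhanced_query?: string;
  type?: string;
  sub_questions?: SubQuestion[];
  search_queries?: Record<string, string>;
}

// Returns the sub-questions ordered from highest to lowest priority
// without mutating the response object.
function rankSubQuestions(query: StructuredQuery): SubQuestion[] {
  return (query.sub_questions ?? []).slice().sort((a, b) => b.priority - a.priority);
}

// Picks the engine-specific query if present, otherwise falls back to the
// enhanced query and finally to the original query (assumed fallback order).
function queryForEngine(query: StructuredQuery, engine: string): string {
  return query.search_queries?.[engine] ?? query.enhanced_query ?? query.original_query;
}
```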
### Classify a Query
```
POST /api/v1/query/classify
```
Classify a query by type and intent.
**Headers**:
- Authorization: Bearer {access_token}
**Request Body**:
```json
{
  "query": "What are the latest advancements in quantum computing?"
}
```
**Response** (200 OK):
```json
{
  "original_query": "What are the latest advancements in quantum computing?",
  "structured_query": {
    "original_query": "What are the latest advancements in quantum computing?",
    "type": "exploratory",
    "domain": "academic",
    "confidence": 0.95
  }
}
```
## Search Execution
### Get Available Search Engines
```
GET /api/v1/search/engines
```
Get a list of available search engines.
**Headers**:
- Authorization: Bearer {access_token}
**Response** (200 OK):
```json
["google", "arxiv", "scholar", "news", "openalex", "core", "github", "stackexchange"]
```
### Execute a Search
```
POST /api/v1/search/execute
```
Execute a search with the given parameters.
**Headers**:
- Authorization: Bearer {access_token}
**Request Body**:
```json
{
  "structured_query": {
    "original_query": "What are the environmental impacts of electric vehicles?",
    "enhanced_query": "What are the environmental impacts of electric vehicles?",
    "type": "factual",
    "domain": "environmental"
  },
  "search_engines": ["google", "arxiv"],
  "num_results": 5,
  "timeout": 30
}
```
**Response** (200 OK):
```json
{
  "search_id": "search-uuid",
  "query": "What are the environmental impacts of electric vehicles?",
  "enhanced_query": "What are the environmental impacts of electric vehicles?",
  "results": {
    "google": [
      {
        "title": "Environmental Impacts of Electric Vehicles",
        "url": "https://example.com/article1",
        "snippet": "Electric vehicles have several environmental impacts including...",
        "source": "google",
        "score": 0.95
      }
    ],
    "arxiv": [
      {
        "title": "Lifecycle Analysis of Electric Vehicle Environmental Impact",
        "url": "http://arxiv.org/abs/paper123",
        "pdf_url": "http://arxiv.org/pdf/paper123",
        "snippet": "This paper analyzes the complete lifecycle environmental impact of electric vehicles...",
        "source": "arxiv",
        "authors": ["Researcher Name1", "Researcher Name2"],
        "arxiv_id": "paper123",
        "categories": ["cs.CY", "eess.SY"],
        "published_date": "2023-01-15T10:30:00Z",
        "score": 0.92
      }
    ]
  },
  "total_results": 2,
  "execution_time": 1.25,
  "timestamp": "2024-03-20T14:25:30Z"
}
```
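Because `results` is keyed by engine, a client that wants a single ranked list has to flatten it. The sketch below does that with a hypothetical `mergeResults` helper. It assumes that `score` values are comparable across engines, which this specification does not state.

```typescript
// Fields common to every result in the example above; engine-specific
// fields (such as pdf_url or authors for arXiv) are modelled as optional.
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
  source: string;
  score: number;
  pdf_url?: string;
  authors?: string[];
}

type ResultsByEngine = Record<string, SearchResult[]>;

// Flattens the per-engine result lists into one list sorted by descending score.
// Assumption: scores from different engines are on the same scale.
function mergeResults(results: ResultsByEngine): SearchResult[] {
  let merged: SearchResult[] = [];
  for (const engine of Object.keys(results)) {
    merged = merged.concat(results[engine] ?? []);
  }
  return merged.sort((a, b) => b.score - a.score);
}
```

Grouped display, as in the response itself, needs no transformation; the merged list is only useful for a combined "top results" view.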
### Get Search History
```
GET /api/v1/search/history
```
Get the user's search history.
**Headers**:
- Authorization: Bearer {access_token}
**Query Parameters**:
- skip (optional, default: 0): Number of records to skip
- limit (optional, default: 100): Maximum number of records to return
**Response** (200 OK):
```json
{
  "searches": [
    {
      "id": "search-uuid",
      "query": "What are the environmental impacts of electric vehicles?",
      "enhanced_query": "What are the environmental impacts of electric vehicles?",
      "query_type": "factual",
      "engines": "google,arxiv",
      "results_count": 10,
      "created_at": "2024-03-20T14:25:30Z"
    }
  ],
  "total": 1
}
```
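The `skip` and `limit` parameters support simple offset pagination. The following sketch uses hypothetical helpers `historyUrl` and `nextSkip`, and it assumes that `total` counts all of the user's searches rather than only those on the current page; the single-record example above does not settle this.

```typescript
// Builds the history URL with validated pagination parameters.
// Defaults mirror the documented defaults (skip=0, limit=100).
function historyUrl(skip: number = 0, limit: number = 100): string {
  if (!Number.isInteger(skip) || skip < 0) {
    throw new RangeError("skip must be a non-negative integer");
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  return `/api/v1/search/history?skip=${skip}&limit=${limit}`;
}

// Returns the skip value for the next page, or null when there is nothing left.
// Assumption: `total` is the overall record count, not the page size.
function nextSkip(skip: number, pageLength: number, total: number): number | null {
  const next = skip + pageLength;
  return pageLength > 0 && next < total ? next : null;
}
```

The same two helpers apply to `/api/v1/report/list`, which documents identical `skip` and `limit` parameters.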
### Get Search Results
```
GET /api/v1/search/{search_id}
```
Get results for a specific search.
**Headers**:
- Authorization: Bearer {access_token}
**Path Parameters**:
- search_id: ID of the search
**Response** (200 OK):
```json
{
  "search_id": "search-uuid",
  "query": "What are the environmental impacts of electric vehicles?",
  "enhanced_query": "What are the environmental impacts of electric vehicles?",
  "results": {
    "google": [
      {
        "title": "Environmental Impacts of Electric Vehicles",
        "url": "https://example.com/article1",
        "snippet": "Electric vehicles have several environmental impacts including...",
        "source": "google",
        "score": 0.95
      }
    ],
    "arxiv": [
      {
        "title": "Lifecycle Analysis of Electric Vehicle Environmental Impact",
        "url": "http://arxiv.org/abs/paper123",
        "pdf_url": "http://arxiv.org/pdf/paper123",
        "snippet": "This paper analyzes the complete lifecycle environmental impact of electric vehicles...",
        "source": "arxiv",
        "authors": ["Researcher Name1", "Researcher Name2"],
        "arxiv_id": "paper123",
        "categories": ["cs.CY", "eess.SY"],
        "published_date": "2023-01-15T10:30:00Z",
        "score": 0.92
      }
    ]
  },
  "total_results": 2,
  "execution_time": 0.0
}
```
### Delete Search
```
DELETE /api/v1/search/{search_id}
```
Delete a search from history.
**Headers**:
- Authorization: Bearer {access_token}
**Path Parameters**:
- search_id: ID of the search to delete
**Response** (204 No Content)
## Report Generation
### Generate a Report
```
POST /api/v1/report/generate
```
Generate a report from search results.
**Headers**:
- Authorization: Bearer {access_token}
**Request Body**:
```json
{
  "search_id": "search-uuid",
  "query": "What are the environmental impacts of electric vehicles?",
  "detail_level": "standard",
  "query_type": "comparative",
  "model": "llama-3.1-8b-instant",
  "title": "Environmental Impacts of Electric Vehicles"
}
```
**Response** (200 OK):
```json
{
  "id": "report-uuid",
  "user_id": "user-uuid",
  "search_id": "search-uuid",
  "title": "Environmental Impacts of Electric Vehicles",
  "content": "Report generation in progress...",
  "detail_level": "standard",
  "query_type": "comparative",
  "model_used": "llama-3.1-8b-instant",
  "created_at": "2024-03-20T14:30:00Z",
  "updated_at": "2024-03-20T14:30:00Z"
}
```
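The report request reuses the `search_id` and `query` from a search response. A sketch of that hand-off is shown below. `buildReportRequest` is a hypothetical helper, and treating `query_type`, `model`, and `title` as optional is an assumption: the specification only shows a request with every field present.

```typescript
type DetailLevel = "brief" | "standard" | "detailed" | "comprehensive";

interface ReportRequest {
  search_id: string;
  query: string;
  detail_level: DetailLevel;
  // Assumed optional; the specification does not say which fields are required.
  query_type?: string;
  model?: string;
  title?: string;
}

// Builds the body for POST /api/v1/report/generate from a search response.
function buildReportRequest(
  search: { search_id: string; query: string },
  detailLevel: DetailLevel = "standard",
  overrides: { query_type?: string; model?: string; title?: string } = {},
): ReportRequest {
  const request: ReportRequest = {
    search_id: search.search_id,
    query: search.query,
    detail_level: detailLevel,
  };
  // Only include optional fields the caller actually set, so that the
  // server can apply its own defaults for the rest.
  if (overrides.query_type !== undefined) request.query_type = overrides.query_type;
  if (overrides.model !== undefined) request.model = overrides.model;
  if (overrides.title !== undefined) request.title = overrides.title;
  return request;
}
```

The result is serialized with `JSON.stringify` and sent with a `Content-Type: application/json` header alongside the Bearer token.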
### Get Report Generation Progress
```
GET /api/v1/report/{report_id}/progress
```
Get the progress of a report generation.
**Headers**:
- Authorization: Bearer {access_token}
**Path Parameters**:
- report_id: ID of the report
**Response** (200 OK):
```json
{
  "report_id": "report-uuid",
  "progress": 0.75,
  "status": "Processing chunk 3/4...",
  "current_chunk": 3,
  "total_chunks": 4,
  "current_report": "The environmental impacts of electric vehicles include..."
}
```
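A frontend would typically poll this endpoint until generation finishes. The sketch below shows one way to structure that loop. All function names are hypothetical, the network call and the delay are passed in as parameters so the loop stays testable, and treating `progress >= 1` as "finished" is an assumption: the specification does not define a completion signal.

```typescript
interface ReportProgress {
  report_id: string;
  progress: number;
  status: string;
  current_chunk: number;
  total_chunks: number;
  current_report: string;
}

// Assumption: a progress value of 1.0 means the report is complete.
function isComplete(p: ReportProgress): boolean {
  return p.progress >= 1;
}

// Formats a short label for a progress bar, e.g. "75% (chunk 3/4)".
function progressLabel(p: ReportProgress): string {
  return `${Math.round(p.progress * 100)}% (chunk ${p.current_chunk}/${p.total_chunks})`;
}

// Polls until the report is complete or maxAttempts is reached.
// fetchProgress wraps GET /api/v1/report/{report_id}/progress;
// sleep resolves after the given number of milliseconds.
async function pollUntilDone(
  fetchProgress: () => Promise<ReportProgress>,
  sleep: (ms: number) => Promise<void>,
  onUpdate: (p: ReportProgress) => void,
  intervalMs: number = 2000,
  maxAttempts: number = 150,
): Promise<ReportProgress> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const current = await fetchProgress();
    onUpdate(current);
    if (isComplete(current)) return current;
    await sleep(intervalMs);
  }
  throw new Error("Report generation did not finish within the polling limit");
}
```

In a browser, `sleep` can be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, and `onUpdate` can render `progressLabel` together with the partial `current_report` text.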
### Get Report List
```
GET /api/v1/report/list
```
Get a list of the user's reports.
**Headers**:
- Authorization: Bearer {access_token}
**Query Parameters**:
- skip (optional, default: 0): Number of records to skip
- limit (optional, default: 100): Maximum number of records to return
**Response** (200 OK):
```json
{
  "reports": [
    {
      "id": "report-uuid",
      "user_id": "user-uuid",
      "search_id": "search-uuid",
      "title": "Environmental Impacts of Electric Vehicles",
      "content": "# Environmental Impacts of Electric Vehicles\n\n## Introduction\n\nElectric vehicles (EVs) have gained popularity...",
      "detail_level": "standard",
      "query_type": "comparative",
      "model_used": "llama-3.1-8b-instant",
      "created_at": "2024-03-20T14:30:00Z",
      "updated_at": "2024-03-20T14:35:00Z"
    }
  ],
  "total": 1
}
```
### Get Report
```
GET /api/v1/report/{report_id}
```
Get a specific report.
**Headers**:
- Authorization: Bearer {access_token}
**Path Parameters**:
- report_id: ID of the report
**Response** (200 OK):
```json
{
  "id": "report-uuid",
  "user_id": "user-uuid",
  "search_id": "search-uuid",
  "title": "Environmental Impacts of Electric Vehicles",
  "content": "# Environmental Impacts of Electric Vehicles\n\n## Introduction\n\nElectric vehicles (EVs) have gained popularity...",
  "detail_level": "standard",
  "query_type": "comparative",
  "model_used": "llama-3.1-8b-instant",
  "created_at": "2024-03-20T14:30:00Z",
  "updated_at": "2024-03-20T14:35:00Z"
}
```
### Download Report
```
GET /api/v1/report/{report_id}/download
```
Download a report in the specified format.
**Headers**:
- Authorization: Bearer {access_token}
**Path Parameters**:
- report_id: ID of the report
**Query Parameters**:
- format (optional, default: "markdown"): Format of the report (markdown, html, pdf)
**Response** (200 OK):
- Content-Type: application/octet-stream
- Content-Disposition: attachment; filename="report_{report_id}.{format}"
- Binary file content
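
A small sketch of how a client might validate the `format` parameter before requesting a download. The helper names are hypothetical; the list of formats and the filename pattern are taken from the description above.

```typescript
// The three formats listed for the `format` query parameter.
const REPORT_FORMATS = ["markdown", "html", "pdf"] as const;
type ReportFormat = (typeof REPORT_FORMATS)[number];

// Type guard that narrows an arbitrary string to a supported format.
function isReportFormat(value: string): value is ReportFormat {
  return (REPORT_FORMATS as readonly string[]).indexOf(value) !== -1;
}

// Builds the download URL, rejecting unsupported formats on the client side.
function downloadUrl(reportId: string, format: string = "markdown"): string {
  if (!isReportFormat(format)) {
    throw new RangeError(`Unsupported report format: ${format}`);
  }
  return `/api/v1/report/${encodeURIComponent(reportId)}/download?format=${format}`;
}

// Mirrors the documented Content-Disposition pattern report_{report_id}.{format}.
function downloadFilename(reportId: string, format: ReportFormat): string {
  return `report_${reportId}.${format}`;
}
```

Because the endpoint requires the `Authorization` header, a plain link is not enough: the client fetches the file with the header set, reads the response as a blob, and saves it under the name from `downloadFilename` or from the `Content-Disposition` header.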
### Delete Report
```
DELETE /api/v1/report/{report_id}
```
Delete a report.
**Headers**:
- Authorization: Bearer {access_token}
**Path Parameters**:
- report_id: ID of the report to delete
**Response** (204 No Content)
## Error Handling
The API returns standard HTTP status codes to indicate the success or failure of a request.
### Common Error Codes
- 400 Bad Request: The request was invalid or cannot be served
- 401 Unauthorized: Authentication is required or has failed
- 403 Forbidden: The authenticated user doesn't have the necessary permissions
- 404 Not Found: The requested resource was not found
- 422 Unprocessable Entity: The request data failed validation
- 500 Internal Server Error: An error occurred on the server
### Error Response Format
```json
{
  "detail": "Error message explaining what went wrong"
}
```
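One way to turn an error response into a user-facing message is sketched below. `describeError` is a hypothetical helper. It prefers the server's `detail` string and falls back to the status descriptions listed above; the fallback for a non-string `detail` is an assumption, since the specification only shows a string.

```typescript
// Fallback messages based on the common error codes listed above.
const DEFAULT_MESSAGES: Record<number, string> = {
  400: "The request was invalid or cannot be served",
  401: "Authentication is required or has failed",
  403: "You do not have the necessary permissions",
  404: "The requested resource was not found",
  422: "The request data failed validation",
  500: "An error occurred on the server",
};

// Returns the server-provided detail when it is a non-empty string,
// otherwise a generic message for the HTTP status.
function describeError(status: number, body: unknown): string {
  if (typeof body === "object" && body !== null && "detail" in body) {
    const detail = (body as { detail: unknown }).detail;
    if (typeof detail === "string" && detail.length > 0) {
      return detail;
    }
  }
  return DEFAULT_MESSAGES[status] ?? `Request failed with status ${status}`;
}
```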
## Best Practices for Frontend Integration
1. **Authentication Flow**:
   - Implement a login form that sends credentials to `/api/v1/auth/token`
   - Store the received token securely (HTTP-only cookies or secure storage)
   - Include the token in the Authorization header for all subsequent requests
   - Implement token expiration handling and refresh mechanism
2. **Query Processing Workflow**:
   - Allow users to enter natural language queries
   - Use the `/api/v1/query/process` endpoint to enhance the query
   - Display the enhanced query to the user for confirmation
3. **Search Execution**:
   - Use the processed query for search execution
   - Allow users to select which search engines to use
   - Implement a loading state while waiting for search results
   - Display search results grouped by search engine
4. **Report Generation**:
   - Allow users to generate reports from search results
   - Provide options for detail level and report type
   - Implement progress tracking using the progress endpoint
   - Allow users to download reports in different formats
5. **Error Handling**:
   - Implement proper error handling for API responses
   - Display meaningful error messages to users
   - Implement retry mechanisms for transient errors
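
The retry recommendation in the last item can be sketched as exponential backoff. Everything here is illustrative: the function names are hypothetical, the delay values are arbitrary, and classifying 5xx responses as transient is an assumption that the specification does not make.

```typescript
// Delay before the next attempt: doubles each time, capped at maxDelayMs.
function backoffDelay(attempt: number, baseDelayMs: number = 500, maxDelayMs: number = 8000): number {
  return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt));
}

// Assumption: only server-side (5xx) failures are worth retrying.
function isTransientStatus(status: number): boolean {
  return status >= 500 && status <= 599;
}

// Runs `operation`, retrying while `isTransient(error)` is true.
// `sleep` is injected so the helper works in any runtime and in tests.
async function withRetry<T>(
  operation: () => Promise<T>,
  isTransient: (error: unknown) => boolean,
  sleep: (ms: number) => Promise<void>,
  maxAttempts: number = 3,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on non-transient errors and on the final attempt.
      if (!isTransient(error) || attempt === maxAttempts - 1) throw error;
      await sleep(backoffDelay(attempt));
    }
  }
  throw new Error("maxAttempts must be at least 1");
}
```

Retrying is safest for `GET` requests. Retrying `POST /api/v1/search/execute` or `POST /api/v1/report/generate` may create duplicate searches or reports unless the server deduplicates them, which this specification does not say.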
## Available Search Engines
- **google**: General web search
- **arxiv**: Academic papers from arXiv
- **scholar**: Academic papers from various sources
- **news**: News articles
- **openalex**: Open access academic content
- **core**: Open access research papers
- **github**: Code repositories
- **stackexchange**: Q&A from Stack Exchange network
## Report Detail Levels
- **brief**: Short summary (default model: llama-3.1-8b-instant)
- **standard**: Comprehensive overview (default model: llama-3.1-8b-instant)
- **detailed**: In-depth analysis (default model: llama-3.3-70b-versatile)
- **comprehensive**: Extensive research report (default model: llama-3.3-70b-versatile)
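
A client that wants to display which model a report will use can mirror this table locally. The sketch below is illustrative: the names are hypothetical, and whether the server applies these defaults when `model` is omitted is not stated in this specification.

```typescript
// Default model per detail level, copied from the list above.
const DEFAULT_MODEL_BY_DETAIL_LEVEL = {
  brief: "llama-3.1-8b-instant",
  standard: "llama-3.1-8b-instant",
  detailed: "llama-3.3-70b-versatile",
  comprehensive: "llama-3.3-70b-versatile",
} as const;

type DetailLevel = keyof typeof DEFAULT_MODEL_BY_DETAIL_LEVEL;

// Type guard for values coming from user input, such as a select box.
function isDetailLevel(value: string): value is DetailLevel {
  return Object.prototype.hasOwnProperty.call(DEFAULT_MODEL_BY_DETAIL_LEVEL, value);
}

// Returns the explicit override if given, otherwise the default for the level.
function resolveModel(level: DetailLevel, override?: string): string {
  return override ?? DEFAULT_MODEL_BY_DETAIL_LEVEL[level];
}
```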
## Query Types
- **factual**: Seeking facts or information
- **comparative**: Comparing multiple items or concepts
- **exploratory**: Open-ended exploration of a topic
- **procedural**: How to do something
- **causal**: Seeking cause-effect relationships
## Models
- **llama-3.1-8b-instant**: Fast, lightweight model
- **llama-3.3-70b-versatile**: High-quality, comprehensive model

Other models may be available, depending on server configuration.