Decision Log

2025-02-27: Initial Project Setup

  • Context: Need for semantic search capabilities that understand context beyond keywords
  • Options Considered:
    1. Build custom embedding solution
    2. Use open-source models locally
    3. Use Jina AI's APIs
  • Decision: Use Jina AI's APIs for embedding generation and similarity computation
  • Rationale:
    • High-quality embeddings with state-of-the-art models
    • No need to manage model deployment and infrastructure
    • Simple API integration with reasonable pricing
    • Support for long texts through segmentation

Decision: Separate Markdown Segmentation from Similarity Computation

  • Context: Need to handle potentially long markdown documents
  • Options Considered:
    1. Integrate segmentation directly into the similarity module
    2. Create a separate module for segmentation
  • Decision: Create a separate module (markdown_segmenter.py) for document segmentation
  • Rationale:
    • Better separation of concerns
    • More modular design allows for independent use of components
    • Easier to maintain and extend each component separately
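
As a rough illustration of the split, the segmenter can be used on its own and its output handed to the similarity module only when needed. The function names below (segment_markdown, compute_similarity) and the similarity module name are assumptions for illustration, not the project's actual API.

```python
# Hypothetical usage; segment_markdown and compute_similarity are illustrative
# names, not the project's actual signatures.
from markdown_segmenter import segment_markdown
from similarity import compute_similarity

with open("notes.md", encoding="utf-8") as f:
    document = f.read()

# Segmentation is usable on its own...
segments = segment_markdown(document)

# ...and can feed the similarity module when semantic comparison is needed.
scores = [compute_similarity("vector databases", segment) for segment in segments]
```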

Decision: Use Environment Variables for API Keys

  • Context: Need to securely manage API credentials
  • Options Considered:
    1. Configuration files
    2. Environment variables
    3. Secret management service
  • Decision: Use environment variables (JINA_API_KEY)
  • Rationale:
    • Simple to implement
    • Standard practice for managing secrets
    • Works well across different environments
    • Prevents accidental commit of credentials to version control
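
A minimal sketch of the pattern, assuming the key is exported in the shell (e.g. `export JINA_API_KEY=...`); the explicit error handling is illustrative.

```python
import os

# Read the key from the environment rather than from a checked-in config file.
JINA_API_KEY = os.environ.get("JINA_API_KEY")
if not JINA_API_KEY:
    raise RuntimeError("JINA_API_KEY environment variable is not set")
```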

Decision: Use Cosine Similarity with Normalized Vectors

  • Context: Need a metric for comparing semantic similarity between text embeddings
  • Options Considered:
    1. Euclidean distance
    2. Cosine similarity
    3. Dot product
  • Decision: Use cosine similarity with normalized vectors
  • Rationale:
    • Standard approach for semantic similarity
    • Normalized vectors simplify computation (dot product equals cosine similarity)
    • Less sensitive to embedding magnitude, focusing on direction (meaning)
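
A NumPy sketch of the rationale (not the project's actual code): once both vectors are scaled to unit length, cosine similarity reduces to a plain dot product.

```python
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # For unit-length vectors the dot product equals cos(theta), which depends
    # only on direction (meaning), not on embedding magnitude.
    return float(np.dot(normalize(a), normalize(b)))

print(cosine_similarity(np.array([1.0, 2.0, 3.0]), np.array([2.0, 4.0, 6.0])))  # ~1.0 (same direction)
print(cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0])))            # 0.0 (orthogonal)
```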

2025-02-27: Research System Architecture

Decision: Implement a Multi-Stage Research Pipeline

  • Context: Need to define the overall architecture for the intelligent research system
  • Options Considered:
    1. Monolithic application with tightly coupled components
    2. Microservices architecture with independent services
    3. Pipeline architecture with distinct processing stages
  • Decision: Implement an 8-stage pipeline architecture
  • Rationale:
    • Clear separation of concerns with each stage having a specific responsibility
    • Easier to develop and test individual components
    • Flexibility to swap or enhance specific stages without affecting others
    • Natural flow of data through the system matches the research process
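
This log does not enumerate the eight stages here, so the sketch below only illustrates the pipeline pattern itself; the stage names in the trailing comment are placeholders, not the system's actual stage list.

```python
from typing import Any, Callable, Dict, List

Stage = Callable[[Dict[str, Any]], Dict[str, Any]]

def run_pipeline(stages: List[Stage], state: Dict[str, Any]) -> Dict[str, Any]:
    # Each stage takes the accumulated state and returns an updated version,
    # so individual stages can be swapped or enhanced without touching the rest.
    for stage in stages:
        state = stage(state)
    return state

# e.g. run_pipeline([enhance_query, classify_query, generate_search_queries,
#                    execute_searches, collect_results, rerank, ...],
#                   {"query": "original user question"})
```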

Decision: Use Multiple Search Sources

  • Context: Need to gather comprehensive information from various sources
  • Options Considered:
    1. Use a single search API for simplicity
    2. Implement custom web scraping for all sources
    3. Use multiple specialized search APIs
  • Decision: Integrate multiple search sources (Google, Serper, Jina Search, Google Scholar, arXiv)
  • Rationale:
    • Different sources provide different types of information (academic, general, etc.)
    • Increases the breadth and diversity of search results
    • Specialized APIs like arXiv provide domain-specific information
    • Redundancy ensures more comprehensive coverage

Decision: Use Jina AI for Semantic Processing

  • Context: Need for advanced semantic understanding in document processing
  • Options Considered:
    1. Use simple keyword matching
    2. Implement custom embedding models
    3. Use Jina AI's suite of APIs
  • Decision: Use Jina AI's APIs for embedding generation, similarity computation, and reranking
  • Rationale:
    • High-quality embeddings with state-of-the-art models
    • Comprehensive API suite covering multiple needs (embeddings, segmentation, reranking)
    • Simple integration with reasonable pricing
    • Consistent approach across different semantic processing tasks
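
A hedged sketch of calling the embeddings API with the key from the environment. It assumes the v1 embeddings endpoint with an OpenAI-style request/response shape, and the model name is a placeholder; both should be checked against the current Jina documentation.

```python
import os
import requests

JINA_API_KEY = os.environ["JINA_API_KEY"]

def embed(texts: list[str], model: str = "jina-embeddings-v3") -> list[list[float]]:
    # Assumes https://api.jina.ai/v1/embeddings with an OpenAI-compatible
    # payload; verify the model name and response fields against current docs.
    response = requests.post(
        "https://api.jina.ai/v1/embeddings",
        headers={"Authorization": f"Bearer {JINA_API_KEY}",
                 "Content-Type": "application/json"},
        json={"model": model, "input": texts},
        timeout=30,
    )
    response.raise_for_status()
    return [item["embedding"] for item in response.json()["data"]]
```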

2025-02-27: Search Execution Architecture

Decision: Search Execution Architecture

  • Context: We needed a search execution module that runs queries across multiple search engines and processes the results in a standardized way.

  • Decision:

    1. Create a modular search execution architecture (a sketch follows this entry):
      • Implement a base handler interface (BaseSearchHandler) for all search API handlers
      • Create specific handlers for each search engine (Google, Serper, Scholar, arXiv)
      • Develop a central SearchExecutor class to manage execution across multiple engines
      • Implement a ResultCollector class for processing and organizing results
    2. Use parallel execution for search queries:
      • Implement thread-based parallelism using concurrent.futures
      • Add support for both synchronous and asynchronous execution
      • Include timeout management and error handling
    3. Standardize search results:
      • Define a common result format across all search engines
      • Include metadata specific to each search engine in a standardized way
      • Implement deduplication and scoring for result ranking
  • Rationale:

    • A modular architecture allows for easy addition of new search engines
    • Parallel execution significantly improves search performance
    • Standardized result format simplifies downstream processing
    • Separation of concerns between execution and result processing
  • Alternatives Considered:

    1. Sequential execution of search queries:
      • Simpler implementation
      • Much slower performance
      • Would not scale well with additional search engines
    2. Separate modules for each search engine:
      • Would lead to code duplication
      • More difficult to maintain
      • Less consistent result format
    3. Using a third-party search aggregation service:
      • Would introduce additional dependencies
      • Less control over the search process
      • Potential cost implications
  • Impact:

    • Efficient execution of search queries across multiple engines
    • Consistent result format for downstream processing
    • Flexible architecture that can be extended with new search engines
    • Clear separation of concerns between different components
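
A condensed sketch of the architecture described above. The class names come from this entry; the method signatures and the standardized result fields are assumptions.

```python
import concurrent.futures
from abc import ABC, abstractmethod
from typing import Any

class BaseSearchHandler(ABC):
    """Interface every engine-specific handler (Serper, Scholar, arXiv, ...) implements."""

    @abstractmethod
    def search(self, query: str, num_results: int = 10) -> list[dict[str, Any]]:
        """Return results in a common shape, e.g. {title, url, snippet, source, metadata}."""

class SearchExecutor:
    def __init__(self, handlers: dict[str, BaseSearchHandler], timeout: float = 30.0):
        self.handlers = handlers
        self.timeout = timeout

    def execute(self, query: str) -> dict[str, list[dict[str, Any]]]:
        # Thread-based parallelism via concurrent.futures: a slow or failing
        # engine should neither block nor break the others.
        results: dict[str, list[dict[str, Any]]] = {}
        with concurrent.futures.ThreadPoolExecutor() as pool:
            futures = {pool.submit(h.search, query): name for name, h in self.handlers.items()}
            for future in concurrent.futures.as_completed(futures, timeout=self.timeout):
                name = futures[future]
                try:
                    results[name] = future.result()
                except Exception:
                    results[name] = []  # the real module would log the error
        return results
```

A ResultCollector would then merge the per-engine lists, deduplicate them (for example by URL), and apply scoring for ranking, as described above.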

2025-02-27: Search Execution Module Refinements

Decision: Remove Google Search Handler

  • Context: Both Google and Serper handlers were implemented, but Serper is essentially a front-end for Google search
  • Options Considered:
    1. Keep both handlers for redundancy
    2. Remove the Google handler and only use Serper
  • Decision: Remove the Google search handler
  • Rationale:
    • Redundant functionality as Serper provides the same results
    • Simplifies the codebase and reduces maintenance
    • Reduces API costs by avoiding duplicate searches
    • Serper provides a more reliable and consistent API for Google search

Decision: Modify LLM Query Enhancement Prompt

  • Context: The LLM was returning enhanced queries with explanations, which caused issues with search APIs
  • Options Considered:
    1. Post-process the LLM output to extract just the query
    2. Modify the prompt to request only the enhanced query
  • Decision: Modify the LLM prompt to request only the enhanced query without explanations
  • Rationale:
    • More reliable than post-processing, which could be error-prone
    • Cleaner implementation that addresses the root cause
    • Ensures consistent output format for downstream processing
    • Reduces the risk of exceeding API character limits
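
The exact wording is not recorded in this log; the prompt below only illustrates the "query-only" instruction that addresses the root cause.

```python
# Illustrative prompt; the production wording may differ.
ENHANCE_PROMPT = (
    "Rewrite the following search query to be more specific and comprehensive. "
    "Return ONLY the rewritten query on a single line, with no explanation, "
    "preamble, or quotation marks.\n\n"
    "Query: {query}"
)

def build_enhancement_prompt(query: str) -> str:
    return ENHANCE_PROMPT.format(query=query)
```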

Decision: Implement Query Truncation

  • Context: Enhanced queries could exceed the Serper API's 2048 character limit
  • Options Considered:
    1. Limit the LLM's output length
    2. Truncate queries before sending to the API
    3. Split long queries into multiple searches
  • Decision: Implement query truncation in the search executor
  • Rationale:
    • Simple and effective solution
    • Preserves as much of the enhanced query as possible
    • Ensures API requests don't fail due to length constraints
    • Can be easily adjusted if API limits change
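
A minimal sketch of the truncation step, assuming the 2048-character limit applies to the raw query string; the word-boundary handling is illustrative.

```python
SERPER_MAX_QUERY_CHARS = 2048  # limit referenced in the context above

def truncate_query(query: str, limit: int = SERPER_MAX_QUERY_CHARS) -> str:
    # Keep as much of the enhanced query as possible while guaranteeing the
    # request stays within the API's character limit.
    if len(query) <= limit:
        return query
    truncated = query[:limit]
    # Avoid cutting mid-word when a space is available.
    return truncated.rsplit(" ", 1)[0] if " " in truncated else truncated
```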

2025-02-27: Testing Strategy for Query Processor

Context

After integrating Groq and OpenRouter as additional LLM providers, we needed to verify that the query processor module functions correctly with these new providers.

Decision

  1. Create dedicated test scripts to validate the query processor functionality:

    • A basic test script for the core processing pipeline
    • A comprehensive test script for detailed component testing
  2. Use monkey patching to ensure tests consistently use the Groq model (see the sketch after this list):

    • Create a global LLM interface with the Groq model
    • Override the get_llm_interface function to always return this interface
    • This approach allows testing without modifying the core code
  3. Test all key functionality of the query processor:

    • Query enhancement
    • Query classification
    • Search query generation
    • End-to-end processing pipeline
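
A sketch of the monkey-patching approach. The get_llm_interface function is named in this entry, but the module names, the LLMInterface constructor, and the Groq model id are assumptions for illustration.

```python
import query_processor                   # assumed module name
from llm_interface import LLMInterface   # assumed module/class name

# One shared interface backed by a Groq model, used for every test run.
groq_interface = LLMInterface(provider="groq", model="llama3-70b-8192")  # placeholder model id

def fake_get_llm_interface(*args, **kwargs):
    return groq_interface

# Override the factory so the processor always receives the Groq-backed
# interface, without modifying the production code.
query_processor.get_llm_interface = fake_get_llm_interface
```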

Rationale

  • Dedicated test scripts provide a repeatable way to verify functionality
  • Monkey patching allows testing with specific models without changing the core code
  • Comprehensive testing ensures all components work correctly with the new providers
  • Saving test results to a JSON file provides a reference for future development

Alternatives Considered

  1. Modifying the query processor to accept a model parameter:

    • Would require changing the core code
    • Could introduce bugs in the production code
  2. Using environment variables to control model selection:

    • Less precise control over which model is used
    • Could interfere with other tests or production use

Impact

  • Verified that the query processor works correctly with Groq models
  • Established a testing approach that can be used for other modules
  • Created reusable test scripts for future development