# Papers

A Go CLI tool for fetching, processing, and analyzing academic papers from arXiv using LLM-based evaluation.

## Features

- Fetch papers from arXiv API based on date range and search query
- Process papers using configurable LLM models (default: phi-4)
- Generate both JSON and Markdown outputs
- Customizable evaluation criteria
- Rate-limited API requests (2-second delay between requests)

## Installation

```bash
go install gitea.r8z.us/stwhite/papers@latest
```

## Usage

Basic usage:

```bash
papers -start 20240101 -end 20240131 -query "machine learning" -api-key "your-key"
```

With custom model and output paths:

```bash
papers -start 20240101 -end 20240131 -query "machine learning" -api-key "your-key" \
  -model "gpt-4" -json-output "results.json" -md-output "summary.md"
```

Fetch papers without processing:

```bash
papers -search-only -start 20240101 -end 20240131 -query "machine learning"
```

Use input file:

```bash
papers -input papers.json -api-key "your-key"
```

### Required Flags

- `-start`: Start date (YYYYMMDD format)
- `-end`: End date (YYYYMMDD format)
- `-query`: Search query

### Optional Flags

- `-search-only`: Fetch papers from arXiv and save to JSON file without processing
- `-input`: Input JSON file containing papers (optional)
- `-maxResults`: Maximum number of results to fetch (1-2000, default: 100)
- `-model`: LLM model to use for processing (default: "phi-4")
- `-api-endpoint`: API endpoint URL (default: "http://localhost:1234/v1/chat/completions")
- `-criteria`: Path to evaluation criteria markdown file (default: "criteria.md")
- `-json-output`: Custom JSON output file path (default: YYYYMMDD-YYYYMMDD-query.json)
- `-md-output`: Custom Markdown output file path (default: YYYYMMDD-YYYYMMDD-query.md)

## Pipeline

1. **Fetch**: Retrieves papers from arXiv based on specified date range and query
2. **Save**: Stores raw paper data in JSON format
3. **Process**: Evaluates papers using the specified LLM model according to criteria
4. **Format**: Generates both JSON and Markdown outputs of the processed results

## Output Files

The tool generates two types of output files:

1. **JSON Output**: Contains the raw processing results
   - Default name format: `YYYYMMDD-YYYYMMDD-query.json`
   - Can be customized with `-json-output` flag
2. **Markdown Output**: Human-readable formatted results
   - Default name format: `YYYYMMDD-YYYYMMDD-query.md`
   - Can be customized with `-md-output` flag

## Dependencies

- [arxiva](https://gitea.r8z.us/stwhite/arxiva): Paper fetching from arXiv
- [paperprocessor](https://gitea.r8z.us/stwhite/paperprocessor): LLM-based paper processing
- [paperformatter](https://gitea.r8z.us/stwhite/paperformatter): Output formatting

## Error Handling

The tool includes various error checks:

- Date format validation (YYYYMMDD)
- Required flag validation
- Maximum results range validation (1-2000)
- File system operations verification
- API request error handling

## License

[License information not provided in source]
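
## Example: Input Validation

The date format and result-count checks listed under Error Handling can be sketched in Go roughly as follows. This is a minimal illustration, not the tool's actual code: the function names `validateDate` and `validateMaxResults` and the error messages are hypothetical. Only the YYYYMMDD format and the 1-2000 range come from this README.

```go
package main

import (
	"fmt"
	"time"
)

// dateLayout is Go's reference-time layout corresponding to YYYYMMDD.
const dateLayout = "20060102"

// validateDate (hypothetical helper) returns an error unless s is a
// real calendar date written as YYYYMMDD.
func validateDate(s string) error {
	if _, err := time.Parse(dateLayout, s); err != nil {
		return fmt.Errorf("invalid date %q, want YYYYMMDD: %w", s, err)
	}
	return nil
}

// validateMaxResults (hypothetical helper) enforces the documented
// 1-2000 range for -maxResults.
func validateMaxResults(n int) error {
	if n < 1 || n > 2000 {
		return fmt.Errorf("maxResults must be between 1 and 2000, got %d", n)
	}
	return nil
}

func main() {
	// A well-formed date and an in-range count produce no error.
	fmt.Println(validateDate("20240101"))
	fmt.Println(validateMaxResults(100))

	// A dashed date and an out-of-range count are rejected.
	fmt.Println(validateDate("2024-01-01"))
	fmt.Println(validateMaxResults(5000))
}
```

Parsing with `time.Parse` rather than a digits-only pattern also rejects impossible dates such as `20240230`, which a simple length check would let through.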