Updated to shrink big images before sending them.
This commit is contained in:
parent
11ea971542
commit
d4ea970b3d
|
@ -0,0 +1,38 @@
|
|||
# PyNamer Project Rules
|
||||
|
||||
**Implementation Patterns:**
|
||||
1. **Image Processing:**
|
||||
- Always maintain aspect ratio when resizing.
|
||||
- Use LANCZOS resampling for quality downscaling.
|
||||
- Handle transparency conversion when saving as JPEG.
|
||||
- Keep original image files untouched until final rename operation.
|
||||
|
||||
2. **Filename Generation:**
|
||||
- Enforce snake_case format.
|
||||
- Remove special characters.
|
||||
- Handle duplicate filenames by appending incrementing numbers.
|
||||
|
||||
3. **Error Handling:**
|
||||
- Log detailed errors for debugging.
|
||||
- Fail gracefully with clear user feedback.
|
||||
- Preserve original files on errors.
|
||||
|
||||
4. **Configuration:**
|
||||
- Sensible defaults for all configurable parameters.
|
||||
- Environment variables can override sensitive settings (API keys).
|
||||
- Config changes require restart (no hot-reloading).
|
||||
|
||||
**User Preferences:**
|
||||
- Default to JPEG format for resized images (better compression).
|
||||
- Default max dimension of 1024px (balances quality and efficiency).
|
||||
- Dry-run mode enabled by flag for safety.
|
||||
|
||||
**Known Challenges:**
|
||||
- Large images may still consume significant memory during processing.
|
||||
- Some LLM models may have different optimal image sizes/formats.
|
||||
- Transparency handling requires special consideration when converting formats.
|
||||
|
||||
**Workflow Patterns:**
|
||||
- Always check file existence and supported formats first.
|
||||
- Process images sequentially (no parallel processing yet).
|
||||
- Log each major operation step for traceability.
|
|
@ -1,2 +1,170 @@
|
|||
# Byte-compiled / optimized / DLL files
|
||||
__pycache__/
|
||||
*.py[cod]
|
||||
*$py.class
|
||||
|
||||
# C extensions
|
||||
*.so
|
||||
|
||||
# Distribution / packaging
|
||||
.Python
|
||||
build/
|
||||
develop-eggs/
|
||||
dist/
|
||||
downloads/
|
||||
eggs/
|
||||
.eggs/
|
||||
lib/
|
||||
lib64/
|
||||
parts/
|
||||
sdist/
|
||||
var/
|
||||
wheels/
|
||||
*.egg-info/
|
||||
.installed.cfg
|
||||
*.egg
|
||||
MANIFEST
|
||||
|
||||
# PyInstaller
|
||||
# Usually these files are written by a python script from a template
|
||||
# before PyInstaller builds the exe, so as to inject date/other infos into it.
|
||||
*.manifest
|
||||
*.spec
|
||||
|
||||
# Installer logs
|
||||
pip-log.txt
|
||||
pip-delete-this-directory.txt
|
||||
|
||||
# Unit test / coverage reports
|
||||
htmlcov/
|
||||
.tox/
|
||||
.nox/
|
||||
.coverage
|
||||
.coverage.*
|
||||
.cache
|
||||
nosetests.xml
|
||||
coverage.xml
|
||||
*.cover
|
||||
*.py,cover
|
||||
.hypothesis/
|
||||
.pytest_cache/
|
||||
|
||||
# Translations
|
||||
*.mo
|
||||
*.pot
|
||||
|
||||
# Django stuff:
|
||||
*.log
|
||||
local_settings.py
|
||||
db.sqlite3
|
||||
|
||||
# Flask stuff:
|
||||
instance/
|
||||
.webassets-cache
|
||||
|
||||
# Scrapy stuff:
|
||||
.scrapy
|
||||
|
||||
# Sphinx documentation
|
||||
docs/_build/
|
||||
|
||||
# PyBuilder
|
||||
target/
|
||||
|
||||
# Jupyter Notebook
|
||||
.ipynb_checkpoints
|
||||
|
||||
# IPython
|
||||
profile_default/
|
||||
ipython_config.py
|
||||
|
||||
# pyenv
|
||||
.python-version
|
||||
|
||||
# pipenv
|
||||
# According to pypa/pipenv#598, it's recommended to include Pipfile.lock in version control.
|
||||
# However, in case of collaboration, if having platform-specific dependencies or dependencies
|
||||
# having no cross-platform support, pipenv may install dependencies that don't work, or not
|
||||
# install all needed dependencies.
|
||||
#Pipfile.lock
|
||||
|
||||
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
|
||||
__pypackages__/
|
||||
|
||||
# Celery stuff
|
||||
celerybeat-schedule
|
||||
celerybeat.pid
|
||||
|
||||
# SageMath parsed files
|
||||
*.sage.py
|
||||
|
||||
# Environments
|
||||
.env
|
||||
.venv
|
||||
env/
|
||||
venv/
|
||||
ENV/
|
||||
env.bak/
|
||||
venv.bak/
|
||||
|
||||
# Spyder project settings
|
||||
.spyderproject
|
||||
.spyproject
|
||||
|
||||
# Rope project settings
|
||||
.ropeproject
|
||||
|
||||
# mkdocs documentation
|
||||
/site
|
||||
|
||||
# mypy
|
||||
.mypy_cache/
|
||||
.dmypy.json
|
||||
dmypy.json
|
||||
|
||||
# Pyre type checker
|
||||
.pyre/
|
||||
|
||||
# pytype static type analyzer
|
||||
.pytype/
|
||||
|
||||
# operating system files
|
||||
.DS_Store
|
||||
.venv/
|
||||
.DS_Store?
|
||||
._*
|
||||
.Spotlight-V100
|
||||
.Trashes
|
||||
ehthumbs.db
|
||||
Thumbs.db
|
||||
|
||||
# Image files
|
||||
*.png
|
||||
*.jpg
|
||||
*.jpeg
|
||||
*.gif
|
||||
*.bmp
|
||||
*.tiff
|
||||
|
||||
# Log files
|
||||
*.log
|
||||
|
||||
# Editor directories and files
|
||||
.idea/
|
||||
.vscode/
|
||||
*.swp
|
||||
*.swo
|
||||
*~
|
||||
*.bak
|
||||
*.tmp
|
||||
*.orig
|
||||
*.class
|
||||
*.jar
|
||||
*.war
|
||||
*.ear
|
||||
*.zip
|
||||
*.tar.gz
|
||||
*.rar
|
||||
|
||||
# Local development files
|
||||
*.local
|
||||
*.dev
|
||||
|
|
12
config.yaml
12
config.yaml
|
@ -2,10 +2,10 @@
|
|||
|
||||
# LLM API Configuration
|
||||
llm:
|
||||
provider: "openai" # Provider name (openai, anthropic, etc.)
|
||||
model: "gpt-4o-mini" # Model name
|
||||
api_key: "" # Your API key (leave empty to use environment variable)
|
||||
endpoint: "" # Custom endpoint URL (if using a proxy or alternative service)
|
||||
provider: "openrouter" # Supported: openai, anthropic, openrouter
|
||||
model: "openrouter/google/gemma-3-27b-it" # Must be a vision-capable model
|
||||
api_key: "" # Your API key (or set OPENAI_API_KEY environment variable)
|
||||
endpoint: "" # Custom endpoint URL if needed
|
||||
max_tokens: 100 # Maximum tokens for response
|
||||
temperature: 0.7 # Temperature for generation
|
||||
|
||||
|
@ -17,7 +17,9 @@ image:
|
|||
- ".png"
|
||||
- ".gif"
|
||||
- ".webp"
|
||||
|
||||
resize_max_dimension: 1024 # Max width/height before resizing
|
||||
resize_format: "JPEG" # Format for resized images
|
||||
|
||||
# Prompt Configuration
|
||||
prompt:
|
||||
system_message: "You are a helpful assistant that generates concise, descriptive filenames for images. Focus on the main subject, key attributes, and context. Use snake_case format without special characters."
|
||||
|
|
|
@ -0,0 +1,25 @@
|
|||
# Active Context: PyNamer - Image Resizing Implementation
|
||||
|
||||
**Current Focus:** Implementing image resizing functionality to normalize image dimensions before sending them to the LLM.
|
||||
|
||||
**Decisions Made:**
|
||||
- Use the `Pillow` library for image manipulation due to its robustness and ease of use in Python.
|
||||
- Add `Pillow` to `requirements.txt`.
|
||||
- Introduce configuration options in `config.yaml` under the `image` section:
|
||||
- `resize_max_dimension`: Controls the maximum size (width or height) of the image sent to the LLM. Defaults to 1024.
|
||||
- `resize_format`: Specifies the image format (e.g., 'JPEG', 'PNG') to use after resizing. Defaults to 'JPEG'.
|
||||
- Modify the `_encode_image` method (renamed to `_resize_and_encode_image`) to perform resizing:
|
||||
- Open the image using `PIL.Image.open()`.
|
||||
- Check if the image's largest dimension exceeds `resize_max_dimension`.
|
||||
- If it exceeds, calculate new dimensions maintaining aspect ratio and resize using `img.resize()` with `Image.Resampling.LANCZOS`.
|
||||
- Save the (potentially resized) image to an in-memory buffer (`io.BytesIO`) using the configured `resize_format`.
|
||||
- Handle potential transparency issues when saving formats like JPEG by converting the image mode to 'RGB' if necessary.
|
||||
- Base64 encode the bytes from the buffer.
|
||||
- Update the `generate_filename` method to call `_resize_and_encode_image` instead of `_encode_image`.
|
||||
- Update the `generate_filename` method to dynamically set the `mime_type` in the LLM request based on `resize_format`.
|
||||
- Load the new configuration options in `_setup_llm`.
|
||||
|
||||
**Next Steps:**
|
||||
- Create `memory-bank/progress.md`.
|
||||
- Create `.clinerules`.
|
||||
- Final review and testing.
|
|
@ -0,0 +1,16 @@
|
|||
# Product Context: PyNamer
|
||||
|
||||
**Problem:** Manually naming large numbers of image files is tedious and time-consuming. Generic filenames (e.g., `IMG_1234.JPG`) lack descriptive value, making it hard to find specific images later.
|
||||
|
||||
**Solution:** `pynamer` automates the process of generating descriptive filenames for images by leveraging the image understanding capabilities of multimodal LLMs.
|
||||
|
||||
**User Experience:**
|
||||
- The user provides one or more image paths via the command line.
|
||||
- The tool processes each image, interacts with an LLM (configured via `config.yaml`), and renames the file with a descriptive, clean filename.
|
||||
- A dry-run option allows users to preview the changes without modifying files.
|
||||
- **Efficiency Enhancement:** By resizing large images before sending them to the LLM, the tool aims to:
|
||||
- Reduce the amount of data transferred.
|
||||
- Potentially lower API costs (as some models charge based on input size/tokens).
|
||||
- Speed up the processing time.
|
||||
|
||||
**Target User:** Individuals or teams dealing with many images who need a better way to organize and retrieve them based on content (e.g., photographers, researchers, content creators).
|
|
@ -0,0 +1,28 @@
|
|||
# Progress: PyNamer - Image Resizing Implementation
|
||||
|
||||
**Completed:**
|
||||
1. Added Pillow dependency to `requirements.txt`.
|
||||
2. Updated `pynamer.py` with image resizing functionality:
|
||||
- Renamed `_encode_image` to `_resize_and_encode_image`.
|
||||
- Implemented image resizing logic using Pillow.
|
||||
- Added proper error handling for image processing.
|
||||
- Updated `generate_filename` to use the new method and set correct mime type.
|
||||
3. Updated `config.yaml` with new image resizing configuration options:
|
||||
- `resize_max_dimension`
|
||||
- `resize_format`
|
||||
4. Created comprehensive memory bank documentation:
|
||||
- `projectbrief.md`
|
||||
- `productContext.md`
|
||||
- `systemPatterns.md`
|
||||
- `techContext.md`
|
||||
- `activeContext.md`
|
||||
|
||||
**Remaining:**
|
||||
1. Create `.clinerules` file.
|
||||
2. Final testing and verification.
|
||||
|
||||
**Issues/Notes:**
|
||||
- The implementation maintains backward compatibility with existing configurations.
|
||||
- The default resize format is set to JPEG for better compression, but this may need adjustment for images with transparency.
|
||||
- The LANCZOS resampling filter provides good quality for downscaling.
|
||||
- Error handling has been improved to provide better feedback when image processing fails.
|
|
@ -0,0 +1,18 @@
|
|||
# Project Brief: PyNamer
|
||||
|
||||
**Goal:** Enhance the `pynamer` tool to improve efficiency and potentially reduce costs by normalizing image sizes before submitting them to a Large Language Model (LLM) for filename generation.
|
||||
|
||||
**Core Functionality:**
|
||||
- Takes one or more image file paths as input.
|
||||
- Reads configuration from `config.yaml`.
|
||||
- Resizes images exceeding a configured maximum dimension while maintaining aspect ratio.
|
||||
- Encodes the (potentially resized) image to base64.
|
||||
- Sends the image data and configured prompts to an LLM (via `litellm`).
|
||||
- Receives a descriptive filename suggestion from the LLM.
|
||||
- Cleans the suggested filename (snake_case, alphanumeric).
|
||||
- Renames the original image file with the new filename.
|
||||
- Supports dry-run mode.
|
||||
|
||||
**Enhancement:**
|
||||
- Added image resizing using the Pillow library before encoding and sending to the LLM.
|
||||
- Introduced configuration options (`resize_max_dimension`, `resize_format`) in `config.yaml`.
|
|
@ -0,0 +1,36 @@
|
|||
# System Patterns: PyNamer
|
||||
|
||||
**Architecture:** Command-Line Interface (CLI) tool.
|
||||
|
||||
**Core Components:**
|
||||
- **CLI Parser (`argparse`):** Handles command-line arguments (`images`, `config`, `dry-run`, `verbose`).
|
||||
- **Configuration Loader (`PyYAML`):** Loads settings from `config.yaml`.
|
||||
- **LLM Interaction (`litellm`):** Abstracts communication with various LLM providers. Handles API key and endpoint configuration.
|
||||
- **Image Processing (`Pillow`):**
|
||||
- Opens and reads image files.
|
||||
- Resizes images exceeding `resize_max_dimension` while maintaining aspect ratio.
|
||||
- Saves the processed image to a specified format (`resize_format`) in memory.
|
||||
- **Encoding (`base64`, `io`):** Encodes the processed image data for transmission via API.
|
||||
- **File System Interaction (`os`, `pathlib`):** Checks file existence, extracts paths/extensions, renames files.
|
||||
- **Filename Cleaning:** Simple string manipulation to enforce snake_case and remove invalid characters.
|
||||
- **Logging (`logging`):** Provides informative output about the process.
|
||||
|
||||
**Workflow Pattern:**
|
||||
1. Parse CLI arguments.
|
||||
2. Initialize `PyNamer` class with the config path.
|
||||
3. Load configuration (`_load_config`).
|
||||
4. Set up LLM client (`_setup_llm`), including image resize settings.
|
||||
5. Iterate through input image paths provided via CLI.
|
||||
6. For each image:
|
||||
a. Check existence and supported format (`_is_supported_format`).
|
||||
b. Resize and encode the image (`_resize_and_encode_image`).
|
||||
c. Prepare API request payload (prompts + image data).
|
||||
d. Call LLM via `litellm.completion`.
|
||||
e. Extract and clean the suggested filename.
|
||||
f. Construct the new file path.
|
||||
g. If not dry-run, rename the file, handling potential name collisions (`rename_image`).
|
||||
h. Log/print the outcome.
|
||||
|
||||
**Configuration Pattern:**
|
||||
- Centralized YAML file (`config.yaml`) for user-configurable settings (LLM details, API keys, prompts, image processing parameters).
|
||||
- Environment variables can override API keys/endpoints if not set in the config.
|
|
@ -0,0 +1,40 @@
|
|||
# Tech Context: PyNamer
|
||||
|
||||
**Language:** Python 3
|
||||
|
||||
**Core Libraries:**
|
||||
- `litellm`: For interacting with various LLM APIs (OpenAI, Anthropic, etc.). Handles model routing, API key management, and standardized response format.
|
||||
- `PyYAML`: For parsing the `config.yaml` configuration file.
|
||||
- `Pillow`: For image manipulation (opening, resizing, saving to buffer).
|
||||
- `argparse`: Standard library for parsing command-line arguments.
|
||||
- `base64`: Standard library for encoding image data.
|
||||
- `io`: Standard library for handling in-memory byte streams (used with Pillow).
|
||||
- `os`, `pathlib`: Standard libraries for file system operations.
|
||||
- `logging`: Standard library for application logging.
|
||||
|
||||
**Dependencies:**
|
||||
- Listed in `requirements.txt`.
|
||||
- Key dependencies: `litellm`, `pyyaml`, `Pillow`.
|
||||
|
||||
**Setup & Execution:**
|
||||
1. **Installation:**
|
||||
```bash
|
||||
pip install -r requirements.txt
|
||||
# or potentially: pip install . (if setup.py is configured correctly)
|
||||
```
|
||||
2. **Configuration:**
|
||||
- Create or modify `config.yaml`.
|
||||
- Set LLM `api_key` in the config or via environment variable (e.g., `OPENAI_API_KEY`).
|
||||
- Adjust `model`, `max_tokens`, `temperature`, `resize_max_dimension`, `resize_format`, and `prompts` as needed.
|
||||
3. **Execution:**
|
||||
```bash
|
||||
python pynamer.py <image_path_1> [image_path_2 ...] [-c config.yaml] [-d] [-v]
|
||||
```
|
||||
- `<image_path>`: Path to the image file(s). Handles paths with spaces.
|
||||
- `-c`: Specify a different config file path.
|
||||
- `-d`: Dry run (preview changes).
|
||||
- `-v`: Verbose logging.
|
||||
|
||||
**Environment:**
|
||||
- Assumes a standard Python environment where dependencies can be installed via pip.
|
||||
- Relies on network access to reach the configured LLM API endpoint.
|
61
pynamer.py
61
pynamer.py
|
@ -2,14 +2,17 @@
|
|||
|
||||
import argparse
|
||||
import base64
|
||||
import io
|
||||
import os
|
||||
import sys
|
||||
from pathlib import Path
|
||||
import yaml
|
||||
from typing import Dict, List, Optional, Union
|
||||
|
||||
import litellm
|
||||
from litellm import completion
|
||||
import logging
|
||||
from PIL import Image # Added for image processing
|
||||
|
||||
# Configure logging
|
||||
logging.basicConfig(
|
||||
|
@ -65,21 +68,54 @@ class PyNamer:
|
|||
self.model = llm_config.get('model', 'gpt-4-vision-preview')
|
||||
self.max_tokens = llm_config.get('max_tokens', 100)
|
||||
self.temperature = llm_config.get('temperature', 0.7)
|
||||
|
||||
# Image processing settings
|
||||
image_config = self.config.get('image', {})
|
||||
self.resize_max_dimension = image_config.get('resize_max_dimension', 1024) # Default max dimension
|
||||
self.resize_format = image_config.get('resize_format', 'JPEG') # Default format after resize
|
||||
|
||||
logger.info(f"LLM setup complete. Using model: {self.model}")
|
||||
|
||||
def _encode_image(self, image_path: str) -> str:
|
||||
"""Encode image to base64 for API submission.
|
||||
|
||||
logger.info(f"Image resize settings: max_dimension={self.resize_max_dimension}, format={self.resize_format}")
|
||||
|
||||
def _resize_and_encode_image(self, image_path: str) -> str:
|
||||
"""Resize image if necessary and encode to base64 for API submission.
|
||||
|
||||
Args:
|
||||
image_path: Path to the image file
|
||||
|
||||
Returns:
|
||||
Base64 encoded image string
|
||||
"""
|
||||
with open(image_path, "rb") as image_file:
|
||||
return base64.b64encode(image_file.read()).decode('utf-8')
|
||||
|
||||
try:
|
||||
with Image.open(image_path) as img:
|
||||
# Calculate new size maintaining aspect ratio
|
||||
width, height = img.size
|
||||
if max(width, height) > self.resize_max_dimension:
|
||||
if width > height:
|
||||
new_width = self.resize_max_dimension
|
||||
new_height = int(height * (self.resize_max_dimension / width))
|
||||
else:
|
||||
new_height = self.resize_max_dimension
|
||||
new_width = int(width * (self.resize_max_dimension / height))
|
||||
|
||||
logger.debug(f"Resizing image from {width}x{height} to {new_width}x{new_height}")
|
||||
img = img.resize((new_width, new_height), Image.Resampling.LANCZOS)
|
||||
else:
|
||||
logger.debug("Image size is within limits, no resize needed.")
|
||||
|
||||
# Save resized image to a bytes buffer
|
||||
buffer = io.BytesIO()
|
||||
# Handle potential transparency issues when saving as JPEG
|
||||
if self.resize_format.upper() == 'JPEG' and img.mode in ('RGBA', 'P'):
|
||||
img = img.convert('RGB')
|
||||
img.save(buffer, format=self.resize_format)
|
||||
img_bytes = buffer.getvalue()
|
||||
|
||||
return base64.b64encode(img_bytes).decode('utf-8')
|
||||
except Exception as e:
|
||||
logger.error(f"Error processing image {image_path}: {e}")
|
||||
raise # Re-raise the exception to be caught by the caller
|
||||
|
||||
def _is_supported_format(self, file_path: str) -> bool:
|
||||
"""Check if the file format is supported.
|
||||
|
||||
|
@ -111,9 +147,12 @@ class PyNamer:
|
|||
return None
|
||||
|
||||
try:
|
||||
# Encode image
|
||||
base64_image = self._encode_image(image_path)
|
||||
# Resize and encode image
|
||||
base64_image = self._resize_and_encode_image(image_path)
|
||||
|
||||
# Determine the mime type based on the resize format
|
||||
mime_type = f"image/{self.resize_format.lower()}"
|
||||
|
||||
# Prepare messages for LLM
|
||||
system_message = self.config.get('prompt', {}).get('system_message', '')
|
||||
user_message = self.config.get('prompt', {}).get('user_message', '')
|
||||
|
@ -126,7 +165,7 @@ class PyNamer:
|
|||
{"type": "text", "text": user_message},
|
||||
{
|
||||
"type": "image_url",
|
||||
"image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}
|
||||
"image_url": {"url": f"data:{mime_type};base64,{base64_image}"}
|
||||
}
|
||||
]
|
||||
}
|
||||
|
@ -271,4 +310,4 @@ def main():
|
|||
print(f"Failed to process: {image_path}")
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
main()
|
||||
|
|
|
@ -1,2 +1,3 @@
|
|||
litellm>=1.10.0
|
||||
pyyaml>=6.0
|
||||
Pillow>=9.0.0 # Added for image resizing
|
||||
|
|
1
setup.py
1
setup.py
|
@ -30,6 +30,7 @@ setup(
|
|||
install_requires=[
|
||||
"litellm>=1.10.0",
|
||||
"pyyaml>=6.0",
|
||||
"Pillow>=9.0.0",
|
||||
],
|
||||
python_requires=">=3.7",
|
||||
entry_points={
|
||||
|
|
|
@ -1,138 +0,0 @@
|
|||
Metadata-Version: 2.1
|
||||
Name: pynamer
|
||||
Version: 0.1.0
|
||||
Summary: Generate descriptive filenames for images using LLMs
|
||||
Home-page: https://github.com/yourusername/pynamer
|
||||
Author: Your Name
|
||||
Author-email: your.email@example.com
|
||||
Classifier: Development Status :: 3 - Alpha
|
||||
Classifier: Intended Audience :: Developers
|
||||
Classifier: License :: OSI Approved :: MIT License
|
||||
Classifier: Programming Language :: Python :: 3
|
||||
Classifier: Programming Language :: Python :: 3.7
|
||||
Classifier: Programming Language :: Python :: 3.8
|
||||
Classifier: Programming Language :: Python :: 3.9
|
||||
Classifier: Programming Language :: Python :: 3.10
|
||||
Requires-Python: >=3.7
|
||||
Description-Content-Type: text/markdown
|
||||
License-File: LICENSE
|
||||
Requires-Dist: litellm>=1.10.0
|
||||
Requires-Dist: pyyaml>=6.0
|
||||
|
||||
# PyNamer
|
||||
|
||||
PyNamer is a command-line tool that uses AI vision models to generate descriptive filenames for images. It analyzes the content of images and renames them with meaningful, descriptive filenames in snake_case format.
|
||||
|
||||
## Features
|
||||
|
||||
- Uses LiteLLM to integrate with various vision-capable LLMs (default: GPT-4 Vision)
|
||||
- Configurable via YAML config file
|
||||
- Supports multiple image formats (jpg, jpeg, png, gif, webp)
|
||||
- Dry-run mode to preview changes without renaming files
|
||||
- Handles filename collisions automatically
|
||||
|
||||
## Installation
|
||||
|
||||
### Option 1: Install from PyPI (recommended)
|
||||
|
||||
```bash
|
||||
pip install pynamer
|
||||
```
|
||||
|
||||
### Option 2: Install from source
|
||||
|
||||
1. Clone this repository
|
||||
2. Install the package in development mode:
|
||||
|
||||
```bash
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
### Set up your API key
|
||||
|
||||
You need to set up your API key for the vision model:
|
||||
|
||||
- Set the appropriate environment variable (e.g., `OPENAI_API_KEY`), or
|
||||
- Create a custom config file with your API key
|
||||
|
||||
## Configuration
|
||||
|
||||
PyNamer comes with a default configuration, but you can create a custom config file to customize:
|
||||
|
||||
- LLM provider and model
|
||||
- API key and endpoint
|
||||
- Supported image formats
|
||||
- Prompt templates for filename generation
|
||||
|
||||
Example custom config file (config.yaml):
|
||||
|
||||
```yaml
|
||||
llm:
|
||||
provider: "openai"
|
||||
model: "gpt-4-vision-preview"
|
||||
api_key: "your-api-key-here"
|
||||
max_tokens: 100
|
||||
temperature: 0.7
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
After installation, you can use PyNamer directly from the command line:
|
||||
|
||||
Basic usage:
|
||||
|
||||
```bash
|
||||
pynamer path/to/image.jpg
|
||||
```
|
||||
|
||||
Process multiple images:
|
||||
|
||||
```bash
|
||||
pynamer image1.jpg image2.png image3.jpg
|
||||
```
|
||||
|
||||
Use a different config file:
|
||||
|
||||
```bash
|
||||
pynamer -c custom_config.yaml image.jpg
|
||||
```
|
||||
|
||||
Preview changes without renaming (dry run):
|
||||
|
||||
```bash
|
||||
pynamer -d image.jpg
|
||||
```
|
||||
|
||||
Enable verbose logging:
|
||||
|
||||
```bash
|
||||
pynamer -v image.jpg
|
||||
```
|
||||
|
||||
## Example
|
||||
|
||||
Input: `IMG_20230615_123456.jpg` (a photo of a cat sleeping on a window sill)
|
||||
|
||||
Output: `orange_cat_sleeping_on_sunny_windowsill.jpg`
|
||||
|
||||
## Development
|
||||
|
||||
### Building the package
|
||||
|
||||
```bash
|
||||
pip install build
|
||||
python -m build
|
||||
```
|
||||
|
||||
### Installing in development mode
|
||||
|
||||
```bash
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
## Requirements
|
||||
|
||||
- Python 3.7+
|
||||
- LiteLLM
|
||||
- PyYAML
|
||||
- Access to a vision-capable LLM API (OpenAI, Anthropic, etc.)
|
|
@ -1,15 +0,0 @@
|
|||
LICENSE
|
||||
MANIFEST.in
|
||||
README.md
|
||||
pyproject.toml
|
||||
setup.py
|
||||
src/pynamer/__init__.py
|
||||
src/pynamer/cli.py
|
||||
src/pynamer/config.yaml
|
||||
src/pynamer/core.py
|
||||
src/pynamer.egg-info/PKG-INFO
|
||||
src/pynamer.egg-info/SOURCES.txt
|
||||
src/pynamer.egg-info/dependency_links.txt
|
||||
src/pynamer.egg-info/entry_points.txt
|
||||
src/pynamer.egg-info/requires.txt
|
||||
src/pynamer.egg-info/top_level.txt
|
|
@ -1,2 +0,0 @@
|
|||
litellm>=1.10.0
|
||||
pyyaml>=6.0
|
|
@ -1,3 +1,3 @@
|
|||
"""PyNamer - Generate descriptive filenames for images using LLMs."""
|
||||
|
||||
__version__ = "0.1.0"
|
||||
__version__ = "0.2.0"
|
||||
|
|
|
@ -1,4 +1,4 @@
|
|||
"""Core functionality for PyNamer."""
|
||||
"#""Core functionality for PyNamer."""
|
||||
|
||||
import argparse
|
||||
import base64
|
||||
|
|
Loading…
Reference in New Issue