docs: add comprehensive documentation

- Add contributing guidelines to README
- Add detailed architecture documentation
- Add text analysis task documentation
- Include usage examples and best practices
2025-01-14 21:05:33 +08:00
commit 6143abc83b
parent 0cfcd3c576
3 changed files with 424 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -232,6 +232,100 @@ class CustomTaskExecutor(TaskExecutor):
 ### Testing
 pytest tests/test_config.py -v
 ## Contributing
 ### Development Setup
 1. Clone the repository:
 ```bash
 git clone http://gitea.towards-agi.cn/zhukang/agent-task-executor.git
 cd agent-task-executor
 ```
 2. Set up the development environment:
 ```bash
 # Install uv if not installed
 pip install uv
 # Create virtual environment and install dependencies
 uv venv
 uv pip install -e ".[dev]"
 ```
 3. Run tests:
 ```bash
 python -m pytest
 ```
 ### Code Style
 This project uses:
 - `black` for code formatting
 - `isort` for import sorting
 - `mypy` for type checking
 Format your code before committing:
 ```bash
 # Format code
 black .
 # Sort imports
 isort .
 # Type check
 mypy .
 ```
 ### Creating New Tasks
 1. Create a new task class in `tasksamples/` that inherits from `TaskExecutor`
 2. Define task steps in the constructor
 3. Implement step handlers
 4. Add tests in `tests/`
 Example:
 ```python
 from agent_task_executor.task_executor import TaskExecutor, TaskStatus
 class MyTaskExecutor(TaskExecutor):
    def __init__(self):
        super().__init__(llm_model="deepseek-chat")
        self.task_steps = [
            {
                "id": "step1",
                "name": "Step One",
                "required_info": ["input_data"],
                "instruction": "Process the input data."
            },
            # Add more steps...
        ]
    async def handle_step1(self, step_input: dict) -> dict:
        # Implement step logic
        return {"result": "processed data"}
 ```
 ### Pull Request Process
 1. Create a new branch for your feature
 2. Write tests for new functionality
 3. Update documentation as needed
 4. Submit a pull request with a clear description of changes
 ### Commit Messages
 Follow the [Conventional Commits](https://www.conventionalcommits.org/) specification:
 - `feat:` New features
 - `fix:` Bug fixes
 - `docs:` Documentation changes
 - `test:` Adding or updating tests
 - `refactor:` Code refactoring
 - `chore:` Maintenance tasks
 Example:
 ```
 feat: add new text classification task
 ```
 ## License
 MIT License
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -0,0 +1,155 @@
 # Architecture Overview
 ## Core Components
 ### TaskExecutor
 The `TaskExecutor` is the base class for all task implementations. It provides:
 1. **Task Step Management**
   - Sequential step execution
   - State tracking
   - Checkpoint creation
   - Error handling
 2. **LLM Integration**
   - Asynchronous API calls
   - Retry mechanisms
   - Response validation
 ### Configuration System
 1. **Config Loader**
   - YAML configuration files
   - Environment variable support
   - Configuration validation
 2. **Secure Configuration**
   - Encrypted storage for sensitive data
   - Key management
   - Secure API key handling
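As a rough illustration of how environment variable support and configuration validation could fit together, the sketch below overlays environment variables on a dict of file values (standing in for a parsed YAML file) and then validates the result. `LLMConfig`, `load_llm_config`, the `ATE_` prefix, the defaults, and the accepted temperature range are all hypothetical and are not taken from the project's actual loader.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMConfig:
    """Hypothetical container for LLM settings; not the project's real class."""
    provider: str
    model: str
    temperature: float
    max_tokens: int


def load_llm_config(
    file_values: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
    prefix: str = "ATE_",  # assumed prefix, chosen only for this sketch
) -> LLMConfig:
    """Overlay environment variables on file values, then validate."""
    env = os.environ if env is None else env
    merged = dict(file_values)
    for key in ("provider", "model", "temperature", "max_tokens"):
        env_key = prefix + key.upper()
        if env_key in env:
            merged[key] = env[env_key]  # environment wins over the file

    missing = [k for k in ("provider", "model") if not merged.get(k)]
    if missing:
        raise ValueError(f"missing required settings: {missing}")

    config = LLMConfig(
        provider=str(merged["provider"]),
        model=str(merged["model"]),
        temperature=float(merged.get("temperature", 0.7)),
        max_tokens=int(merged.get("max_tokens", 2000)),
    )
    # The accepted ranges below are assumptions, not provider limits.
    if not 0.0 <= config.temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if config.max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return config
```

Passing `env` explicitly keeps the function easy to test; in normal use it falls back to `os.environ`.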
 ## Task Implementation
 ### Step Definition
 Each task is defined as a series of steps:
 ```python
 self.task_steps = [
    {
        "id": "step_id",
        "name": "Step Name",
        "required_info": ["required_data"],
        "instruction": "Step instruction for LLM"
    }
 ]
 ```
 ### Step Handlers
 Each step has a matching handler, implemented as an async method. The examples in this commit name the handler `handle_<step id>`:
 ```python
 async def handle_step_id(self, step_input: dict) -> dict:
    # 1. Process input
    processed_data = self.preprocess(step_input)
    # 2. Call LLM if needed
    llm_response = await self.llm_call(
        instruction=step_input["instruction"],
        context=processed_data
    )
    # 3. Process response
    result = self.postprocess(llm_response)
    return {"result": result}
 ```
 ## Execution Flow
 1. **Initialization**
   ```python
   executor = TaskExecutor(llm_model="model_name")
   ```
 2. **Task Setup**
   ```python
   executor.task_steps = [...]
   ```
 3. **Execution**
   ```python
   result = await executor.execute(input_data)
   ```
 4. **Step Processing**
   - Validate input
   - Execute step handler
   - Create checkpoint
   - Handle errors
   - Move to next step
 5. **Completion**
   - Return final result
   - Clean up resources
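The flow above can be pictured with a small stand-alone loop. This is only a sketch under stated assumptions: `SketchExecutor` is a stand-in for `TaskExecutor`, the lookup of handlers via `getattr` and the in-memory `checkpoints` list are guesses at how step processing might work, and the two steps are made up for the example.

```python
import asyncio


class SketchExecutor:
    """Stand-in for TaskExecutor; names and behaviour here are assumptions."""

    def __init__(self):
        self.task_steps = [
            {"id": "clean", "required_info": ["text"]},
            {"id": "count", "required_info": ["text"]},
        ]
        self.checkpoints = []  # one snapshot of the state per completed step

    async def handle_clean(self, step_input: dict) -> dict:
        return {"text": step_input["text"].strip()}

    async def handle_count(self, step_input: dict) -> dict:
        return {"length": len(step_input["text"])}

    async def execute(self, input_data: dict) -> dict:
        state = dict(input_data)
        for step in self.task_steps:
            # Validate input: every required key must already be in the state.
            missing = [k for k in step["required_info"] if k not in state]
            if missing:
                raise ValueError(f"step {step['id']} is missing {missing}")
            # Execute the handler, looked up by the handle_<id> naming pattern.
            handler = getattr(self, f"handle_{step['id']}")
            state.update(await handler(state))
            # Create a checkpoint so a later failure could resume from here.
            self.checkpoints.append((step["id"], dict(state)))
        return state


result = asyncio.run(SketchExecutor().execute({"text": "  hello  "}))
print(result)  # {'text': 'hello', 'length': 5}
```

Each handler returns only the keys it produced, and the loop merges them into a shared state, so later steps can list earlier outputs in `required_info`.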
 ## Error Handling
 1. **Retry Mechanism**
   - API call retries with exponential backoff
   - Configurable retry limits
 2. **Error Types**
   - `TaskExecutionError`: General execution errors
   - `StepValidationError`: Input validation failures
   - `LLMError`: LLM API related errors
 3. **Recovery**
   - Checkpoint-based recovery
   - State restoration
   - Partial results handling
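A minimal sketch of retries with exponential backoff, for readers who want to see the idea in code. The error class names come from the list above, but their hierarchy is assumed; `call_with_retry`, its parameters, and the jitter are hypothetical and may differ from the project's retry mechanism.

```python
import asyncio
import random


class TaskExecutionError(Exception):
    """General execution error (name taken from the list above)."""


class LLMError(TaskExecutionError):
    """LLM API related error; subclassing TaskExecutionError is an assumption."""


async def call_with_retry(call, max_retries=3, base_delay=0.5, sleep=asyncio.sleep):
    """Await call() and retry on LLMError, doubling the delay each attempt."""
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except LLMError:
            if attempt == max_retries:
                raise  # retry budget exhausted: surface the last error
            # base_delay, 2x, 4x, ... plus up to 10% jitter so that
            # concurrent tasks do not all retry at the same instant.
            delay = base_delay * (2 ** attempt)
            await sleep(delay + random.uniform(0, delay * 0.1))
```

Only `LLMError` is retried here; validation failures are usually deterministic, so retrying them would just repeat the same failure.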
 ## Best Practices
 1. **Task Design**
   - Keep steps atomic and focused
   - Clear step instructions
   - Proper input validation
   - Comprehensive error handling
 2. **LLM Usage**
   - Clear and specific prompts
   - Response validation
   - Handle token limits
   - Consider cost and latency
 3. **Testing**
   - Unit tests for each step
   - Integration tests for full flow
   - Mock LLM calls in tests
   - Test error scenarios
 4. **Security**
   - Secure API key handling
   - Input sanitization
   - Output validation
   - Access control
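To make the "Mock LLM calls in tests" item under Testing concrete, here is one possible shape for such tests using `unittest.mock.AsyncMock` from the standard library. `SummaryExecutor` and `handle_summary` are hypothetical; the `llm_call(instruction=..., context=...)` signature is borrowed from the step handler example earlier in this document and may not match the real method.

```python
import asyncio
from unittest.mock import AsyncMock


class SummaryExecutor:
    """Hypothetical executor used only to illustrate mocking."""

    async def llm_call(self, instruction: str, context: str) -> str:
        raise RuntimeError("real LLM calls are not available in tests")

    async def handle_summary(self, step_input: dict) -> dict:
        response = await self.llm_call(
            instruction="Summarize the text.", context=step_input["text"]
        )
        return {"result": response.strip()}


def test_handle_summary_uses_llm_response():
    executor = SummaryExecutor()
    # Replace the network-bound method with a canned async response.
    executor.llm_call = AsyncMock(return_value="  a short summary \n")
    result = asyncio.run(executor.handle_summary({"text": "long text"}))
    assert result == {"result": "a short summary"}
    executor.llm_call.assert_awaited_once_with(
        instruction="Summarize the text.", context="long text"
    )


def test_handle_summary_propagates_llm_failure():
    executor = SummaryExecutor()
    # side_effect makes the mocked call raise, simulating an API failure.
    executor.llm_call = AsyncMock(side_effect=TimeoutError("LLM timed out"))
    try:
        asyncio.run(executor.handle_summary({"text": "long text"}))
    except TimeoutError:
        pass
    else:
        raise AssertionError("expected TimeoutError")
```

The second test covers the "Test error scenarios" item: the mock raises instead of returning, so the handler's failure path runs without any network access.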
 ## Extension Points
 1. **Custom Step Handlers**
   - Implement custom logic
   - Add new capabilities
   - Integrate external services
 2. **LLM Providers**
   - Support multiple providers
   - Custom response parsing
   - Provider-specific optimizations
 3. **Monitoring & Logging**
   - Custom metrics
   - Logging handlers
   - Performance monitoring
--- a/docs/text_analysis.md
+++ b/docs/text_analysis.md
@@ -0,0 +1,175 @@
 # Text Analysis Task
 ## Overview
 The Text Analysis Task analyzes text content, with dedicated support for Chinese text. It demonstrates how an LLM can be used for summarization, keyword extraction, and a final structured analysis.
 ## Features
 1. **Input Validation**
   - UTF-8 encoding validation
   - Chinese character detection
   - Text length and format checks
 2. **Text Preprocessing**
   - Chinese punctuation normalization
   - Whitespace handling
   - Character standardization
 3. **Summary Generation**
   - Concise text summarization
   - Key point extraction
   - Main idea identification
 4. **Keyword Extraction**
   - Important term identification
   - Topic-related keyword extraction
   - Frequency and relevance analysis
 5. **Final Analysis**
   - Comprehensive text analysis
   - Structured report generation
   - Multi-aspect evaluation
 ## Usage
 ```python
 import asyncio
 from agent_task_executor.tasksamples.text_analysis_task import TextAnalysisExecutor
 async def main():
     # Create executor instance
     executor = TextAnalysisExecutor()
     # Prepare input text
     text = """
 从ChatGPT到Devin:AI编程的四个发展阶段与范式转变。
 AI编程从ChatGPT出现到现在也就两年出头的时间，但已经经历了四个阶段...
 """
     # Execute analysis (await is only valid inside a coroutine)
     result = await executor.execute({"text": text})
     print(result)
 # Run the coroutine from synchronous code
 asyncio.run(main())
 ```
 ## Implementation Details
 ### Step 1: Input Validation
 ```python
 async def handle_input_validation(self, step_input: dict) -> dict:
    """
    Validates input text for:
    - Non-empty content
    - Valid Chinese characters
    - Proper UTF-8 encoding
    """
 ```
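The handler body is not shown in this commit, so the following is only a sketch of the three checks named in the docstring. `validate_text`, the `max_chars` limit, and the choice of the basic CJK Unified Ideographs block as the definition of "Chinese character" are assumptions made for the example.

```python
import re

# CJK Unified Ideographs (basic block); an assumption about what counts as Chinese.
_CHINESE_CHAR = re.compile(r"[\u4e00-\u9fff]")


def validate_text(raw, max_chars=20000):
    """Return the decoded text, or raise ValueError describing the problem."""
    if isinstance(raw, bytes):
        try:
            raw = raw.decode("utf-8")  # strict decoding rejects invalid UTF-8
        except UnicodeDecodeError as exc:
            raise ValueError("text is not valid UTF-8") from exc
    if not isinstance(raw, str) or not raw.strip():
        raise ValueError("text is empty")
    if len(raw) > max_chars:
        raise ValueError(f"text is longer than {max_chars} characters")
    if not _CHINESE_CHAR.search(raw):
        raise ValueError("text contains no Chinese characters")
    return raw
```

The encoding check only applies to `bytes` input: a Python `str` is already decoded, so UTF-8 validity has to be checked where the bytes are read.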
 ### Step 2: Text Preprocessing
 ```python
 async def handle_text_preprocessing(self, step_input: dict) -> dict:
    """
    Preprocesses text by:
    1. Normalizing Chinese punctuation
    2. Handling whitespace
    3. Standardizing characters
    """
 ```
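One way the three preprocessing steps might look, using only the standard library. This is a sketch, not the task's implementation: `preprocess_text` is a hypothetical helper, and both the direction of punctuation normalization (half-width to full-width) and the small set of mapped characters are assumptions.

```python
import re
import unicodedata

# Assumed mapping: half-width punctuation to its full-width Chinese form.
_FULL_WIDTH = str.maketrans({",": "，", ";": "；", ":": "：", "?": "？", "!": "！"})


def preprocess_text(text: str) -> str:
    # Standardize characters: NFKC folds full-width letters and digits
    # (for example "ＡＩ" becomes "AI") and full-width punctuation to ASCII.
    text = unicodedata.normalize("NFKC", text)
    # Handle whitespace: collapse runs of spaces, tabs and newlines.
    text = re.sub(r"\s+", " ", text).strip()
    # Normalize punctuation: map the ASCII forms to one full-width style.
    return text.translate(_FULL_WIDTH)


print(preprocess_text("从ChatGPT到Devin:AI编程"))  # 从ChatGPT到Devin：AI编程
```

The blanket mapping is deliberately crude: it would also rewrite the comma in `1,000` or the colon in a URL, so a real implementation would likely need to be more selective.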
 ### Step 3: Summary Generation
 ```python
 async def handle_generate_summary(self, step_input: dict) -> dict:
    """
    Generates text summary using LLM:
    1. Extracts main points
    2. Creates concise summary
    3. Maintains key information
    """
 ```
 ### Step 4: Keyword Extraction
 ```python
 async def handle_extract_keywords(self, step_input: dict) -> dict:
    """
    Extracts keywords:
    1. Identifies important terms
    2. Analyzes frequency and relevance
    3. Returns structured list
    """
 ```
 ### Step 5: Final Analysis
 ```python
 async def handle_final_analysis(self, step_input: dict) -> dict:
    """
    Performs comprehensive analysis:
    1. Combines all previous results
    2. Generates structured report
    3. Provides detailed insights
    """
 ```
 ## Configuration
 The task uses the following LLM configuration:
 ```yaml
 llm:
  provider: deepseek
  model: deepseek-chat
  temperature: 0.7
  max_tokens: 2000
 ```
 ## Error Handling
 1. **Input Errors**
   - Invalid encoding
   - Empty text
   - Non-Chinese content
 2. **Processing Errors**
   - LLM API failures
   - Token limit exceeded
   - Response parsing errors
 3. **Output Validation**
   - Result structure validation
   - Content quality checks
   - Format verification
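Response parsing errors and result structure validation can be handled in one place. The sketch below assumes the keyword step asks the LLM for a JSON array of strings; that output format, `ResponseParsingError`, and `parse_keywords` are hypothetical and not confirmed by the task's code.

```python
import json

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in


class ResponseParsingError(ValueError):
    """Hypothetical error for LLM output that cannot be used."""


def parse_keywords(llm_response: str, max_keywords: int = 10) -> list:
    """Parse a JSON array of keyword strings, tolerating a Markdown fence."""
    body = llm_response.strip()
    if body.startswith(FENCE):
        # Drop the opening fence line (with optional language tag)
        # and everything from the closing fence onwards.
        body = body.split("\n", 1)[1] if "\n" in body else ""
        body = body.rsplit(FENCE, 1)[0]
    try:
        data = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ResponseParsingError(f"response is not valid JSON: {exc}") from exc
    # Structure validation: must be a list containing only strings.
    if not isinstance(data, list) or not all(isinstance(k, str) for k in data):
        raise ResponseParsingError("expected a JSON array of strings")
    # Strip, drop empties, and de-duplicate while keeping first occurrences.
    keywords = list(dict.fromkeys(k.strip() for k in data if k.strip()))
    if not keywords:
        raise ResponseParsingError("no keywords returned")
    return keywords[:max_keywords]
```

Raising a dedicated error type lets the caller decide whether to retry the LLM call or fail the step.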
 ## Best Practices
 1. **Text Input**
   - Proper encoding (UTF-8)
   - Reasonable text length
   - Clean input formatting
 2. **LLM Prompts**
   - Clear instructions
   - Specific requirements
   - Example outputs
 3. **Result Processing**
   - Validate all outputs
   - Handle edge cases
   - Maintain text quality
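For the "Reasonable text length" item, long inputs could be split before they reach the LLM. The sketch below is one possible approach and not part of the task: `split_into_chunks` is hypothetical, and it counts characters as a rough proxy for tokens, since the real limit depends on the model's tokenizer.

```python
import re


def split_into_chunks(text: str, max_chars: int = 1000) -> list:
    """Split text at sentence ends so each chunk has at most max_chars characters."""
    # The lookbehind keeps each terminator attached to its sentence;
    # it covers Chinese and ASCII sentence ends as well as newlines.
    sentences = [s for s in re.split(r"(?<=[。！？!?\n])", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        pieces = [sentence[i:i + max_chars] for i in range(0, len(sentence), max_chars)]
        for piece in pieces:
            if current and len(current) + len(piece) > max_chars:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks


print(split_into_chunks("第一句。第二句！第三句？", max_chars=8))
# ['第一句。第二句！', '第三句？']
```

Each chunk can then be summarized separately and the partial results combined in a later step.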
 ## Extensions
 1. **Language Support**
   - Add support for other languages
   - Language detection
   - Multi-language analysis
 2. **Analysis Types**
   - Sentiment analysis
   - Topic classification
   - Entity recognition
 3. **Output Formats**
   - Custom report formats
   - Export options
   - Integration capabilities