Add filter pipeline core infrastructure (Phase 1)
Implements plugin-based content filtering system with multi-level caching:

Core Components:
- FilterEngine: Main orchestrator for content filtering
- FilterCache: 3-level caching (memory, AI results, filterset results)
- FilterConfig: Configuration loader for filter_config.json & filtersets.json
- FilterResult & AIAnalysisResult: Data models for filter results

Architecture:
- BaseStage: Abstract class for pipeline stages
- BaseFilterPlugin: Abstract class for filter plugins
- Multi-threaded parallel processing support
- Content-hash based AI result caching (cost savings)
- Filterset result caching (fast filterset switching)

Configuration:
- filter_config.json: AI models, caching, parallel workers
- Using only Llama 70B for cost efficiency
- Compatible with existing filtersets.json

Integration:
- apply_filterset() API compatible with user preferences
- process_batch() for batch post processing
- Lazy-loaded stages to avoid import errors when AI disabled

Related to issue #8 (filtering engine implementation)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
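
Reviewer note: the "content-hash based AI result caching" item could look roughly like the sketch below. This is not the code from this commit. `content_hash`, `AIResultCache`, and `analyze_cached` are hypothetical names, and the one-JSON-file-per-hash layout is an assumption chosen only to illustrate the idea: identical content maps to the same key, so the model is called at most once per distinct post.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Optional


def content_hash(title: str, body: str) -> str:
    """Return a stable SHA-256 hex digest of the text sent to the model."""
    payload = json.dumps({"title": title, "body": body}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class AIResultCache:
    """Hypothetical on-disk cache: one JSON file per content hash."""

    def __init__(self, cache_dir: str) -> None:
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def get(self, key: str) -> Optional[dict]:
        path = self.cache_dir / f"{key}.json"
        if not path.exists():
            return None
        return json.loads(path.read_text(encoding="utf-8"))

    def put(self, key: str, result: dict) -> None:
        path = self.cache_dir / f"{key}.json"
        path.write_text(json.dumps(result), encoding="utf-8")


def analyze_cached(
    cache: AIResultCache,
    title: str,
    body: str,
    analyze: Callable[[str, str], dict],
) -> dict:
    """Call `analyze` only when no cached result exists for this content."""
    key = content_hash(title, body)
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = analyze(title, body)  # the expensive model call (assumed)
    cache.put(key, result)
    return result
```

Keying on the content rather than on a post ID means an edited post gets a fresh analysis, while reposts of identical text reuse the stored one.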
27
filter_config.json
Normal file
@@ -0,0 +1,27 @@
{
  "ai": {
    "enabled": false,
    "openrouter_key_file": "openrouter_key.txt",
    "models": {
      "cheap": "meta-llama/llama-3.3-70b-instruct",
      "smart": "meta-llama/llama-3.3-70b-instruct"
    },
    "parallel_workers": 10,
    "timeout_seconds": 60,
    "note": "Using only Llama 70B for cost efficiency"
  },
  "cache": {
    "enabled": true,
    "ai_cache_dir": "data/filter_cache",
    "filterset_cache_ttl_hours": 24
  },
  "pipeline": {
    "default_stages": ["categorizer", "moderator", "filter", "ranker"],
    "batch_size": 50,
    "enable_parallel": true
  },
  "output": {
    "filtered_dir": "data/filtered",
    "save_rejected": false
  }
}
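
Reviewer note: a loader for this file might merge it over built-in defaults so that a missing file or a missing key does not break the pipeline. The sketch below is an assumption about how that could work, not the `FilterConfig` implementation from this commit. `DEFAULTS`, `deep_merge`, and `load_filter_config` are hypothetical names, and the default values simply mirror the ones in `filter_config.json` above.

```python
import copy
import json
from pathlib import Path
from typing import Any, Dict

# Assumed fallback values, copied from the committed filter_config.json.
DEFAULTS: Dict[str, Any] = {
    "ai": {"enabled": False, "parallel_workers": 10, "timeout_seconds": 60},
    "cache": {
        "enabled": True,
        "ai_cache_dir": "data/filter_cache",
        "filterset_cache_ttl_hours": 24,
    },
    "pipeline": {"batch_size": 50, "enable_parallel": True},
}


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` with `override` applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def load_filter_config(path: str = "filter_config.json") -> Dict[str, Any]:
    """Load the config file, falling back to DEFAULTS for anything missing."""
    config_path = Path(path)
    if not config_path.exists():
        return copy.deepcopy(DEFAULTS)
    with config_path.open(encoding="utf-8") as fh:
        return deep_merge(DEFAULTS, json.load(fh))
```

With this shape, a file containing only `{"ai": {"enabled": true}}` would turn AI on while keeping the default worker count and timeout.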