Integrate Spring AI for intelligent file search, summarization, and classification #346
- spring-ai-openai-spring-boot-starter and spring-ai-ollama-spring-boot-starter
- 1.0.0 stable release (from milestone 1.0.0-M4)
- AI package (com.jmal.clouddisk.ai)
  - AiProperties.java - AI configuration properties with provider, API key, model settings
  - AiConfig.java - AI main configuration class supporting OpenAI and Ollama
  - EmbeddingService.java - Generate text vectors with chunking for large texts
  - VectorStoreService.java - Store and search vectors with Lucene KnnVectorField
  - AiSearchService.java - Natural language and semantic/hybrid search
  - FileSummaryService.java - Generate file summaries with AI
  - FileClassificationService.java - Auto classify and tag files
- FileSummaryVO.java, FileClassificationVO.java, TagSuggestionVO.java
- AiController.java with all endpoints under /api/ai
- summary field added to FileDocument
- summary field added to FileMetadataDO

Original prompt
Requirements overview
Integrate Spring AI into the existing JmalCloud private cloud storage project to implement the following AI features:
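Technical implementation requirements
1. Add the Spring AI dependency configuration
Add the following dependencies to pom.xml:
- spring-ai-openai-spring-boot-starter - OpenAI support
- spring-ai-ollama-spring-boot-starter - Ollama local deployment support
2. Create the AI configuration classes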
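A minimal sketch of how the AiProperties class required by this step might look as a plain Java class. The field names and the vectorDimension default of 1536 come from the requirements in this step; every other default, the validate() helper, and the rule that OpenAI needs an API key are assumptions made for illustration. In a Spring Boot application a class like this would typically be bound with @ConfigurationProperties; the annotation is left out so the example has no framework dependencies.

```java
import java.util.Set;

/** Hypothetical sketch of the AI configuration properties; not the PR's actual class. */
class AiProperties {
    private static final Set<String> PROVIDERS = Set.of("openai", "ollama");

    private boolean enabled = false;      // assumed default: AI features switched off
    private String provider = "openai";   // assumed default provider
    private String apiKey;
    private String baseUrl;
    private String model;
    private String embeddingModel;
    private double temperature = 0.7;     // assumed default
    private int maxTokens = 1024;         // assumed default
    private int vectorDimension = 1536;   // default stated in the requirements

    boolean isEnabled() { return enabled; }
    void setEnabled(boolean enabled) { this.enabled = enabled; }
    String getProvider() { return provider; }
    void setProvider(String provider) { this.provider = provider; }
    String getApiKey() { return apiKey; }
    void setApiKey(String apiKey) { this.apiKey = apiKey; }
    String getBaseUrl() { return baseUrl; }
    void setBaseUrl(String baseUrl) { this.baseUrl = baseUrl; }
    String getModel() { return model; }
    void setModel(String model) { this.model = model; }
    String getEmbeddingModel() { return embeddingModel; }
    void setEmbeddingModel(String embeddingModel) { this.embeddingModel = embeddingModel; }
    double getTemperature() { return temperature; }
    void setTemperature(double temperature) { this.temperature = temperature; }
    int getMaxTokens() { return maxTokens; }
    void setMaxTokens(int maxTokens) { this.maxTokens = maxTokens; }
    int getVectorDimension() { return vectorDimension; }
    void setVectorDimension(int vectorDimension) { this.vectorDimension = vectorDimension; }

    /** Fails fast on an inconsistent configuration; skipped entirely while AI is disabled. */
    void validate() {
        if (!enabled) {
            return;
        }
        if (provider == null || !PROVIDERS.contains(provider)) {
            throw new IllegalStateException("Unsupported AI provider: " + provider);
        }
        // Assumption: only the hosted OpenAI provider needs an API key.
        if ("openai".equals(provider) && (apiKey == null || apiKey.isBlank())) {
            throw new IllegalStateException("apiKey is required for the openai provider");
        }
        if (vectorDimension <= 0) {
            throw new IllegalStateException("vectorDimension must be positive");
        }
    }
}
```

Validating once at startup keeps a misconfigured provider from surfacing later as a failed embedding or chat call.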
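Create the src/main/java/com/jmal/clouddisk/ai/ package, containing:
- AiConfig.java - main AI configuration class; supports switching between multiple LLM providers
- AiProperties.java - AI configuration properties class, containing:
  - provider - provider type (openai/ollama)
  - apiKey - API key
  - baseUrl - API base URL
  - model - model name
  - embeddingModel - embedding model name
  - temperature - temperature parameter
  - maxTokens - maximum number of tokens
  - enabled - whether AI features are enabled
  - vectorDimension - vector dimension (default 1536)
3. Implement the vector store service - Lucene-based vector index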
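To make the contract of the vector store in this step concrete, here is a small sketch of its three operations (storeVector, searchSimilar, deleteVector). It is deliberately not Lucene: it swaps the KNN index for an in-memory map with brute-force cosine similarity so that it runs without dependencies. The class name InMemoryVectorStore and the Hit record are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for VectorStoreService; the PR targets a Lucene index instead. */
class InMemoryVectorStore {
    /** One search hit: the file id and its cosine similarity to the query vector. */
    record Hit(String fileId, double score) {}

    private final int dimension;
    private final Map<String, float[]> vectors = new ConcurrentHashMap<>();

    InMemoryVectorStore(int dimension) {
        this.dimension = dimension;
    }

    /** Stores (or overwrites) the vector for a file. */
    void storeVector(String fileId, float[] vector) {
        checkDimension(vector);
        vectors.put(fileId, vector.clone());
    }

    /** Removes the vector for a file, if any. */
    void deleteVector(String fileId) {
        vectors.remove(fileId);
    }

    /** Returns up to topK files, most similar first, by scanning every stored vector. */
    List<Hit> searchSimilar(float[] queryVector, int topK) {
        checkDimension(queryVector);
        List<Hit> hits = new ArrayList<>();
        for (Map.Entry<String, float[]> entry : vectors.entrySet()) {
            hits.add(new Hit(entry.getKey(), cosine(queryVector, entry.getValue())));
        }
        hits.sort(Comparator.comparingDouble(Hit::score).reversed());
        int limit = Math.min(Math.max(topK, 0), hits.size());
        return new ArrayList<>(hits.subList(0, limit));
    }

    private void checkDimension(float[] vector) {
        if (vector.length != dimension) {
            throw new IllegalArgumentException(
                "Expected dimension " + dimension + " but got " + vector.length);
        }
    }

    /** Cosine similarity; defined as 0 when either vector has zero length. */
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

A linear scan like this is only reasonable for small collections; an approximate KNN index such as the Lucene one this step calls for avoids comparing the query against every stored vector.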
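Create the following classes:
- VectorStoreService.java - vector store service
  - Store document vectors with KnnVectorField
  - Integrate vectorization into the LuceneService indexing flow
  - Similarity search (KnnVectorQuery)
  - storeVector(String fileId, float[] vector) - store a vector
  - searchSimilar(float[] queryVector, int topK) - similarity search
  - deleteVector(String fileId) - delete a vector
- EmbeddingService.java - embedding generation service
  - embed(String text) - generate a text embedding
  - embedBatch(List<String> texts) - generate embeddings in batch
4. Implement the AI search service - natural language query conversion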
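The search service in this step includes a hybrid mode that combines keyword and vector results, but the prompt does not say how the two result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below as an assumption rather than the PR's method. The class name HybridRanker is hypothetical, the constant 60 is the value conventionally used with RRF, and each input list is assumed to contain every file id at most once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that merges two ranked id lists with reciprocal rank fusion. */
class HybridRanker {
    private static final int K = 60; // conventional RRF smoothing constant

    /**
     * Fuses keyword and vector rankings (best hit first in each list) and
     * returns up to topK file ids ordered by their combined RRF score.
     */
    static List<String> fuse(List<String> keywordHits, List<String> vectorHits, int topK) {
        Map<String, Double> scores = new HashMap<>();
        accumulate(scores, keywordHits);
        accumulate(scores, vectorHits);

        List<String> ids = new ArrayList<>(scores.keySet());
        ids.sort((x, y) -> {
            int byScore = Double.compare(scores.get(y), scores.get(x));
            return byScore != 0 ? byScore : x.compareTo(y); // stable order on ties
        });
        int limit = Math.min(Math.max(topK, 0), ids.size());
        return new ArrayList<>(ids.subList(0, limit));
    }

    /** Adds 1 / (K + rank) to each id's score, with rank starting at 1. */
    private static void accumulate(Map<String, Double> scores, List<String> rankedIds) {
        for (int i = 0; i < rankedIds.size(); i++) {
            scores.merge(rankedIds.get(i), 1.0 / (K + i + 1), Double::sum);
        }
    }
}
```

Because RRF only looks at ranks, it sidesteps the problem that keyword scores and cosine similarities live on different scales.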
Create AiSearchService.java:
- naturalLanguageSearch(String query, String userId) - natural language search
- semanticSearch(String query, String userId, int topK) - semantic vector search
- hybridSearch(String query, String userId) - hybrid search (keyword + vector)
Prompt template example:
5. Implement the file summary service - automatic file summary generation
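The summary service in this step distinguishes between generating, fetching, and regenerating a summary. The sketch below shows one plausible reading of those semantics: generateSummary reuses an existing summary, regenerateSummary always recomputes. That reading is an assumption. The class name FileSummarySketch is hypothetical, the two functions stand in for the file content lookup and the LLM call, and the in-memory map stands in for the summary field on the file document.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch of the FileSummaryService method semantics. */
class FileSummarySketch {
    private final Function<String, String> contentLoader; // fileId -> text content (stand-in)
    private final Function<String, String> summarizer;    // text -> summary (stand-in for the LLM call)
    private final Map<String, String> summaries = new ConcurrentHashMap<>();

    FileSummarySketch(Function<String, String> contentLoader, Function<String, String> summarizer) {
        this.contentLoader = contentLoader;
        this.summarizer = summarizer;
    }

    /** Returns the stored summary, if one exists; never calls the model. */
    Optional<String> getSummary(String fileId) {
        return Optional.ofNullable(summaries.get(fileId));
    }

    /** Returns the stored summary, generating it only when none exists yet. */
    String generateSummary(String fileId) {
        return summaries.computeIfAbsent(
            fileId, id -> summarizer.apply(contentLoader.apply(id)));
    }

    /** Always calls the model again and overwrites the stored summary. */
    String regenerateSummary(String fileId) {
        String summary = summarizer.apply(contentLoader.apply(fileId));
        summaries.put(fileId, summary);
        return summary;
    }

    /** Generates summaries for several files, keeping the input order. */
    Map<String, String> batchGenerateSummary(List<String> fileIds) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String fileId : fileIds) {
            result.put(fileId, generateSummary(fileId));
        }
        return result;
    }
}
```

Reusing stored summaries in generateSummary keeps batch runs from paying for a model call on files that were already summarized.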
Create FileSummaryService.java:
- generateSummary(String fileId) - generate a file summary
- getSummary(String fileId) - get the existing summary
- regenerateSummary(String fileId) - regenerate the summary
- batchGenerateSummary(List<String> fileIds) - batch generation
Prompt template:
6. Implement the smart classification service - automatic classification and tagging
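Create FileClassificationService.java:
- classifyFile(String fileId) - classify a file
- suggestTags(String fileId) - suggest tags
- autoTagFile(String fileId) - tag a file automatically
- batchClassify(List<String> fileIds) - batch classification
Predefined categories:
Prompt template: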
7. Add the REST API endpoints
Create AiController.java (path: /api/ai):
8. Data model extensions
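Create new VO/DTO classes:
- FileSummaryVO.java - file summary response
- FileClassificationVO.java - file classification response
- AiStatusVO.java - AI status response
- AiConfigDTO.java - AI configuration request
Add new fields in the database (via MongoDB documents or by extending the existing models):
- summary - file summary
- aiCategory - AI category
- aiTags - AI-suggested tags
- embedding - vector (optional; may instead be stored in Lucene)
9. Integrate vectorization into the existing indexing flow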
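The flow for this step (read the file content, generate an embedding, store it) can be sketched without Lucene or Spring AI. The sketch below also splits large texts into chunks, as the PR summary mentions for EmbeddingService, and averages the chunk vectors into one vector per file. The averaging, the character-based chunking, and the names VectorIndexingHook and onContentRead are assumptions; the two functional parameters stand in for EmbeddingService.embedBatch and VectorStoreService.storeVector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;

/** Hypothetical hook that an indexing method could call once file content has been read. */
class VectorIndexingHook {
    private final Function<List<String>, List<float[]>> embedBatch; // stand-in for EmbeddingService.embedBatch
    private final BiConsumer<String, float[]> storeVector;          // stand-in for VectorStoreService.storeVector
    private final int chunkSize;
    private final int overlap;

    VectorIndexingHook(Function<List<String>, List<float[]>> embedBatch,
                       BiConsumer<String, float[]> storeVector,
                       int chunkSize, int overlap) {
        if (overlap < 0 || chunkSize <= overlap) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        this.embedBatch = embedBatch;
        this.storeVector = storeVector;
        this.chunkSize = chunkSize;
        this.overlap = overlap;
    }

    /** Splits text into windows of chunkSize chars; consecutive windows share `overlap` chars. */
    static List<String> chunk(String text, int chunkSize, int overlap) {
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    /** Embeds the content and stores one vector for the file; returns false if nothing was stored. */
    boolean onContentRead(String fileId, String content) {
        if (content == null || content.isBlank()) {
            return false;
        }
        List<float[]> vectors = embedBatch.apply(chunk(content, chunkSize, overlap));
        if (vectors.isEmpty()) {
            return false;
        }
        storeVector.accept(fileId, meanPool(vectors));
        return true;
    }

    /** Element-wise average of the chunk vectors (an assumed pooling strategy). */
    static float[] meanPool(List<float[]> vectors) {
        float[] mean = new float[vectors.get(0).length];
        for (float[] vector : vectors) {
            for (int i = 0; i < mean.length; i++) {
                mean[i] += vector[i];
            }
        }
        for (int i = 0; i < mean.length; i++) {
            mean[i] /= vectors.size();
        }
        return mean;
    }
}
```

Storing one pooled vector per file keeps the index at one entry per document; storing a vector per chunk is the alternative when passages inside large files need to be retrievable on their own.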
Modify LuceneService.java:
- In the updateIndex method, after the file content is read, call EmbeddingService to generate a vector
- Store the vector in the index (KnnVectorField)
This pull request was created as a result of the following prompt from Copilot chat.