Changelog
This update introduces the Confluence tool for collaboration, improves tool compatibility, and resolves deep copy issues for Ollama chat agents.
New Features:
- Confluence Tool: Added a new tool using the Atlassian Confluence SDK, enabling operations such as listing pages in a space, creating and updating pages, retrieving page content by title, and getting space details.
Improvements:
- Tool Compatibility: Enhanced older custom functions with manually specified descriptions and parameters to align with the updated tool-building system.
Bug Fixes:
- Deep Copy for Ollama Chat Agents: Addressed an issue where manually set clients caused errors during agent model copying, ensuring all properties are properly handled.
This release enhances multimodal capabilities with audio support, improves session page performance, and fixes various bugs for better stability and usability.
New Features
- Audio Response Support: Added support for audio responses, enhancing multimodal interaction capabilities.
- Audio Generation Tools: Integrated Eleven Labs for Audio Generation.
- Cohere Embedder: Introduced a new Cohere Embedder class with a corresponding cookbook example to demonstrate its usage.
- File Agent Storage Support: Added JSON and YAML as storage options for agent session data.
Improvements
- Version Checker for OpenAI: Added a warning for users with OpenAI versions below 1.52.0 to ensure compatibility with features like audio in ChatCompletionMessage.
- Agent Response Handling: Enhanced processing of agent responses to support lists, improving handling of multi-item outputs.
Bug Fixes
- AWS Bedrock Tool Descriptions: Fixed an issue where the transfer tool description was missing, causing incompatibility with AWS Bedrock Claude.
- Response Content Handling: Resolved crashes on the session page caused by non-string response content.
- Deep Copy Agent Memory: Addressed deep copy errors when using agent memory on the playground.
- Session Page Enhancements: Fixed the refresh button.
- Fix Tool Parsing for Ollama: Fixed JSON schema tool parsing by transforming ['string', 'null'] parameters to 'string' for compatibility.
- Response Parsing for Gemini Tool: Improved response parsing to handle unserializable objects in tool_calls for Gemini on the playground.
- Memory Handling for Google Provider: Fixed an issue in monitoring_data where memory was removed for all providers, causing blank titles on Phidata.app; now only modifies memory for Google provider.
- RecursiveChunking ID Conflict: Resolved an issue in RecursiveChunking where processing large files with multiple chunks caused duplicate chunk record IDs, leading to psycopg.errors.UniqueViolation.
This update introduces new multimodal tools, enhances the multimodal capabilities of existing models, and includes several quality-of-life improvements.
New Features
- Giphy Tool: Added Giphy integration to enhance creative collaboration.
- Native image upload support for Claude: Added native support for uploading images natively to the Anthropic API.
- Youtube Knowledge base: Added support for new YouTube knowledge base, allowing it to be loaded directly using YouTube video links.
Improvements
- API key validation: Added API key validation for model classes.
- Gemini audio: Improved native audio upload support for Gemini Model.
- YoutubeTools: A new Youtube tool which allows generation of timestamps.
- Workspace Configuration Flexibility: Refactored type-checking logic using isinstance() to enhance flexibility and maintainability.
- Web Crawler Stability: Switched to crawl4ai async to resolve issues and improve performance.
- Memory Optimization: Improved memory usage in large-scale workflows for better efficiency.
- User Interface: Refined the UI for better tracking of team activities.
Bug Fixes
- Role-Based Access Control Bugs: Resolved issues affecting access permissions.
- Gemini Functions: Fixed errors when functions had no specified parameters.
Improvements
- Improvements to Gemini Multimodality
New Features
- Gemini Multimodal Support: Added support for multimodal (image, video, text) input processing with Gemini.
- Mem0 Integration Example: Introduced a cookbook example demonstrating Mem0 integration for Agent memory.
- CSV URL Knowledgebase: Added functionality to create and manage knowledgebases from CSV URLs.
Improvements
- Vertex AI Gemini 2 Update: The Vertex AI class has been updated to support the Gemini 2 model.
Bug Fixes
- Structured Output Fix: Resolved issues with Ollama structured output to ensure consistent data formatting.
This update introduces image and video multi-modal support for the agent playground and adds image and video generation tools like FAL, replicate, and ModelLabs.
Highlights
- Support for video: Agents now natively support video.
- Multi-Modal rendering on Agent Playground: Agent Playground can now render images and videos.
New Feature
- New Tools: Added Replicate, FAL, and ModelLabs tools to generate video and images.
Improvements
- Various cookbook examples were added to cover real-world agent use cases.
This update introduces exciting new features, performance enhancements, and crucial fixes to ensure better usability and functionality.
Highlights
- Multimodal Agents: Agents now natively support image and audio with video coming soon.
- Agentic Workflows are now generally available: Build deterministic multi-agent pipelines using Workflows.
- Agent can share state between tool calls: New session_state variable allows Agents to maintain and share state across function calls.
- Agents in teams can respond directly to the user: now responses of team members do not need to go through the team leader.
- Context Injection: Agents can be provided context that is resolved in real-time during execution.
- Human-in-the-loop: Tool calls can now be confirmed (or approved) before being executed.
- Advanced chunking: Support for Semantic and Agentic chunking (and more).
New Features
- Show Reasoning in the Playground: Provides users with insights into the agent's thought process and decision-making, offering a behind-the-scenes look at how the agent operates within the Playground.
- Show Tool Call in the Playground: Display tool call results and metrics on hover within the Playground.
- Sessions Page Enhancements: Added context, reasoning, and tools integration to the Sessions page.
- Delete Custom Endpoint for Playground: Introduced the ability to delete endpoints directly within the Playground.
- Show References in the Playground: You can now view the sources used for RAG.
- Chunking Strategies in RAG: Introduced five levels of chunking strategies in RAG: Semantic Chunking, Fixed Chunking, Agentic Chunking, Document Chunking, and Recursive Chunking.
- Unified Reranker for Vector Databases: Implemented a unified reranking feature across various vector databases, including LanceDB, with support for CohereReranker.
- Milvus Vector Database Integration: Added support for Milvus as a vector database option.
- Support for Multi-Modalities: Added support for processing audio, video, and image data to enhance agent capabilities.
- Context Injection: Improved context handling to enhance agent responses and usability.
- Pre-hook and post-hook for function calls: enabling users to validate arguments, add human-in-the-loop flows and validate results of tool calls.
Improvements
- Duplicate Endpoint Prevention: Enhanced custom endpoint creation logic to prevent duplicates.
- Workflow Session Logging: Improved logging mechanisms for workflow sessions.
- Ollama Tool Call Streaming: Updated the Ollama tool to improve call streaming capabilities.
- Logging Updates for Gemini: Enhanced logging functionality for Gemini models.
- Ollama LLM Class Updates: Resolved issues in tool calling to improve usage reliability.
- Product Manager Agent Workflow Example: Added a new example to demonstrate the practical use and capabilities of workflow.
- Dynamic Prompts: System prompt, user prompt and instructions can now be passed as functions resolved during run-time.
Bug Fixes
- Session Read-All Fix: Fixed an issue where titles were not created when users provided a list of messages instead of a single message.
- Ollama Tool Response Issue: Addressed inconsistencies where Ollama tools always returned empty responses.
Breaking Changes
- `Agent.add_context` is now `Agent.add_references` as the context terminology is now used for context injection. Similarly, `context_format` is now `references_format`.
This update is packed with new integrations, significant improvements to existing tools, and crucial fixes to enhance functionality and user experience.
New Features
- InternLM Support: Added feature request integration for InternLM25.
- Google Vertex AI Integration: Expanded compatibility with Google Model Vertex AI.
- Discord Integration Tool: Seamlessly connect and collaborate within Discord.
- Baidu Search Tool: Introducing Baidu Search integration for enhanced search capabilities in Phidata workflows.
Improvements
- Agent Monitoring Sync: Optimized synchronization between Reasoning Agent and Main Agent.
- Knowledge Cookbooks: Improved usability with enhanced S3, embedder, and document cookbooks; added an example using LanceDB as a vector database.
- License Clarifications: Updated license information for better transparency and understanding .
- Default Model Adjustments: Refined the default model example for OpenAI; new guidance to avoid errors when changing models.
- Qwen2.5 Coding Agent: Performance enhancements and stability improvements.
- Installation Improvements: Added support for "psycopg[binary]" in installation commands.
Bug Fixes
- Phi Tool Templates: Integrated new templates and added a comprehensive cookbook.
- Knowledge Base Cookbooks: Addressed inconsistencies for a seamless experience.
- Context Creation: Fixed an issue where empty contexts couldn't be created without a knowledge base.
- Ollama Knowledge Example: Resolved errors for improved functionality and reliability.