The ultimate PyQt6 application that integrates the power of OpenAI, Google Gemini, Claude, and other open-source AI models
# MyChatGPT
The ultimate PyQt6 application featuring the power of OpenAI, Google Gemini, Claude, and various open-source AI models.
It delivers outstanding capabilities for Chat, Image, Vision, Text-To-Speech (TTS), and Speech-To-Text (STT).
## What's New
- Enhanced Chat Capabilities:
  Attach a variety of file formats in chat, including documents, images, audio, and video files. (Note: make sure the selected model supports the format you attach.)
  - OpenAI Supported File Types
    - Document: 'pdf', 'doc', 'docx', 'pptx'
    - Image: 'jpeg', 'jpg', 'png', 'gif', 'webp'
    - Text: Plain text format files
  - Claude Supported File Types
    - Document: 'pdf', 'rtf', 'docx', 'doc', 'epub'
    - Image: 'jpeg', 'jpg', 'png', 'gif', 'webp'
    - Text: Plain text format files
  - Gemini Supported File Types
    - Document: 'pdf', 'rtf', 'doc', 'docx', 'dot', 'dotx', 'hwp', 'hwpx'
    - Image: 'jpeg', 'jpg', 'png', 'gif', 'webp'
    - Video: 'x-flv', 'quicktime', 'mpeg', 'mpegs', 'mpg', 'mp4', 'webm', 'wmv', '3gpp'
    - Audio: 'x-aac', 'flac', 'mp3', 'm4a', 'mpeg', 'mpga', 'mp4', 'opus', 'pcm', 'wav', 'webm'
    - Text: Plain text format files
    - Note: Some file types are only supported for Google AI Pro or Google AI Ultra subscribers.
      - The page at https://support.google.com/gemini/answer/14903178?hl=en lists hwp/hwpx files as supported, but uploading them in testing produced the error below.
      - Detailed error messages from tests with the two MIME types application/vnd.hancom.hwp and application/x-hwp:
```
400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': 'Unable to submit request because it has a mimeType parameter with value application/vnd.hancom.hwp, which is not supported. Update the mimeType and try again. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini', 'status': 'INVALID_ARGUMENT'}}
400 INVALID_ARGUMEN
```
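
A check before upload could catch unsupported attachments before the API returns a 400 error like the one above. The following is a minimal sketch under stated assumptions, not MyChatGPT's actual implementation. The `SUPPORTED_EXTENSIONS` table and the `attachment_category` function are hypothetical names. The table covers only OpenAI and Gemini, keeps only the listed entries that are ordinary file extensions (names such as 'x-flv' or 'quicktime' look like MIME subtypes and are left out), and omits hwp/hwpx because of the rejection reported above. Plain text files are not covered, since the lists give no extensions for them.

```python
from pathlib import Path
from typing import Optional

# Hypothetical allowlist, abbreviated from the file type lists in this README.
# hwp/hwpx are deliberately omitted for Gemini: the API rejected them in testing.
SUPPORTED_EXTENSIONS = {
    "openai": {
        "document": {"pdf", "doc", "docx", "pptx"},
        "image": {"jpeg", "jpg", "png", "gif", "webp"},
    },
    "gemini": {
        "document": {"pdf", "rtf", "doc", "docx", "dot", "dotx"},
        "image": {"jpeg", "jpg", "png", "gif", "webp"},
        "video": {"mpeg", "mpg", "mp4", "webm", "wmv"},
        "audio": {"flac", "mp3", "m4a", "mpga", "opus", "wav"},
    },
}


def attachment_category(provider: str, filename: str) -> Optional[str]:
    """Return the category a file falls under for a provider, or None.

    Matching is by file extension only and is case-insensitive.
    Unknown providers and files without an extension yield None.
    """
    extension = Path(filename).suffix.lower().lstrip(".")
    for category, extensions in SUPPORTED_EXTENSIONS.get(provider, {}).items():
        if extension in extensions:
            return category
    return None


if __name__ == "__main__":
    for provider, name in [("openai", "report.PDF"), ("gemini", "memo.hwp")]:
        category = attachment_category(provider, name)
        status = category if category else "not supported"
        print(f"{provider}: {name} -> {status}")
```

An extension check like this is only a first filter. The provider still decides based on the MIME type it receives, so a file that passes the check can still be rejected by the API.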