
Gemini Speak

Pending #anki #addon #gemini #speak
https://github.com/jrherediaramirez/Gemini-TTS-Anki
11/6/2025

How to download the Gemini Speak add-on

You can download the add-on in one of two ways:

1. Copy the add-on code below:

   480539677

   Then open Anki → Tools → Add-ons → Get Add-ons → paste the code → OK.

2. Open the add-on page on AnkiWeb, scroll to the bottom of the page, find the line with the code 480539677, and copy it.


Detailed description

Gemini Speak - Intelligent TTS for Anki

Next-generation, professional Text-to-Speech for Anki using the Google Gemini API. This add-on goes beyond simple text-to-audio by using AI-powered analysis to rewrite and adapt your text for the most natural and context-aware speech possible.

It remains fully compatible with all platforms (Windows, macOS, Linux) with zero external dependencies.
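To illustrate what "zero external dependencies" can look like in practice, here is a minimal sketch that builds a Gemini text-to-speech request and wraps the returned audio in a WAV file using only Python's standard library. It is not taken from the add-on's source. The endpoint, the model id `gemini-2.5-flash-preview-tts`, the payload shape, and the 24 kHz 16-bit mono PCM format are assumptions based on Google's public REST documentation and should be checked against the current docs. The function names are hypothetical.

```python
import base64
import io
import json
import urllib.request
import wave

# Assumed values from Google's public REST documentation,
# not from the add-on's source code.
API_BASE = "https://generativelanguage.googleapis.com/v1beta/models"
TTS_MODEL = "gemini-2.5-flash-preview-tts"


def build_tts_request(text, api_key, voice="Zephyr", model=TTS_MODEL):
    """Build (but do not send) a generateContent request asking for audio."""
    payload = {
        "contents": [{"parts": [{"text": text}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": voice}}
            },
        },
    }
    return urllib.request.Request(
        f"{API_BASE}/{model}:generateContent",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


def extract_audio(response_json):
    """Decode the base64 audio of the first candidate (assumed response shape)."""
    part = response_json["candidates"][0]["content"]["parts"][0]
    return base64.b64decode(part["inlineData"]["data"])


def pcm_to_wav(pcm_bytes, sample_rate=24000):
    """Wrap raw 16-bit mono PCM in a WAV container using only the stdlib."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono (assumed)
        wav_file.setsampwidth(2)   # 16-bit samples (assumed)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```

Sending the request would be a call such as `urllib.request.urlopen(request, timeout=30)`. It is left out here so the sketch runs offline and without an API key.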

✨ Features

- AI-Powered Preprocessing: In “Unified” mode, a powerful Gemini model analyzes and rewrites your text (turning bullet points into flowing sentences and simplifying complex structures) before generating audio for incredibly natural speech.
- Intelligent Content-Awareness: Automatically detects your content type, including instructions, feature lists, and technical code, to apply the best conversion strategy.
- Multiple Processing Modes: Choose your preferred method:
  - Unified: The default AI-powered mode for the best quality.
  - Traditional: A direct, fast TTS-only conversion.
  - Hybrid & Auto: Intelligently switches between modes based on text complexity.
- Advanced Controls: Fine-tune performance with settings like “Thinking Budget” to control AI reasoning and “Preprocessing Style” (e.g., Natural, Professional, Technical).
- Dynamic In-Editor UI: Quickly switch the voice, model, and processing mode directly from the Anki editor toolbar.
- No Dependencies & Universal Compatibility: Uses only Python’s built-in libraries, ensuring it works flawlessly everywhere, including on Linux Flatpak/Snap where other add-ons fail.
- 30+ Premium Voices & Smart Caching: Access Google’s high-quality voice collection and save on API calls with an intelligent local cache.

🚀 Installation

Method 1: AnkiWeb (Recommended)

1. Open Anki.
2. Go to Tools > Add-ons > Get Add-ons…
3. Enter add-on code: 480539677
4. Click OK and restart Anki.

Method 2: Manual Installation

1. Download the latest release from the official repository.
2. Extract the “Gemini Speak” folder into your Anki add-ons folder.
3. Restart Anki.

⚙️ Setup

1. Get a Gemini API Key: Visit ai.google.dev and create an API key.
2. Configure the Add-on: In Anki, go to Tools > Gemini TTS Configuration. Paste your API key and review the settings across the “Basic”, “Advanced”, and “Processing” tabs. Click “Test API Key” to verify, then “Save”.

📖 Usage

1. Open the Anki note editor.
2. Select the text you want to convert to speech.
3. Press Ctrl+G or click the Gemini icon in the toolbar.
4. Audio will be added to the field you are currently editing.
5. Use the Mode, Model, and Voice buttons in the toolbar to change settings on the fly for the next generation.

🔧 Configuration Options

The configuration dialog (Tools > Gemini TTS Configuration) provides extensive control.

| Setting | Tab | Description | Default |
| :--- | :--- | :--- | :--- |
| API Key | Basic | Your Gemini API key. | (required) |
| Model | Basic | Selects the Gemini model (Unified models are for AI preprocessing). | Gemini 2.5 Flash (Unified) |
| Processing Mode | Basic | Choose how text is processed (Unified, Traditional, Hybrid, Auto). | Unified |
| Voice | Basic | The TTS voice name. | Zephyr |
| Temperature | Basic | Controls randomness/creativity of speech; 0.0 is deterministic. | 0.0 |
| Thinking Budget | Advanced | Tokens for AI reasoning. Higher values handle more complex text. | 0 |
| Enable Cache | Advanced | Enables/disables caching of generated audio. | On |
| Cache Days | Advanced | How long to keep cached audio files. | 30 days |
| Enable Fallback | Advanced | Automatically falls back to Traditional mode if Unified mode fails. | On |
| Preprocessing Style | Processing | The style the AI should use for rewriting text. | Natural |
| Auto-detect Content | Processing | Enables automatic detection of content type for better processing. | On |
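To make the defaults in the table concrete, the sketch below models the settings, the cache lifetime, and the Unified-to-Traditional fallback in plain Python. Only the default values come from the table. The names (`TTSConfig`, `validate`, `cache_key`, `is_cache_entry_fresh`, `run_with_fallback`) and the cache-key scheme are hypothetical and are not taken from the add-on's source.

```python
import hashlib
import time
from dataclasses import dataclass

SECONDS_PER_DAY = 86400
VALID_MODES = {"Unified", "Traditional", "Hybrid", "Auto"}


@dataclass
class TTSConfig:
    """Hypothetical settings container; defaults copied from the table."""
    api_key: str = ""  # required, so there is no usable default
    model: str = "Gemini 2.5 Flash (Unified)"
    processing_mode: str = "Unified"
    voice: str = "Zephyr"
    temperature: float = 0.0
    thinking_budget: int = 0
    enable_cache: bool = True
    cache_days: int = 30
    enable_fallback: bool = True
    preprocessing_style: str = "Natural"
    auto_detect_content: bool = True


def validate(config):
    """Return a list of human-readable problems (empty when the config is usable)."""
    problems = []
    if not config.api_key:
        problems.append("API key is required")
    if config.processing_mode not in VALID_MODES:
        problems.append(f"Unknown processing mode: {config.processing_mode}")
    if config.thinking_budget < 0:
        problems.append("Thinking budget must not be negative")
    return problems


def cache_key(text, config):
    """Assumed scheme: hash the text plus every setting that changes the audio."""
    raw = "\x1f".join([
        text, config.model, config.processing_mode, config.voice,
        str(config.temperature), config.preprocessing_style,
    ])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def is_cache_entry_fresh(created_at, config, now=None):
    """True if caching is on and the entry is no older than Cache Days."""
    if not config.enable_cache:
        return False
    now = time.time() if now is None else now
    return now - created_at <= config.cache_days * SECONDS_PER_DAY


def run_with_fallback(text, config, unified, traditional):
    """Try the Unified path; use Traditional if it fails and fallback is enabled.

    `unified` and `traditional` are caller-supplied callables (placeholders).
    """
    try:
        return unified(text)
    except Exception:
        if not config.enable_fallback:
            raise
        return traditional(text)
```

Hashing the voice, model, and mode together with the text means that changing any of them produces a new cache entry rather than replaying stale audio. Whether the add-on keys its cache this way is not stated in the description.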

🛠️ Troubleshooting

- “Invalid API key”: Ensure your key is correct and has access to the Gemini API. Test it in the configuration window.
- “Rate limited”: You have exceeded the free tier’s API usage. Wait a moment and try again.
- “Field not found”: The add-on automatically adds audio to the field your cursor is in. Ensure your note type has the desired field.
- Network/Connection Issues: Check your internet connection and firewall settings.

🔒 Privacy & Security

- API Key: Stored locally in Anki’s configuration files on your computer.
- Text Data: The text you select is sent to Google’s Gemini API for processing.
- Audio Cache: Generated audio is stored locally in your Anki profile’s media folder.
- No Tracking: This add-on does not collect or transmit any personal usage data.

📄 License

MIT License - see LICENSE file for details.




Reviews (11)

👍 2025-12-07

works like a charm!

👍 2025-11-16

Great but after recording 10 audio files, it throws an “error due to timeout” error. That message appears even in version 2.5 of Gemini

👍 2025-08-26

works amazing! The quality of the speech is so good!

👍 2025-08-16

Thank you! Now it’s working. Plz, add batch operations when you’ll have free time for it :)

👍 2025-07-21

still working, thanks

👍 2025-07-17

works great on linux! would love an option to move the audio to another field :)

👍 2025-07-12

Very good tool! One question, just as another user has mentioned: can we configure the target language? Sometimes a card only contains one word, and it’s too short to identify which language it belongs to. Also, it can currently only include 1 audio per card, but I have some cards with multiple example sentences. They are separate and are followed by explanations of their meaning that are not in the target language.

👍 2025-06-19

“Add-on has no configuration.”

👍 2025-06-11

After saving the configuration, restarting Anki is required. The voice is awesome, a very good add-on for language learning, but I have several note types for different languages. Is there any way to choose the target field depending on the note type? E.g., could we create several profiles, like HyperTTS, for different note types and target fields?

👍 2025-06-10

On 2025-06-10: The previous issues have been resolved; it is an awesome plugin, check it out! nwn

👍 2025-06-08

Super!