
Test Your Pronunciation (Cloud Speech to Text)

https://github.com/rroessler1/speech-to-text
24/2/2024

How to install the Test Your Pronunciation (Cloud Speech to Text) add-on

You can install the add-on in either of two ways:

1. Copy the add-on code:

673333980

Then open Anki → Tools → Add-ons → Get Add-ons, paste the code, and click OK.

2. Open the add-on page on AnkiWeb, scroll to the bottom of the page, find the line containing the code 673333980, and copy it.


Detailed description

This add-on tests your pronunciation by recording your voice, analyzing it using a Speech-to-Text (STT) service, and then comparing it with the value on the current card.

It currently supports Google and Microsoft Cloud Speech-to-Text services.

HOW TO USE

You need either a Google or Microsoft API key, which may cost money, but Microsoft has a very generous free tier. Here are instructions for both:

Google

Follow the instructions here: https://cloud.google.com/speech-to-text/docs/quickstart-client-libraries, which are summarized below.

In short:

1. Set up a project.

1.1. Go here: https://cloud.google.com/speech-to-text/docs/quickstart-client-libraries and click the “Set up a Project” button.

1.2. Follow the steps to create a project and add a payment method.

2. Create an API Key. (Google will automatically generate a “Service Account Key”, but this is not what you want.)

2.1. Go to the developers console: https://console.developers.google.com/

2.2. Click on “Credentials”

2.3. Click on “Create Credentials -> API Key” and copy the value.
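To illustrate where the API key ends up, here is a minimal sketch of how a request to Google's synchronous `speech:recognize` REST endpoint can be assembled with an API key. It only builds the URL and JSON body; it does not send anything. The function name `build_google_request` and the assumption of 16 kHz, 16-bit mono PCM audio are mine, and this is not necessarily how the add-on itself calls the service.

```python
import base64
import json
from urllib.parse import urlencode

GOOGLE_STT_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_google_request(api_key, audio_bytes, language_code="en-US"):
    """Return (url, json_body) for a synchronous recognition request.

    Assumption: the audio is 16 kHz, 16-bit mono PCM (LINEAR16).
    """
    # The API key is passed as the "key" query parameter.
    url = GOOGLE_STT_URL + "?" + urlencode({"key": api_key})
    body = {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": 16000,
            "languageCode": language_code,
        },
        # Audio content is sent inline as base64 text.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }
    return url, json.dumps(body)
```

The returned URL and body could then be sent as an HTTP POST with any HTTP client; the transcript, if any, comes back in the JSON response.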

Microsoft

1. Create a Microsoft Azure Portal account: http://portal.azure.com

2. Create a “Speech” by Microsoft resource here: https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices (None of the naming / settings are particularly important.)

3. Once that’s done, click on “Go to Resource”.

4. Click on “Keys and Endpoint”.

5. Click on “Show Keys”. These are your two API keys. Use either.

6. Take note of “location”; you’ll need it in the configuration.
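The key and the location from the steps above are the two values a request to Azure's speech service needs. As a rough sketch (not necessarily what the add-on does internally), this is how the URL and headers for Azure's short-audio speech-to-text REST endpoint can be put together. The function name `build_azure_request` and the 16 kHz WAV content type are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_azure_request(api_key, location, language="en-US"):
    """Return (url, headers) for Azure's short-audio REST endpoint.

    Assumption: the audio body will be a 16 kHz PCM WAV file.
    """
    # The location (region) of the Speech resource is part of the host name.
    url = (
        f"https://{location}.stt.speech.microsoft.com"
        "/speech/recognition/conversation/cognitiveservices/v1?"
        + urlencode({"language": language})
    )
    headers = {
        # Either of the two keys shown under "Keys and Endpoint" works here.
        "Ocp-Apim-Subscription-Key": api_key,
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return url, headers
```

This shows why the location matters: a key created in one region is only valid against that region's host name.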

Once You Have an API Key

Install this add-on, then configure it in Anki: go to Tools -> Test Your Pronunciation Settings.

1. Select which service you will use (Google or Microsoft).

2. Enter the API Key you created above.

2.1. If using Microsoft, you also need to select the API location (explained above).

3. Choose which language you will be pronouncing.

4. Enter the Field Name you will read. This is used to compare your voice to the actual value on the card. (If you are unfamiliar with Anki fields, they are the number one feature you should understand; check them out here: https://docs.ankiweb.net/#/getting-started?id=notes-amp-fields )

Whenever you study a card, you can go to “Tools -> Test Your Pronunciation” (or press Ctrl + Shift + S) to activate the plugin. Record your voice and then view the results.
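The settings described above can be summarized as a small validation sketch. The key names (`service`, `api_key`, `region`, `language`, `field_name`) are hypothetical and chosen only for illustration; the add-on's real configuration keys may differ.

```python
def validate_settings(settings):
    """Return a list of problems found in a (hypothetical) settings dict."""
    problems = []
    service = settings.get("service")
    if service not in ("google", "microsoft"):
        problems.append("service must be 'google' or 'microsoft'")
    if not settings.get("api_key"):
        problems.append("api_key is required")
    # Only Microsoft needs the API location in addition to the key.
    if service == "microsoft" and not settings.get("region"):
        problems.append("region is required for Microsoft")
    if not settings.get("language"):
        problems.append("language is required")
    if not settings.get("field_name"):
        problems.append("field_name is required")
    return problems
```

An empty list means all four settings (five for Microsoft) are present.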

Cloud Speech-to-Text Pricing

Please note this is the price charged by Google and Microsoft to use their Cloud Speech-to-Text services, and is completely out of my control.

Google

The current pricing details are here: https://cloud.google.com/speech-to-text/pricing

The first 60 minutes per month are free. After that, it costs $0.006 USD per 15 seconds, or $0.004 if you enable data logging (usage is rounded up to 15-second increments).

If you test your pronunciation on 100 cards every day (and each audio clip is 15 seconds or less), then it will cost you: (3000 clips per month - 240 free 15-second increments) * $0.006 = $16.56 USD / month, or $11.04 if you enable data logging.
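The arithmetic above can be reproduced with a short calculation. This is a sketch using the prices quoted in this description, which may have changed since; the function name and parameters are illustrative.

```python
import math


def google_monthly_cost(clips_per_month, clip_seconds=15,
                        price_per_increment=0.006, free_minutes=60):
    """Estimate the monthly cost in USD using the prices quoted above."""
    # Each clip is billed in 15-second increments, rounded up.
    increments_per_clip = math.ceil(clip_seconds / 15)
    billed = clips_per_month * increments_per_clip
    # 60 free minutes correspond to 240 increments of 15 seconds.
    free_increments = free_minutes * 60 // 15
    return round(max(0, billed - free_increments) * price_per_increment, 2)
```

For example, `google_monthly_cost(3000)` corresponds to the 100-cards-per-day case, and passing `price_per_increment=0.004` gives the data-logging price.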

Microsoft

Pricing details are here: https://azure.microsoft.com/en-us/pricing/details/cognitive-services/speech-services/

On the Free Tier, you get 5 free hours of audio per month. If you use more than this, you have to manually upgrade to the standard tier, which costs $1 per hour.

If you test your pronunciation on 100 cards every day, and the average clip length is 6 seconds, then it will be Free.

If the average clip length is 15 seconds, it will cost you $7.50 USD per month:

30 days * 100 cards * 15 seconds / 3600 seconds per hour = 12.5 hours; 12.5 hours - 5 free hours = 7.5 billable hours * $1 per hour = $7.50 USD
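The same estimate as a small function, again a sketch based only on the figures quoted here (5 free hours, $1 per additional hour); the function name and parameters are illustrative.

```python
def azure_monthly_cost(clips_per_month, avg_clip_seconds,
                       free_hours=5.0, price_per_hour=1.0):
    """Estimate the monthly cost in USD using the figures quoted above."""
    # Total audio time in hours.
    hours = clips_per_month * avg_clip_seconds / 3600
    # Only the time beyond the free allowance is charged.
    return round(max(0.0, hours - free_hours) * price_per_hour, 2)
```

With 3000 clips per month, 6-second clips stay exactly within the 5 free hours, while 15-second clips add up to 12.5 hours.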

Punctuation

The plugin ignores punctuation when analyzing the card and your speech, so if the card reads: “Hello! How are you?” You can say “Hello how are you” and it will be correct. (You can also say “Hello exclamation mark how are you question mark”, and that will work too.)

One limitation of this is that decimal points are also removed, so if the card reads “3.5”, you can say “Three point five” or “Thirty-five”, and both will be marked as correct.
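A minimal sketch of this kind of punctuation-insensitive comparison follows. It is not the add-on's actual code: the function name `normalize` is mine, and lower-casing is an extra assumption that the description does not mention.

```python
import unicodedata


def normalize(text):
    """Drop punctuation, collapse whitespace, and lower-case the text."""
    # Unicode categories starting with "P" cover all punctuation,
    # including the decimal point in "3.5".
    kept = [ch for ch in text if not unicodedata.category(ch).startswith("P")]
    return " ".join("".join(kept).lower().split())


def is_correct(card_text, transcript):
    """Compare card text and transcript after normalization."""
    return normalize(card_text) == normalize(transcript)
```

Because the decimal point is stripped along with everything else, a card reading "3.5" normalizes to "35", so a transcript of "35" is treated as a match.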

Using this with Chinese

This plugin will show your results in Pinyin, to make it easier for beginners to see what they got wrong, and whether it was a tone issue or not.
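To show how a Pinyin comparison can separate tone mistakes from other mistakes, here is a small sketch. It assumes both the card text and the transcript have already been converted to numbered Pinyin (for example "ni3 hao3") by a separate step that is not shown. The function names and the labels "match", "tone", and "sound" are hypothetical, not the add-on's own.

```python
def split_tone(syllable):
    """Split 'hao3' into ('hao', '3'); the tone is '' if no digit is present."""
    if syllable and syllable[-1].isdigit():
        return syllable[:-1], syllable[-1]
    return syllable, ""


def compare_pinyin(expected, heard):
    """Label each syllable pair as 'match', 'tone', or 'sound'.

    Syllables are paired in order; extra syllables on either side are ignored.
    """
    results = []
    for exp, got in zip(expected.split(), heard.split()):
        exp_base, exp_tone = split_tone(exp)
        got_base, got_tone = split_tone(got)
        if (exp_base, exp_tone) == (got_base, got_tone):
            results.append((exp, got, "match"))
        elif exp_base == got_base:
            # Same syllable, different tone.
            results.append((exp, got, "tone"))
        else:
            results.append((exp, got, "sound"))
    return results
```

For instance, comparing "ni3 hao3" with "ni2 hao3" labels the first syllable as a tone issue and the second as a match.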

Note on Speech-to-Text Accuracy, especially with Chinese

How accurate is Speech-to-Text?

I initially developed this plugin to help me practice Chinese; I’m a beginner learner and my pronunciation is quite bad, something many learners struggle with.

Olle over at Hacking Chinese did a not-statistically-significant analysis, and found that Google’s Chinese Speech-to-Text is basically perfect: https://www.hackingchinese.com/using-speech-recognition-to-improve-chinese-pronunciation-part-1/

My Chinese friends (again, not statistically significant) also report that Google “always” gets their Chinese dictation correct.

This initially seems impossible given the “difficulty” of Chinese pronunciation, but if you think again it intuitively makes sense. The tones, while difficult for foreigners unfamiliar with tonal languages, provide an extra signal for the algorithm that makes it easier to identify the syllable/word. Native speakers who can produce this perfectly see accurate results.

Now of course, even if you don’t say things perfectly, the algorithm tries to figure it out. So a reasonable conclusion is: if the computer thinks you said it wrong, you did, but if the computer thinks you said it right, it still might sound weird to a native speaker. This holds at least for Chinese (and presumably other tonal languages, though I haven’t checked).

What about for other languages?

Well, Google is certainly not 100% accurate when I do Speech-to-Text for English, but if I speak a bit slowly and clearly, I would say it’s quite good.

Issues / Feedback

Please submit any issues on GitHub: https://github.com/rroessler1/speech-to-text

Donate

If you find my work helpful, I would be honored by any kind of donation, as it takes a surprisingly long time to develop software for public use.

https://www.buymeacoffee.com/rroessler

If you like it, please comment here or send me feedback!




Reviews (13)

👍 2024-12-14

Just what I was looking for!

👍 2024-09-25

this is excellent! in learning languages pronunciation is important and it makes it easier to remember phrases, for me 10/10!

👍 2024-05-22

Inputting Cantonese speech is returning Mandarin answers for both Google and Microsoft. So for example 給 should be kap1 in Cantonese Jyutping, but add-on says it’s incorrect because it thinks it should be gei3, which is the Mandarin pinyin pronunciation of the character. Chinese (Cantonese, Traditional) has been selected in the add-on.

👍 2024-03-27

Great addon, but unfortunately, Germany West Central as a region is missing from the dropdown (and I can only create services there with my company’s subscription)

👍 2024-02-25

Fantastic add on

👍 2023-05-27

Very nice plugin. I use it for learning Bible verses.

👍 2022-11-18

thank you. great add on

👍 2021-08-21

Please help me. In my case, this error appeared:

Error

An error occurred. Please start Anki while holding down the shift key, which will temporarily disable the add-ons you have installed.

If the problem only occurs when add-ons are enabled, use Tools > Add-ons to disable some add-ons and restart Anki, repeating until you discover the add-on that is causing the problem.

When you discover the add-on that is causing the problem, please report the p

👍 2021-07-14

Interesting… As an alternative, I recommend https://audext.com/. It works quite fast, and it has many useful features such as an in-built editor, text timings tracking, voice recognition in noise, etc. It has the first 30 minutes of transcribing for free and supports many languages, including Chinese.

👍 2021-03-31

works great

👍 2021-03-09

Works great for Chinese, it’s very helpful that it also shows Pinyin.

👍 2021-02-16

Very good, great idea

👍 2020-12-16

Super! now with this Addon anki can one read, speak write and listen!!!