Releases: Uberi/speech_recognition

Version 3.6.4

13 Apr 06:08

Bugfix release!

  • Fix tempfile.NamedTemporaryFile on Windows by replacing it with a PortableNamedTemporaryFile class. Previously, the temporary file could not always be re-opened by name while the original handle was still open.
  • Documentation/troubleshooting improvements (thanks @hassanmian!).
  • Add support for 24-bit FLAC audio files (thanks @sudevschiz!).
  • Fix phrase_time_limit being ignored for listen_in_background (thanks @dodysw!).
  • Added lots of new audio regression tests.
  • Code cleanup for tests and examples.
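
The temporary-file fix in the first item can be pictured with the sketch below. It is not the library's PortableNamedTemporaryFile; the class name ReopenableTempFile and its methods are hypothetical. It assumes the problem is that NamedTemporaryFile opens the file in delete-on-close mode on Windows, and works around it by creating the file with tempfile.mkstemp, releasing the OS handle, and deleting the file manually on exit.

```python
import os
import tempfile


class ReopenableTempFile:
    """Hypothetical temporary file that can be re-opened by name while open."""

    def __init__(self, mode="w+b"):
        self.mode = mode
        self.name = None
        self._file = None

    def __enter__(self):
        # mkstemp returns an OS-level handle plus the path; close the handle
        # right away and re-open the path as an ordinary file object.
        handle, self.name = tempfile.mkstemp()
        os.close(handle)
        self._file = open(self.name, self.mode)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Clean up manually, since nothing deletes the file automatically.
        self._file.close()
        os.remove(self.name)

    def write(self, data):
        return self._file.write(data)

    def flush(self):
        return self._file.flush()


# Usage: write some bytes, then read them back through a second handle.
with ReopenableTempFile() as tmp:
    tmp.write(b"RIFF")
    tmp.flush()
    with open(tmp.name, "rb") as second_handle:
        contents = second_handle.read()
```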

Version 3.6.3

11 Mar 22:00

Small bugfix release:

  • Handle the case where Google Speech Recognition (GSR) doesn't return a confidence value (thanks @jcsilva!).
  • Config, style, and release improvements.
  • Fix a console window sometimes popping up on Windows (thanks @Qdrew!).
  • Switch releases over to universal wheels rather than source distributions.
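
As a rough illustration of the missing-confidence case in the first item, the sketch below picks a transcript from a list of alternatives. The function name best_transcript and the dictionary layout ("transcript" plus an optional "confidence" key) are assumptions made for the example, not the library's internals or the service's documented response format.

```python
def best_transcript(alternatives):
    """Pick a transcript from a non-empty list of alternative results.

    Each alternative is assumed to be a dict with a "transcript" key and
    an optional "confidence" key (a hypothetical layout for illustration).
    """
    if all("confidence" in alt for alt in alternatives):
        # Every alternative is scored, so take the highest-scoring one.
        best = max(alternatives, key=lambda alt: alt["confidence"])
    else:
        # At least one score is missing, so fall back to the first entry
        # instead of failing with a KeyError.
        best = alternatives[0]
    return best["transcript"]
```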

Version 3.6.0

07 Jan 04:50

This is more of a maintenance release, but a few features slipped in as well:

  • Support for the Google Cloud Speech API with recognizer_instance.recognize_google_cloud (thanks @Thynix!), plus documentation and examples.
  • Automatic sample rate detection in speech_recognition.Microphone - this should fully resolve all the "Invalid sample rate" issues from PyAudio.
  • Project now has automated tests and continuous integration with TravisCI. It's pretty nifty, and has already caught a few things during development!
  • Keywords example for recognizer_instance.recognize_sphinx.
  • Documentation improvements and updated advice in troubleshooting and library reference.
  • Bugfix - Google Speech Recognition sometimes didn't return the text with the highest confidence (thanks @akabraham!).
  • Bugfix - EOFError upon encountering malformed audio files; a proper exception message is now given.
  • Updated FLAC binaries for OS X.
  • Bugfix - invalid FLAC binary path on OS X (thanks @akabraham!).
  • Code cleanup.

Version 3.5.0

21 Nov 08:45
  • Support for the Houndify API with recognizer_instance.recognize_houndify (thanks @tb0hdan!).
  • recognize_sphinx now supports keyword-based matching via the keywords=[("cat", 30), ("potato", 45)] parameter.
    • The second number in each pair is the sensitivity, which determines how loosely Sphinx will interpret speech to be those keywords - higher numbers mean more false positives, while lower numbers mean a lower detection rate.
    • A new example for keyword matching is now available.
  • BREAKING CHANGE: API.AI STT API IS BEING SHUT DOWN SOON. (source)
    • For now, the recognize_api function will keep working if you're on a paid API.AI plan, and we will not be removing it until the service is shut down entirely.
    • It is best to transition to another backend as soon as possible. I recommend Microsoft Bing Voice Recognition or Wit.ai for previous API.AI users.
  • phrase_time_limit option for listening functions, to limit phrase lengths to a certain number of seconds.
  • Support for operation timeouts with recognizer_instance.operation_timeout - this can be used to ensure long requests always take finite time.
  • recognize_ibm now opts out of request logging by default, for improved user privacy (thanks @michellemorales!). This is a breaking change if you previously relied on request logging behaviour.
  • Bugfix - listen() sometimes didn't terminate on finite-length streams.
  • Bugfix - Microsoft Bing Voice Recognition changed their authentication API endpoint, so that required some small code updates (thanks @tmator!).
  • Bugfix - 24-bit audio now works correctly on Python 2.
  • Update Wit.ai API version from deprecated version.
  • A bunch of documentation updates, fixes, and improvements.
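
The two new options in this release, phrase_time_limit and recognizer_instance.operation_timeout, might be combined as in the sketch below. The function name, its parameters, and the default values are illustrative only. It assumes the SpeechRecognition package is installed, reads from an audio file rather than a microphone so no audio hardware is needed, and uses recognize_google purely as an example backend.

```python
def transcribe_first_phrase(path, max_phrase_seconds=5, max_request_seconds=10):
    """Transcribe the first phrase of an audio file, with bounded durations."""
    # Imported inside the function so this module still loads when the
    # package is not installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    # Give up on network requests that take longer than this many seconds.
    recognizer.operation_timeout = max_request_seconds

    with sr.AudioFile(path) as source:
        # Stop capturing once the phrase reaches the time limit.
        audio = recognizer.listen(source, phrase_time_limit=max_phrase_seconds)

    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The speech could not be understood.
        return None
```

A caller would still need to handle sr.RequestError, which is raised when the service cannot be reached.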

Version 3.4.6

22 May 20:29

Bugfix release.

Changes:

  • api.ai now requires the sessionId field, so we'll just add that in (thanks @jhoelzl!).
  • Improve documentation a bit.
  • Various other small fixes.

Version 3.4.5

11 May 16:44

Changes:

  • Bug fix: non-24-bit audio wasn't converted properly to 16-bit audio on Python 2, due to the new 24-bit audio shim. Thanks to @jhoelzl for reporting!

Version 3.4.4

10 May 21:52

Maintenance release:

  • Python versions less than 3.4 don't support 24-bit audio properly. We now have pure-Python shims that will allow 24-bit audio to work on those old Python versions, though they will be somewhat slower. Thanks to @danse for reporting the issue!
  • Added updated Pocketsphinx binaries and Pocketsphinx installation procedures to match improvements on their end.
  • Fix Unicode file paths on Windows.
  • Fix caching in recognizer_instance.recognize_bing.
  • We now use the Manylinux Docker image for building FLAC. Hopefully, this will make building universal Linux binaries easier for packagers.

Version 3.4.3

09 Apr 23:59

Bugfix release:

  • Thanks to @jhoelzl, api.ai language support works again for non-English languages.

We're now GPG signing all our release tags. Under the releases page, you should see the following:

[Image: signature screenshot]

This tells you that GitHub thinks the Git tag is the same as the one we intended to release.

This key can also be found on the SKS keyservers, and you can import it with the following command:

gpg --keyserver x-hkp://pool.sks-keyservers.net --recv-keys 0x5F56B350

The packages on PyPI are signed as well - the signature can be downloaded under the "pgp" link on the SpeechRecognition PyPI page.

Version 3.4.2

04 Apr 00:18

Quick bugfix release on the heels of yesterday's big one:

  • Add support for the monotonic library on Python 2 - if you have monotonic installed in Python 2, recognize_bing will work faster!
    • On Python 3, recognize_bing already does the things that would make it fast, so the library is unnecessary.
  • Fix loading of non-16-bit AIFF files on Python 2.
  • Better document the Pocketsphinx language pack installation.
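
The notes above don't say why the monotonic library speeds up recognize_bing. One plausible reading, offered here as an assumption, is that a steady clock lets an access token be cached safely between requests. The sketch below shows that general pattern; TokenCache, fetch_token and lifetime_seconds are hypothetical names, not the library's API.

```python
try:
    from time import monotonic  # built in on Python 3.3 and later
except ImportError:
    try:
        from monotonic import monotonic  # optional backport for Python 2
    except ImportError:
        monotonic = None  # no steady clock available, so never cache


class TokenCache:
    """Hypothetical cache that reuses a token until its lifetime runs out."""

    def __init__(self, fetch_token, lifetime_seconds=600):
        self._fetch_token = fetch_token
        self._lifetime = lifetime_seconds
        self._token = None
        self._expires_at = None

    def get(self):
        if monotonic is None:
            # Without a steady clock, request a fresh token every time.
            return self._fetch_token()
        now = monotonic()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch_token()
            self._expires_at = now + self._lifetime
        return self._token
```

A monotonic clock is used instead of wall-clock time because it never jumps backwards when the system time is adjusted, so the expiry check stays reliable.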

Version 3.4.1

03 Apr 10:41

Changes:

  • BREAKING CHANGE: AT&T STT API IS BEING SHUT DOWN SOON. (source)
    • For now, the recognize_att function will keep working, until the API itself is shut down.
    • It is best to transition over to IBM, Wit.ai, Google, CMU Sphinx, Bing Voice, or api.ai as soon as possible.
    • In most cases, you can simply rename recognize_att to a different service like recognize_ibm, then generate new API keys/tokens for it.
  • DEPRECATED CLASS: WavFile has been renamed to AudioFile.
    • WavFile will continue to work for the foreseeable future. New code should use AudioFile.
    • AudioFile is the same as WavFile, but in addition to WAV, it also supports AIFF and FLAC files!
  • New api.ai support, courtesy of @sbraden! See recognize_api in the library reference.
  • New Microsoft Bing Voice Recognition API support! See recognize_bing in the library reference.
  • Support for 8-bit unsigned WAV audio (thanks to @zhaoqf123 for reporting!).
  • Faster, upgraded FLAC binaries, with Linux binaries using Holy Build Box for maximum distro compatibility.
  • Updated setup process for Wit.ai.
  • Update phrase retrieval for recognize_ibm, courtesy of Bhavik Shah from IBM.
  • Documentation improvements and code cleanup.
  • Clearer licensing information - see the README.

As always, you can upgrade with pip install --upgrade speechrecognition.