Google Media Translation API shows no result


Post author: admin
Posted: October 24, 2022
Post category: Programming questions

#python #api #web-services #google-translate #google-translation-api


Question:

I'm new to Google APIs and web services. I have only tried the Google Translate API once, but it worked fine. Now I want to use the Google Media Translation API to translate voice input. I followed the guide at https://cloud.google.com/translate/media/docs/streaming .

However, I can't get it to work. There is no error at runtime, so I don't know where to look. Could you help me identify the problem?

# [START media_translation_translate_from_mic]
from __future__ import division

import itertools

from google.cloud import mediatranslation as media
import pyaudio
from six.moves import queue
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="/Users/Me/GoogleMT/TranslationAPI/MediaKey.json"


# Audio recording parameters
RATE = 16000
CHUNK = int(RATE / 10)  # 100ms
SpeechEventType = media.StreamingTranslateSpeechResponse.SpeechEventType


class MicrophoneStream:
    """Opens a recording stream as a generator yielding the audio chunks."""

    def __init__(self, rate, chunk):
        self._rate = rate
        self._chunk = chunk

        # Create a thread-safe buffer of audio data
        self._buff = queue.Queue()
        self.closed = True

    def __enter__(self):
        self._audio_interface = pyaudio.PyAudio()
        self._audio_stream = self._audio_interface.open(
            format=pyaudio.paInt16,
            channels=1, rate=self._rate,
            input=True, frames_per_buffer=self._chunk,
            # Run the audio stream asynchronously to fill the buffer object.
            # This is necessary so that the input device's buffer doesn't
            # overflow while the calling thread makes network requests, etc.
            stream_callback=self._fill_buffer,
        )

        self.closed = False

        return self

    def __exit__(self, type=None, value=None, traceback=None):
        self._audio_stream.stop_stream()
        self._audio_stream.close()
        self.closed = True
        # Signal the generator to terminate so that the client's
        # streaming_recognize method will not block the process termination.
        self._buff.put(None)
        self._audio_interface.terminate()

    def _fill_buffer(self, in_data, frame_count, time_info, status_flags):
        """Continuously collect data from the audio stream, into the buffer."""
        self._buff.put(in_data)
        return None, pyaudio.paContinue

    def exit(self):
        self.__exit__()

    def generator(self):
        while not self.closed:
            # Use a blocking get() to ensure there's at least one chunk of
            # data, and stop iteration if the chunk is None, indicating the
            # end of the audio stream.
            chunk = self._buff.get()
            if chunk is None:
                return
            data = [chunk]

            # Now consume whatever other data's still buffered.
            while True:
                try:
                    chunk = self._buff.get(block=False)
                    if chunk is None:
                        return
                    data.append(chunk)
                except queue.Empty:
                    break

            yield b''.join(data)


def listen_print_loop(responses):
    """Iterates through server responses and prints them.
    The responses passed is a generator that will block until a response
    is provided by the server.
    """
    translation = ''
    source = ''
    for response in responses:
        # Once the transcription settles, the response contains the
        # END_OF_SINGLE_UTTERANCE event.
        if (response.speech_event_type ==
                SpeechEventType.END_OF_SINGLE_UTTERANCE):

            print(u'\nFinal translation: {0}'.format(translation))
            print(u'Final recognition result: {0}'.format(source))
            return 0

        result = response.result
        translation = result.text_translation_result.translation
        source = result.recognition_result

        print(u'\nPartial translation: {0}'.format(translation))
        print(u'Partial recognition result: {0}'.format(source))


def do_translation_loop():
    print('Begin speaking...')

    client = media.SpeechTranslationServiceClient()

    speech_config = media.TranslateSpeechConfig(
        audio_encoding='linear16',
        source_language_code='en-US',
        target_language_code='ja')

    config = media.StreamingTranslateSpeechConfig(
        audio_config=speech_config, single_utterance=True)

    # The first request contains the configuration.
    # Note that audio_content is explicitly set to None.
    first_request = media.StreamingTranslateSpeechRequest(
        streaming_config=config, audio_content=None)

    with MicrophoneStream(RATE, CHUNK) as stream:
        audio_generator = stream.generator()
        mic_requests = (media.StreamingTranslateSpeechRequest(
            audio_content=content,
            streaming_config=config)
            for content in audio_generator)

        requests = itertools.chain(iter([first_request]), mic_requests)

        responses = client.streaming_translate_speech(requests)

        # Print the translation responses as they arrive
        result = listen_print_loop(responses)
        if result == 0:
            stream.exit()


def main():
    while True:
        print()
        option = input('Press any key to translate or \'q\' to quit: ')

        if option.lower() == 'q':
            break

        do_translation_loop()


if __name__ == '__main__':
    main()
# [END media_translation_translate_from_mic]

This is the result. There is neither a translation nor a recognition result.

[Screenshot of the result]

I wasn't sure whether the problem was my microphone, so I tried similar sample code from another Google guide that translates an audio file. The result is the same: no recognition result and no translation.
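One way to rule the microphone in or out is to measure the level of the chunks the stream yields before they are sent anywhere. The sketch below is only an illustration: `rms_level` is a hypothetical helper, not part of the Google sample, and it assumes the chunks are 16-bit PCM in native byte order, which is what `format=pyaudio.paInt16` in the code above requests.

```python
import array
import math


def rms_level(chunk: bytes) -> float:
    """Return the RMS amplitude of a chunk of 16-bit PCM samples.

    Assumes signed 16-bit samples in native byte order.
    """
    samples = array.array("h")
    samples.frombytes(chunk)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# 1600 zero-valued samples: 100 ms of silence at 16 kHz.
silence = bytes(3200)
# A synthetic square wave standing in for audible input.
loud = array.array("h", [10000, -10000] * 800).tobytes()

print(rms_level(silence))
print(rms_level(loud))
```

Calling `rms_level(content)` on each item from `stream.generator()` and printing the value would show whether the input is all zeros (a muted or wrong device) or carries a real signal.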

Am I missing something?

Thank you very much.
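One thing that may be worth checking, offered as a hypothesis rather than a confirmed diagnosis: the code comment above says the first request carries the configuration, yet every microphone request in `mic_requests` is also built with `streaming_config=config`. To my understanding, `streaming_config` and `audio_content` are alternatives in a single protobuf `oneof` on `StreamingTranslateSpeechRequest`, so a request may end up carrying only one of them. The sketch below shows the intended shape of the stream, one config-only request followed by audio-only requests. Plain dicts stand in for the real request class so that it runs without the client library or credentials, and `build_requests` is a made-up name.

```python
import itertools


def build_requests(config, audio_chunks):
    """Return an iterator: one config-only request, then audio-only requests.

    Dicts are stand-ins for media.StreamingTranslateSpeechRequest.
    """
    first = {"streaming_config": config}
    audio = ({"audio_content": chunk} for chunk in audio_chunks)
    return itertools.chain([first], audio)


requests = list(
    build_requests({"single_utterance": True}, [b"\x00\x01", b"\x02\x03"])
)
for request in requests:
    print(sorted(request))
```

In the question's code, the equivalent change would be to drop `streaming_config=config` from the generator expression that builds `mic_requests`; whether that is what causes the empty results here is an assumption that would need to be tested.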

1. Can you try setting the rate to 8000? Microphone inputs usually have a sample rate of 8000 Hz. If that still doesn't work, you can try changing to RATE / 8 to get 100 ms per chunk.

2. Thanks for your comment, Ricco D. I tried setting RATE = 8000 and RATE / 8, but the result is still the same.

3. Can you try changing to audio_encoding='mulaw', since this encoding is also used by microphones? If that still doesn't work, you can try a different audio_encoding. You can check the supported audio_encoding values at cloud.google.com/translate/media/docs/reference/rpc /…
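The chunk-size arithmetic discussed in these comments can be checked with a short sketch. The helper names `frames_per_chunk` and `chunk_duration_ms` are invented for illustration and do not come from the sample or from any library.

```python
def frames_per_chunk(rate_hz: int, chunk_ms: int = 100) -> int:
    """Number of frames covering chunk_ms milliseconds at rate_hz."""
    return rate_hz * chunk_ms // 1000


def chunk_duration_ms(rate_hz: int, frames: int) -> float:
    """Duration in milliseconds of a chunk of `frames` frames at rate_hz."""
    return 1000 * frames / rate_hz


for rate in (16000, 8000):
    print(rate, frames_per_chunk(rate), chunk_duration_ms(rate, rate // 8))
```

By this arithmetic, `int(RATE / 10)` gives 100 ms chunks at any sample rate (1600 frames at 16000 Hz, 800 frames at 8000 Hz), while `RATE / 8` gives 125 ms chunks regardless of the rate.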
