feat!: text-to-speech x LLM integration #936

Open
IgorSwat wants to merge 3 commits into main from @is/llm-to-speech

Conversation

@IgorSwat
Contributor

@IgorSwat IgorSwat commented Mar 4, 2026

Description

This pull request introduces a few changes to the Text-to-Speech module:

  • Improved streaming mode by allowing an incrementally expanded text input. This change focuses on integrating T2S with text generation models (e.g. Llama 3.2).
  • Added simple test cases for T2S module.
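One way to picture how incrementally generated text can feed a speech synthesizer is to buffer the LLM output and release only complete sentences. The sketch below is illustrative only: `SentenceBuffer`, `push`, and `flush` are hypothetical names, not part of this PR or of react-native-executorch, and the sentence-boundary rule is a simplifying assumption.

```typescript
// Hypothetical helper for illustration; not the PR's implementation.
class SentenceBuffer {
  private pending = "";

  // Append newly generated text and return any sentences that are now complete.
  push(fragment: string): string[] {
    this.pending += fragment;
    const sentences: string[] = [];
    // Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.pending)) !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.pending.slice(start, end).trim());
      start = end;
    }
    // Keep the unfinished tail for the next call.
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Return whatever text is left once generation has finished.
  flush(): string {
    const rest = this.pending.trim();
    this.pending = "";
    return rest;
  }
}
```

Each token produced by the text generation model would be passed to `push`, any returned sentences handed to the synthesizer, and `flush` called once generation ends so the trailing text is not lost.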

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

To test the Text-to-Speech module, run the set of tests for this module.
To test the new streaming mode and its integration with text generation models, use the 'text-to-speech-llm' demo app.

Screenshots

Related issues

#773
#897

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

@IgorSwat IgorSwat changed the title from "@is/llm to speech" to "feat: text-to-speech x LLM integration & text-to-speech tests" Mar 4, 2026
@IgorSwat IgorSwat changed the title from "feat: text-to-speech x LLM integration & text-to-speech tests" to "feat: text-to-speech x LLM integration" Mar 4, 2026
@IgorSwat IgorSwat added the "test" and "feature" labels Mar 5, 2026
await this.nativeModule.stream(
  speed,
  stopAutomatically,
  (audio: number[]) => {
Collaborator

I think we should return JSTensorViewOut from the native side, so there's less copying going on
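The concern about copying can be illustrated in plain TypeScript, independent of the library. This sketch does not use `JSTensorViewOut` and makes no claim about how it is implemented; it only shows the general difference between converting audio samples to `number[]`, which duplicates every element, and exposing a typed-array view, which shares the original buffer.

```typescript
// Stand-in for audio samples produced on the native side (illustrative values).
const samples = new Float32Array([0.1, 0.2, 0.3, 0.4]);

// Copy: Array.from allocates a new array and converts every element.
const copied: number[] = Array.from(samples);

// View: subarray shares the underlying ArrayBuffer, so nothing is copied.
const view: Float32Array = samples.subarray(1, 3);

// A later write to the source is visible through the view but not the copy.
samples[1] = 0.9;
const viewSeesWrite = view[0] === samples[1];
const copySeesWrite = copied[1] === samples[1];
```

For long audio streams the copy costs time and memory proportional to the number of samples on every callback, which is the overhead the review comment is pointing at.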

@msluszniak msluszniak changed the title from "feat: text-to-speech x LLM integration" to "feat!: text-to-speech x LLM integration" Mar 10, 2026
@IgorSwat IgorSwat force-pushed the @is/llm-to-speech branch 2 times, most recently from 846b177 to 49b5656 March 11, 2026 13:44
@msluszniak msluszniak requested review from chmjkb and msluszniak March 11, 2026 14:29
@msluszniak
Member

I'm getting linker errors in the tests:

ld.lld: error: undefined symbol: phonemis::Pipeline::Pipeline(phonemis::phonemizer::Lang, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)
>>> referenced by Kokoro.cpp:21 (/Users/msluszniak/test_factory/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/models/text_to_speech/kokoro/Kokoro.cpp:21)
>>>               CMakeFiles/TextToSpeechTests.dir/Users/msluszniak/test_factory/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/models/text_to_speech/kokoro/Kokoro.cpp.o:(rnexecutorch::models::text_to_speech::kokoro::Kokoro::Kokoro(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&, std::__ndk1::shared_ptr<facebook::react::CallInvoker>))

ld.lld: error: undefined symbol: phonemis::Pipeline::process(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>> const&)
>>> referenced by Kokoro.cpp:166 (/Users/msluszniak/test_factory/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/models/text_to_speech/kokoro/Kokoro.cpp:166)
>>>               CMakeFiles/TextToSpeechTests.dir/Users/msluszniak/test_factory/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/models/text_to_speech/kokoro/Kokoro.cpp.o:(rnexecutorch::models::text_to_speech::kokoro::Kokoro::generate(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char>>, float))
>>> referenced by Kokoro.cpp:225 (/Users/msluszniak/test_factory/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/models/text_to_speech/kokoro/Kokoro.cpp:225)
>>>               CMakeFiles/TextToSpeechTests.dir/Users/msluszniak/test_factory/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/models/text_to_speech/kokoro/Kokoro.cpp.o:(rnexecutorch::models::text_to_speech::kokoro::Kokoro::stream(float, bool, std::__ndk1::shared_ptr<facebook::jsi::Function>))
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
make[2]: *** [TextToSpeechTests] Error 1
make[1]: *** [CMakeFiles/TextToSpeechTests.dir/all] Error 2
make: *** [all] Error 2

@msluszniak
Member

msluszniak commented Mar 11, 2026

All the stories generated in the example app are almost identical. Do we want it this way?

@IgorSwat
Contributor Author

IgorSwat commented Mar 11, 2026

All the stories generated in the example app are almost identical. Do we want it this way?

Well, to be honest, I created those sample apps as simple-to-use testing tools, since it's quite difficult to write proper tests for text-to-speech.

We could remove this particular screen if we are sure that everything works fine.

Get errors in tests:

Fixed. I forgot to update a reference to a CMake variable that no longer exists.

@msluszniak
Member

Well, to be honest, I created those sample apps as simple-to-use testing tools, since it's quite difficult to write proper tests for text-to-speech.

We could remove this particular screen if we are sure that everything works fine.

Let's keep it. I think it's better to have it, even if it's very simple.

@IgorSwat IgorSwat force-pushed the @is/llm-to-speech branch from 2837d23 to f623cd6 March 11, 2026 21:10
@msluszniak msluszniak self-requested a review March 12, 2026 09:52
@msluszniak
Member

FYI, I pushed one change:

// CommonModelTest is not instantiated in this translation unit
namespace rnexecutorch::model_tests {
GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST(CommonModelTest);
} // namespace rnexecutorch::model_tests

This is because otherwise one test failed:

/Users/msluszniak/test_factory/react-native-executorch/packages/react-native-executorch/common/rnexecutorch/tests/integration/BaseModelTests.h:121: Failure
Type parameterized test suite CommonModelTest is defined via REGISTER_TYPED_TEST_SUITE_P, but never instantiated via INSTANTIATE_TYPED_TEST_SUITE_P. None of the test cases will run.

Ideally, TYPED_TEST_P definitions should only ever be included as part of binaries that intend to use them. (As opposed to, for example, being placed in a library that may be linked in to get other utilities.)

To suppress this error for this test suite, insert the following line (in a non-header) in the namespace it is defined in:

GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST(CommonModelTest);

[  FAILED  ] GoogleTestVerification.UninstantiatedTypeParameterizedTestSuite<CommonModelTest> (0 ms)
[----------] 1 test from GoogleTestVerification (0 ms total)

[----------] Global test environment tear-down
[==========] 6 tests from 3 test suites ran. (12590 ms total)
[  PASSED  ] 5 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] GoogleTestVerification.UninstantiatedTypeParameterizedTestSuite<CommonModelTest>

and GoogleTest's advanced.md states that:

Additionally, by default, every `TEST_P` without a corresponding
`INSTANTIATE_TEST_SUITE_P` causes a failing test in test suite
`GoogleTestVerification`. If you have a test suite where that omission is not an
error, for example it is in a library that may be linked in for other reasons or
where the list of test cases is dynamic and may be empty, then this check can be
suppressed by tagging the test suite: [...]

And Claude confirms this:

The Kokoro test file includes BaseModelTests.h (which defines CommonModelTest) but never instantiates it. Since gtest requires the suppressor to be in a non-header, add it to the Kokoro test file:


#include "BaseModelTests.h"
#include "utils/TestUtils.h"
// ... other includes

// CommonModelTest is defined in BaseModelTests.h but not instantiated here
// (Kokoro tests use KokoroTest fixture directly instead)
namespace rnexecutorch::model_tests {
GTEST_ALLOW_UNINSTANTIATED_PARAMETERIZED_TEST(CommonModelTest);
} // namespace rnexecutorch::model_tests
Add those lines right after the #include block, before the anonymous namespace. This tells gtest the omission is intentional for this translation unit.

@IgorSwat IgorSwat force-pushed the @is/llm-to-speech branch from 8258808 to 5314404 March 13, 2026 15:07
@IgorSwat IgorSwat force-pushed the @is/llm-to-speech branch from 5314404 to 5bfed1c March 13, 2026 15:12
Labels

  • feature: PRs that implement a new feature
  • test: Issue and PR related to tests or testing infrastructure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • Add tests for Text to Speech module
  • Integrate Text-to-Speech with LLMs

3 participants