Voice, calls, and video calls
We provide a few default voices per gender, and subscribers can also create their own voice from voice samples. We recommend using ElevenLabs, our audio partner, to generate AI voices. You can have only one custom voice per Kindroid; to create a new custom voice, you must first delete your old one.
Creating a custom voice requires audio samples, and you must own the rights to the samples you upload. Quality matters much more than quantity - a minute or so of high-quality audio is sufficient, and more than 2 minutes is not necessary. Ensure that the samples show a good degree of variance, as the process will capture the variance in tone and style in the samples. You can use samples with custom accents or in foreign languages - all of those traits will be captured in the custom voice. Sample quality is the most important thing - err on the side of a few high-quality samples rather than many mediocre ones.
Once you have the samples, you can fine-tune the voice with sliders. You should experiment on your own, but we generally find the defaults acceptable for most cases.
You can click the play button to hear a message's audio. Note that this can only be run once per message unless the message is regenerated. Words within (parentheses) are intentionally not spoken aloud, so if you prefer actions not to be spoken out loud, use (parentheses) to denote them. All other formatting, such as *asterisks*, will be spoken aloud.
Technical Note: The statement about words in (parentheses) does not apply to voice or video calls.
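As an illustration only, the parentheses rule above can be modeled with a small Python sketch. The function name `spoken_text` and the `in_call` flag are hypothetical, and this is an assumption about the observable behavior, not Kindroid's actual implementation.

```python
import re


def spoken_text(message: str, in_call: bool = False) -> str:
    """Return the part of a message that would be read aloud (hypothetical model)."""
    if in_call:
        # Per the technical note, the parentheses rule does not apply to calls.
        return message
    # Drop every (parenthesized) segment; nested parentheses are not handled.
    without_parens = re.sub(r"\([^()]*\)", "", message)
    # Collapse the double spaces left behind by the removal.
    return re.sub(r"[ \t]+", " ", without_parens).strip()


print(spoken_text("Hello (waves) there"))
print(spoken_text("*smiles* Hi"))  # asterisks are left in place
```

In this model, text chat audio skips "(waves)", while the same message in a call is returned unchanged.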
In general settings -> account wide, you can turn on autoplay audio for messages that you receive. This applies to single chats as well as groupchats.
Voice calls can be conducted in many languages, though for the highest intelligence we currently recommend using English. All audio (both microphone input and audio output) and video are processed ephemerally and aren't stored.
Technical Note: Multi-paragraph mode is disabled, which means that any new paragraphs or lines will cause your Kindroid to stop speaking.
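To make the note above concrete, here is a minimal Python sketch of what "stops speaking at a new paragraph or line" could mean for a reply. The helper name `spoken_portion` is hypothetical, and ignoring leading blank lines is an assumption; the real call pipeline may behave differently.

```python
def spoken_portion(reply: str) -> str:
    """Return the text up to the first line break (hypothetical model)."""
    stripped = reply.strip()
    if not stripped:
        return ""
    # Everything after the first new line or paragraph is assumed to be cut off.
    return stripped.splitlines()[0].rstrip()


print(spoken_portion("Hi there.\n\nThis second paragraph would not be spoken."))
```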
Voice call uses the same backstory and key memories, and can recall from long term memory and journals just like text chat. In voice call settings (gear icon on the top right), there is a unified chat/voice chat history toggle that affects how memory works in voice calls.
If unified chat/voice chat history is enabled, the voice call will share the identical chat history as the text chat. This lets you switch back and forth, and is useful if you see voice call as a continuation of text chat and vice versa rather than a separate mode. When you return to text chat, your Kindroid will be able to reference what most recently occurred in the voice call and you can continue in text chat (though voice call messages will not show up in text chat message bubbles). Shared memory in groupchats will work the same way as it does in text chat, if both shared memory in the group and this toggle are enabled.
If unified chat/voice chat history is disabled, voice call will be treated as a completely separate instance. Voice call will default to a blank slate chat history and will not recall any context from text chat. There is a temporary voice call memory that keeps a record of the call transcript; in the event the call is dropped, or you press end call and restart it (without going to text chat), you can resume the call and pick up where you left off. The temporary call history is reset if you engage in text chat in any way or do a chat break.
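The two toggle states described above can be summarized in a small Python sketch. This is a simplified mental model, not Kindroid's implementation: the class `KindroidSession` and all of its fields and methods are hypothetical names, and long term memory, journals, and groupchats are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class KindroidSession:
    """Hypothetical model of the unified chat/voice chat history toggle."""
    unified_history: bool
    text_history: list[str] = field(default_factory=list)
    temp_call_history: list[str] = field(default_factory=list)

    def call_context(self) -> list[str]:
        # Enabled: the call shares the text chat history.
        # Disabled: the call only sees its own temporary transcript.
        return self.text_history if self.unified_history else self.temp_call_history

    def add_call_message(self, message: str) -> None:
        self.call_context().append(message)

    def add_text_message(self, message: str) -> None:
        self.text_history.append(message)
        # Any text chat activity resets the temporary call history.
        self.temp_call_history.clear()

    def chat_break(self) -> None:
        # Only the effect on the temporary call history is modeled here.
        self.temp_call_history.clear()


separate = KindroidSession(unified_history=False)
separate.add_text_message("text: hello")
separate.add_call_message("call: hi again")
print(separate.call_context())  # the call does not see the text message
```

With `unified_history=True`, the same sequence would leave both messages in one shared history, which matches the "switch back and forth" behavior described above.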
Voice call does consolidate into long term memory (provided it's not disabled at the Kindroid level) regardless of whether unified chat/voice chat history is enabled. Long term memory is different from chat history/short term memory. Contents from the voice call may be recalled in text chat when the context for recall is similar, but may need specific prompting to refer to that memory. Your voice messages also recall journal entries. For more details on memory and specifics, see Memory.
You can turn on video in the bottom left corner and drag your video feed around the screen. Your Kindroid will then be able to see, but be aware that, due to processing load, you should ensure that anything you show stays on the screen for some time, and give your Kindroid enough time to process what it sees before ending your turn.
If you're on a desktop browser, you can also use the screen share function to share your computer screen. This is currently not available on mobile phones/apps due to operating system level restrictions. Screen share takes on the aspect ratio of your current viewport/shared window, and so can deviate from the video call's square aspect ratio.
Click on the CC icon in voice calls to toggle transcripts. Transcripts only persist for the voice call session while you're on the page, and will reset if you go to another page or screen. Starting or hanging up calls will not reset transcripts, which helps them persist through accidental call drops or errors.
During the AI's turn, you can interrupt and talk again by pressing the central microphone/speaker button. You can only do this during the AI's turn, and the call transcript will still contain what they were supposed to say, as if they had finished speaking.
For calls, you can also use text input, in the bottom left corner, if you don't wish to speak but still want your Kindroid to speak back to you. Text input is only available when press to speak is on and the microphone/press to speak button is idle.