Generative Content Skills
You may wish to create a Base Corpus whose content is generated programmatically. This can be done using a machine learning model such as GPT, or using more traditional scripted code.
This approach gives you full control over the creation of a Digital Person's response to each user input, and lets you use whichever programmatic tools best suit your use case.
Streaming with Skills Async API
The Skills Async API can be useful when implementing a generative content skill. A common scenario is using an LLM (Large Language Model) or integrating with an existing generative API; these models and APIs often support streaming, i.e. they generate a stream of tokens. The Async API allows you to stream messages as they are generated, without having to wait for the full response to finish. This improves latency and responsiveness, resulting in a better user experience.
Here we explore an example that illustrates the basics of streaming, using the Skills Async API.
Getting started
You can download or clone a Skills Async API template app from GitHub:
The code that is walked through in this document can be applied to the template.
Streaming API mock
Let's start with a mock implementation of a generative API that simulates the streaming of chunks.
- Python
- Nodejs
import random
from time import sleep

def response_generator(input_text):
    """
    Generates a response based on the input and gradually streams it
    in chunks of varying sizes.
    """
    response = make_response(input_text)
    while len(response) > 0:
        chunk, response = get_chunk(response)
        is_final = len(response) == 0
        yield chunk, is_final
        if not is_final:
            sleep(0.1)  # sleep between each chunk to simulate latency

def make_response(input_text):
    """
    Respond by copying the input multiple times, concatenating as separate sentences.
    """
    # repeat the input sentence a random number of times
    times_to_repeat = random.randint(2, 4)
    response = ". ".join([input_text] * times_to_repeat)
    return response

def get_chunk(text):
    """
    Take out a chunk of a random size from the given text and return
    both the chunk and the remaining text.
    """
    chunk_size = random.randint(3, 5)  # generate a random chunk size between 3 and 5
    chunk = text[:chunk_size]
    text = text[chunk_size:]
    return chunk, text  # return both the chunk and the updated text
function getRandomInt(min: number, max: number): number {
  return Math.floor(Math.random() * (max - min + 1) + min);
}

function delay(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Generates a response based on the input and gradually streams it in chunks of varying sizes.
async function* responseGenerator(
  inputText: string,
): AsyncIterableIterator<[string, boolean]> {
  let response = makeResponse(inputText);
  while (response.length > 0) {
    const [chunk, remaining] = getChunk(response);
    response = remaining;
    const isFinal = response.length === 0;
    yield [chunk, isFinal];
    if (!isFinal) {
      await delay(100); // sleep for 100ms to simulate latency
    }
  }
}

function makeResponse(inputText: string): string {
  const timesToRepeat = getRandomInt(2, 4); // generate repetition count randomly between 2 and 4
  return Array(timesToRepeat).fill(inputText).join('. ');
}

function getChunk(text: string): [string, string] {
  const chunkSize = getRandomInt(3, 5); // determine a random chunk size between 3 and 5
  const chunk = text.substring(0, chunkSize);
  const remainingText = text.substring(chunkSize);
  return [chunk, remainingText]; // return both the chunk and the remaining text
}
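Before wiring the mock into a skill, it can help to sanity-check it on its own. The snippet below (a standalone Python sketch restating the mock's logic from above) consumes the chunk generator and verifies that reassembling the chunks reproduces the full response, and that only the last chunk is flagged as final:

```python
import random
from time import sleep

def make_response(input_text):
    # repeat the input sentence a random number of times (2 to 4)
    times_to_repeat = random.randint(2, 4)
    return ". ".join([input_text] * times_to_repeat)

def get_chunk(text):
    # take a chunk of 3 to 5 characters off the front of the text
    chunk_size = random.randint(3, 5)
    return text[:chunk_size], text[chunk_size:]

def response_generator(input_text):
    response = make_response(input_text)
    while len(response) > 0:
        chunk, response = get_chunk(response)
        is_final = len(response) == 0
        yield chunk, is_final
        if not is_final:
            sleep(0.1)  # simulate latency between chunks

# Consume the stream and reassemble the chunks
chunks = list(response_generator("Hello there"))
full_text = "".join(chunk for chunk, _ in chunks)
print(full_text)
```

The exact output varies because the repetition count and chunk sizes are random, but the reassembled text always starts with the input sentence, and only the last chunk carries `is_final == True`.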
Generate sentences from chunks
We want to respond with coherent sentences, rather than with individual chunks of the response, so we put the chunks together into sentences:
- Python
- Nodejs
import re

class ResponseData:
    def __init__(self, text: str, is_final: bool = True):
        self.text = text
        self.is_final = is_final

def sentence_generator(response):
    text = ""
    for chunk, is_final in response:
        text += chunk
        sentences = re.split(r'(?<=[^\d][.!?])\s+', text)
        for sentence in sentences[:-1]:
            yield ResponseData(text=sentence.strip(), is_final=False)
        text = sentences[-1] if sentences else ""
        if is_final:
            yield ResponseData(text=text.strip(), is_final=True)
class ResponseData {
  text: string;
  isFinal: boolean;

  constructor(text: string, isFinal: boolean = true) {
    this.text = text;
    this.isFinal = isFinal;
  }
}

async function* sentenceGenerator(
  response: AsyncIterableIterator<[string, boolean]>,
): AsyncIterableIterator<ResponseData> {
  let text = '';
  for await (const [chunk, isFinal] of response) {
    text += chunk;
    const sentences = text.split(/(?<=[^\d][.!?])\s+/);
    for (const sentence of sentences.slice(0, -1)) {
      yield new ResponseData(sentence.trim(), false);
    }
    text = sentences.length ? sentences[sentences.length - 1] : '';
    if (isFinal) {
      yield new ResponseData(text.trim(), true);
    }
  }
}
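To see how chunk boundaries that do not line up with sentence boundaries are handled, you can drive the sentence generator with a fixed, hand-written chunk stream. This is a self-contained Python sketch of the synchronous variant shown above; the chunk values are made up for illustration:

```python
import re

class ResponseData:
    def __init__(self, text, is_final=True):
        self.text = text
        self.is_final = is_final

def sentence_generator(response):
    text = ""
    for chunk, is_final in response:
        text += chunk
        # split on whitespace preceded by a sentence terminator (not after a digit)
        sentences = re.split(r'(?<=[^\d][.!?])\s+', text)
        for sentence in sentences[:-1]:
            yield ResponseData(text=sentence.strip(), is_final=False)
        text = sentences[-1] if sentences else ""
        if is_final:
            yield ResponseData(text=text.strip(), is_final=True)

# A fixed chunk stream whose boundaries fall mid-sentence
chunks = [("Hi the", False), ("re. How", False), (" are you? I", False), (" am fine.", True)]
results = [(r.text, r.is_final) for r in sentence_generator(chunks)]
print(results)
# → [('Hi there.', False), ('How are you?', False), ('I am fine.', True)]
```

Note how each sentence is emitted only once its terminator and the following whitespace have arrived, and only the last sentence of the turn is marked final.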
Stream the sentences
Now that we can get sentences from our mock generative API, we can stream those responses via the Async API:
- Python
- Nodejs
import uuid

from smskillsdk.models.api_async import UserConversationMessage, SkillConversationMessage
from utils.ws_connection import Connection

def conversation_handler(connection: Connection, msg: UserConversationMessage):
    input = msg.text
    turn_id = msg.variables.get("Turn_Id") if msg.variables else None
    if not turn_id:
        turn_id = str(uuid.uuid4())
    response = response_generator(input)
    sentences = sentence_generator(response)
    for sentence in sentences:
        vars = {
            "Turn_Id": turn_id,
            "isFinalSentence": sentence.is_final,
        }
        connection.send(SkillConversationMessage(text=sentence.text, variables=vars))
import {
  UserConversationMessage,
  SkillConversationMessage,
} from '@soulmachines/smskillsdk';
import { Connection } from 'utils/ws_connection';
import { v4 as uuidv4 } from 'uuid';

export async function conversationHandler(
  connection: Connection,
  msg: UserConversationMessage,
) {
  console.log('Conversation handler got:', msg);
  const input = msg.text;
  const vars: Record<string, any> = msg.variables;
  let turnId = vars?.Turn_Id;
  if (!turnId) {
    turnId = uuidv4();
  }
  const response = responseGenerator(input);
  const sentences = sentenceGenerator(response);
  for await (const sentence of sentences) {
    const variables = {
      Turn_Id: turnId,
      isFinalSentence: sentence.isFinal,
    };
    const convResponse: SkillConversationMessage = {
      text: sentence.text,
      variables: variables,
    };
    await connection.send('skillConversation', convResponse);
  }
}
Note that we add the Turn_Id and isFinalSentence variables. These tell the Digital Person that the messages belong to the same turn, and indicate when the final sentence of the turn has been sent. This is important for allowing responses to be cancelled early, and it facilitates better handling of the streamed messages.