Day in the Life of a vCon

To illustrate the normal operation of the Conserver, let’s follow along as a conversation is extracted and transformed, and the resulting data is provided to a business team. For this example, we’ll assume the Conserver is started and configured to take conversations from a Freeswitch system, transcribe them, look for a particular subject (recalls) and send those to a Postgres table for the operations team.

A customer and an agent have a conversation using Freeswitch. A Freeswitch adapter is running that monitors calls and requests recordings. For context, refer to https://developer.signalwire.com/compatibility-api/xml/ to see the kinds of call events and recording options.
When the call on Freeswitch ends, the adapter uses the data from the call (parties, recordings) to create a vCon. This vCon is then sent to the Conserver in a POST to the Conserver's API, naming the REDIS lists that feed each Conserver chain. Alternatively, the vCon can be inserted into REDIS directly, with the vCon UUID then added to each chain's ingress list.
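
The adapter's hand-off can be sketched as follows. This is a minimal illustration, not the adapter's actual code: the `/vcon` path, the `ingress_lists` query parameter, the base URL and the field names used in `build_vcon` are all assumptions made for the example, so check the API documentation for the real contract. The request is built but not sent, so the sketch runs without a server.

```python
import json
import uuid
from urllib.parse import urlencode
from urllib.request import Request


def build_vcon(parties, recording_url):
    """Assemble a minimal vCon-like dict from call data (field names are illustrative)."""
    return {
        "uuid": str(uuid.uuid4()),
        "parties": parties,
        "dialog": [
            {
                "type": "recording",
                "url": recording_url,
                "parties": list(range(len(parties))),
            }
        ],
        "analysis": [],
        "attachments": [],
    }


def build_post(base_url, vcon, ingress_lists):
    """Build (but do not send) the POST that hands a vCon to the conserver."""
    # Hypothetical endpoint and query parameter, assumed for this example.
    query = urlencode({"ingress_lists": ",".join(ingress_lists)})
    return Request(
        url=f"{base_url}/vcon?{query}",
        data=json.dumps(vcon).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


vcon = build_vcon(
    [
        {"tel": "+15550100", "role": "customer"},
        {"tel": "+15550101", "role": "agent"},
    ],
    "https://example.com/recordings/call.wav",
)
request = build_post("http://localhost:8000", vcon, ["recall_chain"])
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```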
This vCon is stored in REDIS as a JSON object under the UUID of the newly created vCon. By convention, the key is named with the pattern “vcon:uuid” (like vcon:7665-343535-58575-333).
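
The direct-to-REDIS path and the key convention can be sketched like this. `FakeRedis` is a small in-memory stand-in written for this example so that it runs without a server; it mimics only the `set`, `get` and `rpush` operations. Storing the vCon as a JSON string and the list name `recall_chain` are assumptions; a real deployment may store the JSON differently.

```python
import json


class FakeRedis:
    """In-memory stand-in for the few REDIS operations used here (not a real client)."""

    def __init__(self):
        self.values = {}
        self.lists = {}

    def set(self, key, value):
        self.values[key] = value

    def get(self, key):
        return self.values.get(key)

    def rpush(self, name, *items):
        self.lists.setdefault(name, []).extend(items)
        return len(self.lists[name])


def store_vcon(client, vcon, ingress_lists):
    """Store the vCon under vcon:<uuid>, then queue its UUID on each ingress list."""
    key = f"vcon:{vcon['uuid']}"
    client.set(key, json.dumps(vcon))
    for name in ingress_lists:
        client.rpush(name, vcon["uuid"])
    return key


client = FakeRedis()
vcon = {"uuid": "7665-343535-58575-333", "parties": [], "dialog": []}
key = store_vcon(client, vcon, ["recall_chain"])
```

Only the UUID goes onto the ingress list; the vCon body stays under its own key, so queue entries stay small.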
In addition to the standard parts of a vCon, the dialog and parties, the adapter adds a new attachment (in the attachments section of the vCon standard) that records which adapter created the vCon, along with details useful for debugging. This attachment travels inside the vCon throughout its life, unless it is explicitly stripped off later on.
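
A sketch of what such an attachment might look like. The attachment type `adapter_info`, the body fields and the helper names are hypothetical, chosen only to illustrate the idea of provenance data riding along inside the vCon.

```python
from datetime import datetime, timezone


def add_adapter_attachment(vcon, adapter_name, debug_details):
    """Append a provenance attachment describing the adapter that built the vCon."""
    attachment = {
        "type": "adapter_info",  # hypothetical type name
        "encoding": "json",
        "body": {
            "adapter": adapter_name,
            "created_at": datetime.now(timezone.utc).isoformat(),
            "debug": debug_details,
        },
    }
    vcon.setdefault("attachments", []).append(attachment)
    return attachment


def strip_attachments(vcon, attachment_type):
    """Remove every attachment of the given type, for example before export."""
    vcon["attachments"] = [
        a for a in vcon.get("attachments", []) if a.get("type") != attachment_type
    ]


vcon = {"uuid": "example-uuid", "parties": [], "dialog": [], "attachments": []}
add_adapter_attachment(vcon, "freeswitch", {"call_id": "abc123"})
```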
Based on a periodic timer, or triggered by an external API call, the conserver iterates over all of the processing chains. Each chain has a REDIS list that contains the vCons to be processed. On each tick, the conserver creates a task for each ID that is read from the list. Horizontal scaling is enabled by having a single REDIS cluster connected to multiple conservers. Each task iterates the vCon over the series of links in the chain.
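
One tick of that loop might look like the sketch below. It makes two assumptions for the sake of a runnable example: in-memory deques stand in for the REDIS ingress lists, and `asyncio` tasks stand in for whatever task mechanism the conserver actually uses. The chain names and the `process` body are placeholders.

```python
import asyncio
from collections import deque


async def process(chain_name, vcon_uuid, log):
    """Placeholder for running one vCon through the links of one chain."""
    log.append((chain_name, vcon_uuid))


async def tick(ingress, log):
    """Drain every chain's ingress queue, creating one task per vCon UUID."""
    tasks = []
    for chain_name, queue in ingress.items():
        while queue:
            vcon_uuid = queue.popleft()
            tasks.append(asyncio.create_task(process(chain_name, vcon_uuid, log)))
    if tasks:
        await asyncio.gather(*tasks)
    return len(tasks)


ingress = {
    "recall_chain": deque(["uuid-1", "uuid-2"]),
    "other_chain": deque(["uuid-3"]),
}
log = []
count = asyncio.run(tick(ingress, log))
```

Because each UUID is popped from the list before it is processed, several conservers reading from the same lists would each take different vCons, which is what makes the horizontal scaling described above possible.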
In this example chain, the first link is called “transcription” and, unsurprisingly, transcribes conversations. Links expect a vCon UUID as input and return vCon UUIDs as output. This allows links to be configured into chains, the output of one feeding the input of the next, and makes them freely interchangeable in order or vendor.
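
The UUID-in, UUID-out contract can be modeled as a plain callable, which is what makes links composable. In this sketch the `Link` type, `make_link`, `run_chain` and the link name "tagging" are all illustrative, and treating a `None` return as "stop this chain" is an assumption of the example rather than documented behavior.

```python
from typing import Callable, List, Optional

# A link takes a vCon UUID and returns a vCon UUID (or None to stop, by assumption).
Link = Callable[[str], Optional[str]]


def make_link(name: str, trace: List[str]) -> Link:
    """Build a trivial link that records its name and passes the UUID through."""

    def link(vcon_uuid: str) -> Optional[str]:
        trace.append(name)
        return vcon_uuid

    return link


def run_chain(vcon_uuid: str, links: List[Link]) -> Optional[str]:
    """Feed the output of each link into the next one, in order."""
    current: Optional[str] = vcon_uuid
    for link in links:
        if current is None:
            break
        current = link(current)
    return current


trace: List[str] = []
transcribe = make_link("transcription", trace)
tag = make_link("tagging", trace)

first = run_chain("uuid-1", [transcribe, tag])
second = run_chain("uuid-1", [tag, transcribe])  # same links, different order
```

Since every link shares one signature, reordering a chain or swapping one vendor's link for another is a configuration change rather than a code change.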
The transcription link (currently there are two versions to choose from, Whisper.ai and Deepgram) takes the dialog section of the vCon (which holds the recorded voice) and transcribes it. The transcription is added to the vCon in the “analysis” section, and normally contains a complete transcript, plus a confidence score and a time stamp for every word transcribed. The link then updates the stored vCon with this new analysis, using REDIS to avoid reading or copying the large data objects in the dialog.
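
A sketch of the shape of such a link. The vendor call is replaced by `fake_transcribe`, which returns canned data, and the analysis field names (`type`, `dialog`, `vendor`, `body`, `words`) are assumptions made for illustration. A plain dict stands in for the vCon store, so this sketch does not show the REDIS-side update that avoids copying the dialog.

```python
def fake_transcribe(recording_url):
    """Stand-in for a vendor API call; returns canned word-level results."""
    return [
        {"word": "about", "confidence": 0.98, "start": 0.0, "end": 0.3},
        {"word": "the", "confidence": 0.99, "start": 0.3, "end": 0.4},
        {"word": "recall", "confidence": 0.95, "start": 0.4, "end": 0.9},
    ]


def transcription_link(store, vcon_uuid, vendor="example-vendor"):
    """Transcribe each dialog and append the result to the vCon's analysis section."""
    vcon = store[vcon_uuid]
    for index, dialog in enumerate(vcon["dialog"]):
        words = fake_transcribe(dialog["url"])
        vcon["analysis"].append(
            {
                "type": "transcript",
                "dialog": index,  # which dialog entry this analysis describes
                "vendor": vendor,
                "body": {
                    "transcript": " ".join(w["word"] for w in words),
                    "words": words,
                },
            }
        )
    return vcon_uuid  # links return the UUID, not the vCon itself


store = {
    "uuid-1": {
        "uuid": "uuid-1",
        "dialog": [{"type": "recording", "url": "https://example.com/call.wav"}],
        "analysis": [],
    }
}
result = transcription_link(store, "uuid-1")
```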
The conserver is responsible for the ordering and execution of each link in the chain. It is not a requirement that a link be used once; it may be repeated several times within and between chains.
The second link in the chain is called “recall finder”, and uses the output of the transcription link. When it is called by the conserver, it loads the transcription analysis and looks for the word “recall” in the conversation. If it does not find the word, it can simply exit without creating any message for the downstream link, effectively ending the processing of that chain of links.
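
A minimal sketch of a link like this, assuming the transcript lives in an analysis entry of type `transcript` with the text under `body.transcript` (both assumptions), and that returning `None` is how a link signals that the chain should stop. The pattern also matches variants such as "recalls" and "recalled".

```python
import re

# Case-insensitive match on words starting with "recall".
RECALL = re.compile(r"\brecall", re.IGNORECASE)


def recall_finder(store, vcon_uuid):
    """Return the UUID if any transcript mentions a recall, otherwise None."""
    vcon = store[vcon_uuid]
    for analysis in vcon.get("analysis", []):
        if analysis.get("type") != "transcript":
            continue
        if RECALL.search(analysis["body"]["transcript"]):
            return vcon_uuid
    return None  # nothing to pass downstream, so the chain ends here


store = {
    "uuid-1": {
        "analysis": [
            {"type": "transcript", "body": {"transcript": "I am calling about the Recall notice"}}
        ]
    },
    "uuid-2": {
        "analysis": [
            {"type": "transcript", "body": {"transcript": "I would like to book a service"}}
        ]
    },
}
hit = recall_finder(store, "uuid-1")
miss = recall_finder(store, "uuid-2")
```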
At this point, the vCon has been created, captured, transcribed and identified as having the information we want: it’s a recall conversation. For information systems that want a native JSON representation, the vCon can now be sent for consumption. For instance, it could now be sent via a web hook (HTTP POST) to any API endpoint. In like manner, it can be stored in any of the storage blocks; current options include the Mongo Database, REDIS, S3, Postgres or a local file system.
If the final destination has a fixed schema, like a Postgres database, a Google Spreadsheet or a no-code tool, we need to create a “projection” for this data before the “recall finder” is done. A projection is a simple key-value store of the important information, determined by the use case. For illustration, assume we are interested in sending the transcription, the identity of the agent and the customer, and when the conversation happened. This projection, which corresponds directly to a single database row with four columns (transcription, agent, customer, created at), is added to the vCon, just as the transcription analysis was. At this point, the original vCon has an attachment from the adapter, an analysis by the transcriber, and this new projection.
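
Building the four-column projection described above might look like this. The party fields (`role`, `name`), the top-level `created_at` field and the choice to store the projection as an analysis entry of type `projection` are assumptions for the sake of a concrete example.

```python
def build_projection(vcon):
    """Flatten the parts of a vCon we care about into one key-value row."""
    transcript = next(
        a["body"]["transcript"] for a in vcon["analysis"] if a["type"] == "transcript"
    )
    by_role = {party["role"]: party for party in vcon["parties"]}
    return {
        "transcription": transcript,
        "agent": by_role["agent"]["name"],
        "customer": by_role["customer"]["name"],
        "created_at": vcon["created_at"],
    }


def add_projection(vcon):
    """Attach the projection to the vCon alongside the other analysis entries."""
    projection = build_projection(vcon)
    vcon["analysis"].append({"type": "projection", "body": projection})
    return projection


vcon = {
    "uuid": "uuid-1",
    "created_at": "2023-05-01T14:30:00+00:00",
    "parties": [
        {"role": "customer", "name": "Pat Customer"},
        {"role": "agent", "name": "Alex Agent"},
    ],
    "analysis": [
        {"type": "transcript", "body": {"transcript": "about the recall"}}
    ],
}
projection = add_projection(vcon)
```

Keeping the projection flat, with one key per destination column, means the downstream link needs no knowledge of the vCon structure.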
The final link is a Postgres projection. When it runs, it looks for projections on a vCon, then uses that information to add (or upsert) a new row into a configured Postgres table. From the perspective of the business users of the data, they simply see rows of transcribed conversations that match a criterion. Data projections, like adapters, handle the differences between destinations: unique data projections are required for different kinds of relational databases, no-code tools, Google Sheets, etc.
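
The upsert step can be sketched as below. To keep the example runnable without a database server, it uses Python's built-in `sqlite3` as a stand-in for Postgres; the `INSERT ... ON CONFLICT ... DO UPDATE` form is also accepted by Postgres, although a Postgres driver would use different parameter placeholders. The table name `recalls` and its columns are hypothetical.

```python
import sqlite3

UPSERT = """
INSERT INTO recalls (vcon_uuid, transcription, agent, customer, created_at)
VALUES (:vcon_uuid, :transcription, :agent, :customer, :created_at)
ON CONFLICT (vcon_uuid) DO UPDATE SET
    transcription = excluded.transcription,
    agent = excluded.agent,
    customer = excluded.customer,
    created_at = excluded.created_at
"""


def upsert_projection(conn, vcon_uuid, projection):
    """Insert the projection as a row, or update the row if the vCon is already there."""
    conn.execute(UPSERT, {"vcon_uuid": vcon_uuid, **projection})
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recalls ("
    "vcon_uuid TEXT PRIMARY KEY, transcription TEXT, "
    "agent TEXT, customer TEXT, created_at TEXT)"
)

row = {
    "transcription": "about the recall",
    "agent": "Alex Agent",
    "customer": "Pat Customer",
    "created_at": "2023-05-01T14:30:00+00:00",
}
upsert_projection(conn, "uuid-1", row)
# Running the link again for the same vCon updates the row instead of duplicating it.
upsert_projection(conn, "uuid-1", {**row, "transcription": "corrected transcript"})
rows = conn.execute("SELECT vcon_uuid, transcription FROM recalls").fetchall()
```

Keying the row on the vCon UUID makes the link safe to re-run, which matters because a link may be repeated within and between chains.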

