Create your subtitles with Amazon Transcribe, AWS Lambda, Amazon S3 and Amazon SNS.
Machine learning is now used daily for many purposes; one of them is understanding speech and converting it into text. Amazon Transcribe is a machine learning-based service that transcribes audio files using Automatic Speech Recognition (ASR), a deep learning process, together with Natural Language Processing (NLP). Amazon Transcribe can distinguish between speakers and identify the spoken language from over 30 different languages, including Arabic, Japanese, Russian and Chinese. It supports formats such as AMR, FLAC, MP3, MP4, Ogg, WebM and WAV, which makes it suitable for automating closed captioning and subtitling of media.
Today we will see how to use this service to create subtitles for our favorite videos. To build our architecture we will combine it with other AWS services: Amazon S3 to upload our audio/video files and to store the generated subtitle files, AWS Lambda to start the conversion via Amazon Transcribe, and an Amazon EventBridge rule to send us an email via Amazon SNS when the conversion job is finished.
The first thing to do is to create two S3 buckets: one for uploading the files to be transcribed and one where Amazon Transcribe will store the resulting transcript files (a single bucket would also work, but I like to keep the files separate). Remember that S3 bucket names are globally unique, so choose a name that is not already in use. We can create the buckets through the AWS console or from the terminal; in either case, create each bucket first and then block public access on it.
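The two steps above (create the bucket, then block public access) can also be sketched with boto3. This is a minimal sketch and not the article's original commands; the helper names are hypothetical, and running `create_private_bucket` assumes boto3 is installed and AWS credentials are configured.

```python
def create_bucket_kwargs(name, region):
    """Build the keyword arguments for the S3 CreateBucket call."""
    kwargs = {"Bucket": name}
    # us-east-1 is the default location and must not be passed as a constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


# Settings that block every form of public access on a bucket.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}


def create_private_bucket(name, region):
    """Create the bucket, then block public access on it (calls AWS)."""
    import boto3  # imported lazily so the helpers above work without boto3

    s3 = boto3.client("s3", region_name=region)
    s3.create_bucket(**create_bucket_kwargs(name, region))
    s3.put_public_access_block(
        Bucket=name, PublicAccessBlockConfiguration=PUBLIC_ACCESS_BLOCK
    )
```

Calling `create_private_bucket` once for the input bucket and once for the output bucket, each with a name of your choice, would reproduce the setup described above.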
Now let’s create the Lambda function that will start a transcription job in Amazon Transcribe as soon as a new file is uploaded to the input bucket we created earlier. Before doing this we need to create a role, with the right policies attached, that grants the function access to Amazon Transcribe and to the S3 buckets. Open the Identity and Access Management (IAM) service in the console, select Roles and then ‘Create role’.
As ‘Trusted entity type’ select AWS service and as ‘Use case’ select Lambda. Add the ‘AmazonTranscribeFullAccess’ permission policy, then create a custom policy that grants access to the previously created output bucket.
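As an illustration of what such a custom policy could look like, the sketch below builds a policy document in Python and prints it as JSON for pasting into the IAM policy editor. It is not the article's original policy: the bucket names are placeholders, the choice of `s3:PutObject` for the output bucket is an assumption, and the optional read statement for the input bucket is an extra that the article does not mention.

```python
import json


def bucket_access_policy(output_bucket, input_bucket=None):
    """Return an IAM policy document as a dict (a sketch, not the original)."""
    statements = [
        {
            # Assumed action: allow writing transcripts to the output bucket.
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": f"arn:aws:s3:::{output_bucket}/*",
        }
    ]
    if input_bucket is not None:
        # Optional assumption: allow reading the uploaded media files.
        statements.append(
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{input_bucket}/*",
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


# Placeholder bucket name; replace it with your own.
print(json.dumps(bucket_access_policy("my-output-bucket"), indent=2))
```

The exact set of actions your setup needs may differ, so treat the statements as a starting point to adjust.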
Now go to the Lambda service in the AWS console and select ‘Create function’. Select ‘Author from scratch’, give the function a name, select ‘Python’ as the runtime, choose the role created in the previous step as the execution role and then select ‘Create function’.
Paste the function code into the editor and click on ‘Deploy’.
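A function of this kind could look roughly like the following. This is a minimal sketch, not the article's original code. It assumes the output bucket name is supplied through a hypothetical `OUTPUT_BUCKET` environment variable, lets Amazon Transcribe identify the language automatically, and requests `.srt` subtitles. The helper `build_job_request` is an illustrative name.

```python
import os
import re
import urllib.parse
import uuid


def build_job_request(event, output_bucket, suffix):
    """Turn an S3 'object created' event into StartTranscriptionJob arguments."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    # Job names may only contain letters, digits, '.', '_' and '-'.
    safe_name = re.sub(r"[^0-9a-zA-Z._-]", "_", key)[:150]
    return {
        "TranscriptionJobName": f"{safe_name}-{suffix}",
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "OutputBucketName": output_bucket,
        "IdentifyLanguage": True,  # let Transcribe detect the spoken language
        "Subtitles": {"Formats": ["srt"]},
    }


def lambda_handler(event, context):
    import boto3  # available in the Lambda Python runtime; lazy for local testing

    # A random suffix keeps job names unique when a file is uploaded twice.
    request = build_job_request(
        event, os.environ["OUTPUT_BUCKET"], uuid.uuid4().hex[:8]
    )
    boto3.client("transcribe").start_transcription_job(**request)
    return {"statusCode": 200, "body": request["TranscriptionJobName"]}
```

Keeping the request construction in a separate function makes it possible to check the generated job name and media URI locally, without calling AWS.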
Now let’s add the trigger that will invoke the Lambda function: in our case, an S3 trigger that fires when a file is uploaded to the input bucket.
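For readers who prefer to script this step, the trigger can be sketched with boto3 as below. The function names and the statement ID are hypothetical, and the sketch assumes the function already exists. Note that `put_bucket_notification_configuration` replaces any notification configuration already present on the bucket.

```python
def s3_trigger_config(function_arn, suffix=None):
    """Notification configuration that invokes the function on every upload."""
    config = {
        "LambdaFunctionArn": function_arn,
        "Events": ["s3:ObjectCreated:*"],
    }
    if suffix:
        # Optionally react only to keys ending in the given suffix, e.g. ".mp4".
        config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "suffix", "Value": suffix}]}
        }
    return {"LambdaFunctionConfigurations": [config]}


def attach_trigger(bucket, function_arn):
    """Allow S3 to invoke the function, then register the trigger (calls AWS)."""
    import boto3  # imported lazily so the helper above works without boto3

    boto3.client("lambda").add_permission(
        FunctionName=function_arn,
        StatementId="allow-s3-invoke",  # hypothetical statement ID
        Action="lambda:InvokeFunction",
        Principal="s3.amazonaws.com",
        SourceArn=f"arn:aws:s3:::{bucket}",
    )
    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=s3_trigger_config(function_arn),
    )
```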
Now let’s create the SNS topic that will allow us to receive an email when the transcription job is finished. In the AWS console, open Amazon SNS and select ‘Create topic’. Select ‘Standard’ as the type, give the topic a name and select ‘Create topic’. Once the topic has been created, add a subscription by selecting ‘Create subscription’: in ‘Topic ARN’ select the ARN of the newly created topic, choose ‘Email’ as the protocol, enter the email address where you want to receive notifications in ‘Endpoint’ and select ‘Create subscription’. You will receive an email asking you to confirm the subscription to the topic.
Everything is almost ready; only the trigger for the SNS topic is missing. In the AWS console select Amazon EventBridge and then Rules, choose ‘Create rule’, give the rule a name, make sure that ‘Rule with an event pattern’ is selected and continue to the next page. In ‘Event pattern’, paste a pattern that matches Amazon Transcribe job state changes, then continue to the next page, select ‘SNS topic’ as the target and choose the previously created topic. Continue through the remaining pages and create the rule.
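An event pattern for this rule could be generated as in the sketch below, which prints JSON that can be pasted into the ‘Event pattern’ box. It matches the ‘Transcribe Job State Change’ events that Amazon Transcribe sends to EventBridge; including `FAILED` alongside `COMPLETED` is an assumption, made so that failed jobs also produce an email.

```python
import json


def transcribe_event_pattern(statuses=("COMPLETED", "FAILED")):
    """Event pattern matching transcription jobs that reach the given statuses."""
    return {
        "source": ["aws.transcribe"],
        "detail-type": ["Transcribe Job State Change"],
        "detail": {"TranscriptionJobStatus": list(statuses)},
    }


print(json.dumps(transcribe_event_pattern(), indent=2))
```

Passing `statuses=("COMPLETED",)` would restrict the notification to successful jobs only.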
This way, every time we upload a file in one of the formats supported by Amazon Transcribe to our input S3 bucket, the upload triggers the Lambda function, which creates a transcription job. The job saves its results in the output bucket: in this case an .srt subtitle file and a .json file that contains the full transcript along with information about the transcription. When the job is finished, we receive an email notification.
This is just one example of how Amazon Transcribe can be used. The service can also transcribe in real time, create call analytics jobs and handle medical transcription, for example for clinical documentation, call analytics in pharmacovigilance and accessibility during telehealth sessions.