OpenAI and Java: Transcribe and Analyze Meeting Minutes

I have a new video available. It’s called Java-Powered OpenAI Tutorial: Transcribe & Analyze Meeting Minutes:

Transcribe and Analyze Meeting Minutes with OpenAI and Java

The overall idea is to reimplement the OpenAI tutorial found here in Java instead of Python. That allows me to work with Java parallel streams, text blocks, the HttpClient API, Java records, and more.
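To give a flavor of what that looks like, here is a minimal sketch of those Java features working together: a record, a text block holding a JSON payload, and the built-in HttpClient building a POST request. The class and field names are my own illustrations, not the actual code from the video's repository, and the request is only built here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ChatRequestDemo {
    // Hypothetical record modeling a chat request; illustrative only
    record ChatRequest(String model, String prompt) {}

    public static void main(String[] args) {
        // Text block (Java 15+) holding the JSON payload
        String body = """
                {
                  "model": "gpt-4",
                  "messages": [{"role": "user", "content": "Hello"}]
                }
                """;

        // HttpClient API (Java 11+) builds the request declaratively
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.openai.com/v1/chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + System.getenv("OPENAI_API_KEY"))
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        System.out.println(request.method() + " " + request.uri());
        // → POST https://api.openai.com/v1/chat/completions
    }
}
```

Sending the request is then a one-liner with `HttpClient.newHttpClient().send(...)`, which returns an `HttpResponse` whose body you can hand to a JSON parser.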

Here is a summary of the video in five bullet points, using the YouTube Summary plugin from Glasp, which works with ChatGPT (emphasis added):

  • The tutorial aims to transcribe an audio file into text using Java and then post-process the text with OpenAI’s GPT-4 model to create an automated meeting minutes generator. This is part of the “Tales from the Jar Side” series.
  • The OpenAI tutorial originally uses Python and has a limitation: the sample audio file exceeds the size limit allowed by the Whisper AI service for transcription. The Java implementation works around this by breaking the audio into smaller chunks, which has been previously discussed in another video.
  • The tutorial utilizes a map of prompts (Summary, Key Points, Action Items, and Sentiment Analysis) that are passed along with the transcribed text to GPT-4. These are processed sequentially, with each prompt generating its own response.
  • The responses are then saved into individual files for further analysis. This setup allows for future experimentation with parallel processing to potentially speed up the task.
  • Towards the end, there’s a note on the YouTuber’s goal to reach 1,000 subscribers and an overview of the code setup for transcribing the audio and interfacing with GPT-4. The video does not go into the process of converting the meeting minutes into a Microsoft Word document, considering it a “bridge too far.”

This summarizes the essence of the tutorial, touching upon the technical aspects as well as some of the limitations and goals.
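The prompt-map idea from the third bullet point can be sketched as follows. A `LinkedHashMap` keeps the four prompts in order, and each one is combined with the transcript and processed in turn. The `callGpt` method here is a hypothetical stand-in for the actual OpenAI call, so the flow runs offline; the prompt wording is illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PromptMapDemo {
    public static void main(String[] args) {
        String transcript = "transcribed meeting text goes here";

        // LinkedHashMap preserves insertion order for sequential processing
        Map<String, String> prompts = new LinkedHashMap<>();
        prompts.put("Summary", "Summarize the following text:");
        prompts.put("Key Points", "List the key points of the following text:");
        prompts.put("Action Items", "Extract the action items from the following text:");
        prompts.put("Sentiment Analysis", "Analyze the sentiment of the following text:");

        // Sequential processing: each prompt + transcript yields its own response
        prompts.forEach((name, prompt) -> {
            String response = callGpt(prompt + "\n" + transcript);
            System.out.println(name + " -> " + response.length() + " chars");
        });
    }

    // Placeholder for the real GPT-4 request; echoes the prompt for demo purposes
    private static String callGpt(String fullPrompt) {
        return "response for: " + fullPrompt;
    }
}
```

In the real version, each response string would then be written to its own file, one per prompt.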

That’s not a bad generated summary. My only quibble is that processing the GPT requests in parallel is arguably the best part of the experiment. Heck, the ability to easily use parallel streams might be reason enough for some people to choose Java in the first place. Despite what the generated bullet point says, saving the responses into files has nothing to do with that. In fact, I deliberately moved the file-saving step out of the stream processing to avoid any concurrency issues with the file system.
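That separation can be sketched like this: the GPT calls run inside a parallel stream, and only after the stream has finished collecting its results does the (sequential) file-saving loop run. Again, `callGpt` is a hypothetical stand-in for the real OpenAI request, and the "saving" here just prints what would be written.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ParallelPromptsDemo {
    public static void main(String[] args) {
        List<String> prompts =
                List.of("Summary", "Key Points", "Action Items", "Sentiment Analysis");

        // The GPT calls run concurrently via a parallel stream;
        // collecting into a Map gathers all results before anything is saved
        Map<String, String> responses = prompts.parallelStream()
                .collect(Collectors.toMap(Function.identity(), ParallelPromptsDemo::callGpt));

        // File saving happens AFTER the stream completes, sequentially,
        // so the file system never sees concurrent writes
        responses.forEach((name, response) ->
                System.out.println("would save " + name + ".txt (" + response.length() + " chars)"));
    }

    // Placeholder for the real GPT-4 request
    private static String callGpt(String prompt) {
        return "response to " + prompt;
    }
}
```

Because the terminal `collect` is a synchronization point, everything after it is back in plain single-threaded code, which is what makes moving the file I/O there safe.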

Here’s the GitHub repository for all the code, which includes the GPT-4 analysis. It’s the same repository I’ve used for all the non-Spring-Framework videos in that playlist. The Spring-related ones are in this repository instead.

As a production detail, I should mention that the thumbnail image was produced using the DALL-E 3 tool inside ChatGPT, from the following prompt:

“Photo of a modern robot, wearing headphones, transcribing audio from a computer screen that displays the OpenAI logo. Beside the robot is a pile of meeting minutes and a coffee mug with ‘Java Developer’ written on it.”

The new version of DALL-E is a huge improvement over the previous one, to the point where I’m now debating getting rid of my Midjourney subscription. You’ll notice it even rendered the text correctly, which is highly unusual for image generation tools. What you don’t see, however, is that when I tried to iterate, it stubbornly clung to existing images even while claiming it had changed them as requested. You really have to insist and iterate to get it to do what you want.

I hope you enjoy the video. I’m getting closer to that 1,000-subscriber mark, so if you’re willing to subscribe (it’s free, after all), please consider doing so. 😊
