It's Story Time, a Raspberry Pi journey of ML, publishing and podcasting
Table of Contents
- TL;DR
- The goal
- What were the pieces
- What went wrong
- What went right
- What did I learn
- Where to from here
TL;DR
Go check out my Story Time Podcast: a few stories are generated each day using TinyStories, converted to an HTML page and uploaded to GitHub Pages; after that, a process creates an MP3 using Piper and finally builds the RSS and Atom podcast feeds.
It all runs on a Raspberry Pi 4, the same one serving shhh bot.
Check out the website
Check out the podcast
Check out the code, it doesn’t include Piper, but that’s just the stock code.
Link to paper on TinyStories
Link to Piper Text to Speech engine.
Listen to the stories here
The goal
Reading is a big part of our family. We all read; even those too young to read will happily listen to their stories at bedtime, or any other time. Sometimes, when I’m a bit too tired, I dream that I could play them a podcast story instead. Our Google Home can tell stories whenever you ask it “Hey Google, tell me a story”, but eventually those run out (about 50 or so, from what I’ve heard). Often, I wonder what could be done in this space.
With the new generation of machine learning, is there a way to make endless stories? Could they be unique and fun, and have a moral grounding? Could I run it all myself on a Raspberry Pi, hidden in a corner somewhere?
What were the pieces
Hosting
I’ve always enjoyed using GitHub pages, this blog is hosted on it and it’s fun to see what you can achieve with a static front end and something more dynamic going on in the backend.
It’s a simple git push to get your website updated and GitHub will take care of the hosting, SSL and as a bonus it’s free.
GitHub Pages is backed by Jekyll, which uses Markdown, a really simple text format for writing HTML pages. Putting a # and a space at the start of a line gives you a header, for instance.
Mostly you would write these pages yourself and publish them when you have something new to share. I wanted some automation here, so I found some Python snippets and put together a small script to write a valid Markdown file.
```python
import argparse
from datetime import datetime

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Create a new post")
    parser.add_argument("--content", type=str, help="A filename for content for the post")
    parser.add_argument("--date", type=str, help="Provide a date in format yyyy-mm-dd", default="2000-01-01")
    parser.add_argument("--tags", type=str, help="Provide a comma separated list of tags", default="Story")
    parser.add_argument("--title", type=str, help="Provide a title", default="default")
    parser.add_argument("--author", type=str, help="Provide a name for the author of the post", default="Tony Mamacos")
    args = parser.parse_args()

    now = datetime.now()
    # Default the date to today when none was supplied on the command line
    datenow = str(datetime.today()).split()[0]
    if args.date == "2000-01-01":
        args.date = datenow
    current_time = now.strftime("%H:%M:%S")
    write_frontmatter(args, current_time)
```
I supply all the parameters on the command line and I’ll get a new post.
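The `write_frontmatter` function itself isn’t shown above. As a minimal sketch of what it might look like, assuming standard Jekyll conventions (the `_posts` filename pattern and the exact front matter fields here are my guesses, not the original code):

```python
from pathlib import Path

def write_frontmatter(args, current_time):
    """Write a Jekyll post with YAML front matter (hypothetical reconstruction)."""
    # Jekyll expects posts named _posts/YYYY-MM-DD-title.md
    slug = args.title.lower().replace(" ", "-")
    post = Path("_posts") / f"{args.date}-{slug}.md"
    post.parent.mkdir(exist_ok=True)
    tags = " ".join(t.strip() for t in args.tags.split(","))
    body = Path(args.content).read_text() if args.content else ""
    post.write_text(
        "---\n"
        "layout: post\n"
        f"title: {args.title}\n"
        f"date: {args.date} {current_time}\n"
        f"author: {args.author}\n"
        f"tags: {tags}\n"
        "---\n\n" + body
    )
    return post
```

The key part is the `---`-delimited YAML block at the top; Jekyll picks everything else up from the filename and the layout.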
Writing a Story
TinyStories is an amazing new Large Language Model (LLM) showing what is possible with a small model. It’s an LLM designed to write short stories, using a limited vocabulary and a small model size. The biggest version I could find was a 110 million parameter model; normally these models are measured in at least tens of billions of parameters for the ones you would want to run locally, and hundreds of billions for the ones you would want a larger company to run on your behalf.
Pairing this with llama.cpp, we have a small model we can run on a Pi 4.
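As a sketch of how the generation step could be wired up from Python: llama.cpp ships a command-line binary that takes a model (`-m`), a prompt (`-p`) and a token budget (`-n`). The binary and model paths below are placeholders, not the actual setup:

```python
import subprocess

def build_story_command(binary, model_path, prompt="Once upon a time", n_tokens=512):
    """Assemble a llama.cpp invocation; -m, -p and -n are its standard flags."""
    return [binary, "-m", model_path, "-p", prompt, "-n", str(n_tokens)]

def generate_story(binary, model_path, prompt="Once upon a time"):
    """Run the model and return its stdout as the story text."""
    result = subprocess.run(
        build_story_command(binary, model_path, prompt),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Seeding the prompt with “Once upon a time” plays to the model’s strengths, since that is how most of its training stories begin.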
Here’s a sample story
Once upon a time, there was a gray cat. The cat loved to eat popcorn. One day, the cat went to the kitchen to find some popcorn.
The cat found a big bowl of popcorn on the table. The cat was very happy. It started to eat the popcorn very fast. The cat wanted to finish the popcorn before it was all gone.
But then, the cat ate too much popcorn. It started to feel sick. The cat ate so much popcorn that it could not move. In the end, the gray cat was very sad and sick because it ate too much popcorn.
Text to Speech
Home Assistant is a powerful home automation system and it has gone from strength to strength this year. In my home we use it to control the lights, geyser, inverter and many more small tasks as they come up. This runs on a Raspberry Pi 3 I got before the drought. This year is the year of the voice, and as part of that they integrated Whisper for speech to text and Piper for text to speech.
Using Piper on a Pi 4, you can get faster than realtime text to speech. I’m only using a basic version of it: I supply the full text of what I want and it converts it for me.
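As a sketch, the conversion could shell out to the `piper` binary, which reads text on stdin and writes a WAV file via `--output_file`. The voice model name is one of Piper’s published voices, and the paths are placeholders:

```python
import subprocess

def build_piper_command(piper_binary, voice_model, wav_path):
    """Piper reads text on stdin and writes synthesized audio to --output_file."""
    return [piper_binary, "--model", voice_model, "--output_file", wav_path]

def synthesize(piper_binary, voice_model, text, wav_path):
    """Feed the story text to Piper and produce a WAV file."""
    subprocess.run(
        build_piper_command(piper_binary, voice_model, wav_path),
        input=text, text=True, check=True,
    )
```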
Podcast
Since podcasts use an RSS feed for aggregation, I thought this would be simple, but it’s actually a mishmash of the RSS and Atom specs, with specific values that have to be supplied, some just for Apple, Google or Spotify.
I found a useful library, python-feedgen, to help with this process, and generated a custom RSS and Atom feed for the website, just for podcasting. The website itself still has the RSS feed for the text versions of the stories.
I was pleasantly surprised that Apple was the clearest about what was missing and the first real integration I got working. Google was able to load my RSS feed without issue for my personal use, but wouldn’t load it into the directory. Spotify was the last to pass the test, once I had fixed the suggestions from Apple. A day after getting this all working over a weekend, Google and Spotify have not yet approved my podcast for easy consumption. I am hopeful it will get a review early next week during normal business hours.
If you’re ever wondering what is required in a podcast RSS feed, take a look at the code; it seems to be working.
```python
from feedgen.feed import FeedGenerator

def createPodcastRSS():
    fg = FeedGenerator()
    fg.load_extension("podcast")
    fg.title("AI Daily Short Story")
    fg.podcast.itunes_category("Technology", "Podcasting")
    fg.podcast.itunes_image("https://GitHub.com/tonym128/storytime/raw/main/ai_generated_stories_3k.png")
    fg.podcast.itunes_author("Tony Mamacos")
    fg.podcast.itunes_owner("Tony Mamacos", "tmamacos@gmail.com")
    fg.image("https://GitHub.com/tonym128/storytime/raw/main/ai_generated_stories.png")
    fg.author({"name": "Tony Mamacos", "email": "tmamacos@gmail.com"})
    fg.language("en")
    fg.id("https://ttech.mamacos.media/storytime")
    fg.link(href="https://ttech.mamacos.media/storytime", rel="self")
    fg.description("A daily podcast of an AI generated short story")
```
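The snippet above only sets up the channel; each episode still needs an entry with an enclosure pointing at its MP3. A sketch of that step, assuming the same feedgen object; the base URL is a placeholder and `episode_enclosure` is my own helper, not part of the library:

```python
import os

def episode_enclosure(base_url, mp3_name, size_bytes):
    """The three enclosure fields podcast apps require: URL, byte length and MIME type."""
    return {
        "url": f"{base_url}/{mp3_name}",
        "length": str(size_bytes),
        "type": "audio/mpeg",
    }

def add_episode(fg, base_url, title, mp3_path):
    """Append one story to the feed and point its enclosure at the MP3 file."""
    fe = fg.add_entry()
    fe.id(f"{base_url}/{os.path.basename(mp3_path)}")
    fe.title(title)
    fe.description(title)
    fe.enclosure(**episode_enclosure(base_url, os.path.basename(mp3_path),
                                     os.path.getsize(mp3_path)))
    return fe

# Once all episodes are added, feedgen can write both feed files:
#   fg.rss_file("podcast.xml")
#   fg.atom_file("atom.xml")
```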
Linking it together
I have a systemd service set up to run a bash script, which generates a new story a few times a day.
Once it has generated the story and created a new post, a second process grabs the text for all the stories, puts it into files in the Piper runtime directory and creates text-to-speech WAV files for any text file which doesn’t have a corresponding WAV file. After this, it runs another process to convert each WAV to an MP3 and puts it into the StoryTime repository.
The third process recreates the podcast RSS and Atom files with the correct fields, and finally it updates the repository with the new story post, podcast feeds and audio files, which it uploads.
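The WAV-to-MP3 step in that pipeline isn’t shown here; one way it could be done, as a sketch, is with ffmpeg’s libmp3lame encoder (the paths and the quality setting are placeholders):

```python
import subprocess

def build_mp3_command(wav_path, mp3_path):
    """ffmpeg invocation: -y overwrites existing output, -qscale:a 4 is a mid-range VBR quality."""
    return ["ffmpeg", "-y", "-i", wav_path,
            "-codec:a", "libmp3lame", "-qscale:a", "4", mp3_path]

def convert_to_mp3(wav_path, mp3_path):
    """Transcode one Piper WAV file into an MP3 suitable for the podcast feed."""
    subprocess.run(build_mp3_command(wav_path, mp3_path), check=True)
```

MP3 rather than WAV matters here because podcast enclosures are downloaded in full, and a minute-long WAV is roughly ten times the size of the equivalent MP3.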
What went wrong
I really struggled with the podcast feed generation and with making feed files that applications would accept. Finding a library to help me generate the files was a big help, and having Apple tell me what was missing was super helpful. This isn’t something people would normally encounter when using a dedicated podcast service: they would upload their audio files, and the feed would be automatically generated by the underlying service provider and even potentially propagated to all the different podcast directories.
I believe I’m waiting on human validation of the source material for Google and Spotify, but hopefully this will pass muster on their sites.
Another issue I had was with linking the audio assets. I thought I would be able to serve them from my domain and GitHub Pages, but I struggled with that, so I have linked them directly to the GitHub repository, as has the podcast feed, which has worked without issue so far. For the podcast alone, I imagine I could just as easily host them from an S3 storage bucket with HTTP access enabled, and my application could upload them directly there.
What went right
Story generation with TinyStories was super fun. I have read hundreds of the stories myself, and while I don’t prompt for the website, I did experiment for a bit with sending through the first line of text and having TinyStories try to work with it. It was interesting to see how it dealt with difficult concepts or words that weren’t in its vocabulary.
Text to speech was blissful: Piper worked like a dream and I even stuck with the default voice, which has quite an interesting accent. I am hoping people will feel relaxed listening to a minute-long story which sometimes gets really close to hitting the mark.
The GitHub Pages side took a bit of fiddling to automate, but overall went really smoothly, and I was glad to have a website up and running so quickly.
What did I learn
The TinyStories LLM was fun to play with, and seeing it produce a story on a Pi 4 in under a minute was great. The text to speech engine, Piper, runs just as fast, so all told this could produce a massive number of stories a day, but I think two to four a day makes it something I will be happy to have in my podcast feed.
Hosting a podcast was an interesting experience; it was all new to me. I wasn’t aware of all the service providers for putting up a podcast, or of their fees. Having to host the files yourself was something I had imagined, but all this infrastructure and the paid services for the privilege were something I was unaware of.
Piper is already a big part of my Home Assistant setup, and it was great to get it running locally and independently for my own project.
Where to from here
I could tweak the pages some more, but I am pretty happy with the outcome. I had imagined that a custom story generator, with feeds for specific people, children’s names and themes for the stories, would be amazing, but my experimenting with TinyStories showed me it has a relatively small set of names, and the stories quite often fall flat by the end even when they have a lot of promise. Thematically, there’s a very consistent drive, which can make the stories a little too predictable, though sometimes it comes up with a story that very pleasantly surprises me. It is a very small model, so I have no issue with these limitations, but expanding the service might need a bigger model.
I hope you enjoyed this blog and have some fun listening to the stories, maybe even subscribing to the podcast. I don’t see myself shutting it down anytime soon, so hopefully there’ll be a few hundred stories there forever.