The World Cup is upon us and if you’re lucky, you’ve got Kwesé iflix on your mobile device, ready to stream every single match.
Live streaming is a lot like having a baby. Yeah, I said it. And if you’ll allow me, I can explain.
Recently I asked Belal, iflix’s amazing software engineer, to write an article for this blog to explain how live streaming works. You see, iflix has had tremendous success with live streaming, especially with the broadcasts of the Indonesian Liga 1, the Mayweather vs McGregor fight, the BTS concert, T10 and T20 cricket, and most recently, with Football Malaysia on iflix. Considering that live streaming was, in our case, record-breakingly popular, I figured a deep dive into the tech would make for a wonderful “behind the scenes” look into the company’s operations.
Accommodating and enthusiastic, Belal agreed and delivered a blog post which, when it came across my desk, looked like this.
I read the first paragraph. Didn’t get it. I read the second paragraph. Definitely didn’t get that. I skimmed the third paragraph and realised I was getting nowhere, and I gave up. Technical articles proved…too technical. Belal would have to explain.
So I sat down with Belal to discuss live streaming, hoping that he’d be able to teach me a thing or two, and then I would make an attempt at demystifying live streaming for the masses. That is how, after a 4-hour discussion, I came to the conclusion that live streaming is a lot like having a baby: it’s complicated, painful, and a thousand little things have to come together to create the miracle of life/live.
Let’s say an iflix user wanted to live stream a football game happening in X, when he was located in Y. Let’s also assume those two places are geographically really far from each other.
Unfortunately, it’s not as simple as: Camera captures action in X, and by some miracle of the internet, it magically makes its way to a mobile device of an iflix user in Y. There are approximately 56* other steps in between.
*56 may be a slight exaggeration, but let’s not let facts get in the way of a great story.
Here’s some background info:
Sports cameras shoot in HD. Let’s call that the source. That digital footage, over a 90-minute football game, makes for a HUGE amount of data. No bueno for anything you want to move across the world in a matter of seconds. That’s why all footage has to be encoded. As Belal explained this, I nodded. That’s not so hard. I can understand that.
When it’s encoded, the footage makes its way to MediaLive, a video-processing service from Amazon Web Services that lives in the cloud. I was still following, but already my mind was wandering – how did the encoded footage get to MediaLive?
Turns out, it’s through something called protocols. Internet niceties, if you will. There are apparently four of them that iflix uses and Belal and I went into great detail about how each of the four types worked. Information overload. I panicked. I suppose the only thing important to know for our purposes is that the source determines what protocol is used to get the encoded footage to MediaLive, you follow? Good.
Let’s assume all goes well and the encoded footage makes it to MediaLive via one of those four protocols successfully. This is when transcoding happens. Pause for keyword explanation. Transcoding is a fancy way of saying changing huge video files into less-HD versions to cater to slower internet connections, typically on smaller (read: mobile) screens.
Once again, Belal and I went into the nitty gritty of how transcoding happens, but my brain was not able to process the intricate details, so we moved on by assuming transcoding works without a hitch. Phew! iflix apparently creates four profiles by transcoding the encoded footage. They come in 240p, 360p, 480p, and 720p resolutions, to accommodate various network speeds.
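If it helps to picture those four profiles, here’s a rough sketch of them as a “bitrate ladder”. The resolutions are the real ones from above; the kbps figures are my own illustrative guesses, not iflix’s actual encoder settings.

```python
# The four transcoded profiles, sketched as a simple "bitrate ladder".
# Resolutions come from the article; the kbps figures are ASSUMED,
# illustrative values, not iflix's real encoder settings.
PROFILES = [
    {"name": "240p", "height": 240, "bitrate_kbps": 400},
    {"name": "360p", "height": 360, "bitrate_kbps": 800},
    {"name": "480p", "height": 480, "bitrate_kbps": 1400},
    {"name": "720p", "height": 720, "bitrate_kbps": 2800},
]

def ladder_summary(profiles):
    """One line per profile, lowest to highest quality."""
    return [f'{p["name"]} @ {p["bitrate_kbps"]} kbps' for p in profiles]
```

Lower rungs of the ladder cost less bandwidth; higher rungs look better. The point of having all four is that every viewer, from spotty mobile data to solid Wi-Fi, gets *something* watchable.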
Once transcoded, these four profiles are packaged, using the aptly named MediaPackage (also part of Amazon Web Services) and sent. Where you might ask? Well, let’s step back for a second and consider how this looks on an iflix user’s phone.
Belal explained: “The Player on the iflix app gets the manifest and orchestrates which version of the feed the Player should pull…” WHOA WHOA WHOA. Hold up. Manifest? Feed? Pulling? What? Belal laughed. I think he secretly enjoyed my panic.
Belal slowed down. “Let’s say Hussein clicks PLAY.” He used Hussein in his example, because in that very instant, Hussein, iflix’s Ad Tech – Product Manager, joined us, presumably to check why I was turning blue in the face. If we’re going with that having-a-baby analogy, I was feeling contractions. The pain was real. We were 4 hours into labour. “In order for the video to play, he needs a manifest, which he gets from the CDN…” If by now you’re sitting there thinking, what the eff is a CDN? And a manifest? I’m glad you asked because I’m clearly not the only one who didn’t know.
A CDN is a Content Delivery Network. It holds the video files that were packaged by MediaPackage. The CDN wasn’t in X. It was closer to Y, closer to the end user, so that when Hussein demanded the files by clicking PLAY, they would get to him quicker. Cool.
It’s important to note that for live streaming, each video file is a 6-second snippet, not one massive file. Live video is packaged in 6-second intervals and delivered with a 6-second delay to the user. If my maths was any good, there were ten 6-second snippets in each minute. 90-minute games meant at least 900 snippets. iflix made four different video profiles available when the HD source was transcoded, meaning the CDN carried 3,600 video snippets per regulation-length match with no overtime. AND THAT’S JUST ONE GAME! Belal nodded. Phew! I was still getting it.
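For the arithmetically inclined, that snippet count checks out in a few lines of Python:

```python
# Segment count for one 90-minute match, following the arithmetic above.
SEGMENT_SECONDS = 6
MATCH_MINUTES = 90
PROFILE_COUNT = 4  # 240p, 360p, 480p, 720p

segments_per_minute = 60 // SEGMENT_SECONDS                 # ten 6-second snippets a minute
segments_per_profile = MATCH_MINUTES * segments_per_minute  # 900 snippets per match
total_segments = segments_per_profile * PROFILE_COUNT       # across all four profiles

print(total_segments)  # → 3600
```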
Now, back to Hussein. When he presses PLAY in Y, his iflix Player measures his network speed and requests the live football at the best possible quality (read: bitrate) his connection can handle. The manifest tells the Player where it can find the relevant video files in the CDN, and the Player downloads them from there.
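Here’s a toy sketch of that “best quality for your speed” decision. The resolutions are iflix’s four; the bitrates are my assumptions, and a real player is much fancier (it also weighs buffer health and throughput history, and re-decides constantly).

```python
# Toy adaptive-bitrate choice: pick the highest-bitrate profile that
# fits the measured bandwidth. Bitrate figures are ASSUMED values.
PROFILES = [
    ("240p", 400),    # (name, bitrate in kbps)
    ("360p", 800),
    ("480p", 1400),
    ("720p", 2800),
]

def pick_profile(bandwidth_kbps):
    """Return the best profile that fits, or the lowest rung if none do."""
    playable = [p for p in PROFILES if p[1] <= bandwidth_kbps]
    return max(playable, key=lambda p: p[1]) if playable else PROFILES[0]
```

So a viewer on a 3,000 kbps connection would get the 720p feed, while someone limping along at 1,000 kbps would get 360p – and nobody gets nothing.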
But how do video files get to the CDN? Belal explained: They get there only if someone requests them. The request only has to happen once. If Hussein happened to be the first person to request these files, the CDN would get these video files from MediaPackage. Once the video files were in the CDN, they were available to anyone else requesting them without needing to trace back to MediaPackage. Magic! But why didn’t the Player just get the files directly from MediaPackage? Why the CDN? The CDN improved the distribution channels, taking the heat off MediaPackage so it wouldn’t be overwhelmed. By design, MediaPackage is not intended to handle tens of thousands of concurrent requests. The CDN has no such restriction, so it steps in to serve this traffic. Thanks CDN, you trooper!
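That “fetch once, serve everyone” behaviour is a classic caching pattern, and it can be sketched in a few lines. This is a toy model, not the CDN’s real machinery, and the segment name is made up for illustration.

```python
# Toy cache model of the CDN: the FIRST request for a segment misses
# and is fetched from the origin (standing in for MediaPackage); every
# later request for the same segment is served from the CDN's cache.
class ToyCDN:
    def __init__(self, origin):
        self.origin = origin       # dict of segment name -> bytes
        self.cache = {}
        self.origin_fetches = 0    # how often we had to bother the origin

    def get(self, segment_name):
        if segment_name not in self.cache:
            # Cache miss: trace back to the origin, exactly once per segment.
            self.cache[segment_name] = self.origin[segment_name]
            self.origin_fetches += 1
        return self.cache[segment_name]

# "seg_0001.ts" is a hypothetical segment name.
origin = {"seg_0001.ts": b"6 seconds of football"}
cdn = ToyCDN(origin)
cdn.get("seg_0001.ts")     # first viewer: CDN fetches from the origin
cdn.get("seg_0001.ts")     # every later viewer: served from the CDN cache
print(cdn.origin_fetches)  # → 1
```

Ten thousand viewers, one trip back to MediaPackage per segment. That’s the heat-taking-off in a nutshell.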
- Camera records – the source!
- Footage is encoded and transported to MediaLive servers via protocols AKA the Internet.
- Once there, the footage is transcoded into 4 profiles which are packaged.
- When a user wants to watch that video, the Player gets a manifest and requests the video files from the CDN. If the CDN has them, great, the series of 6-second videos is downloaded to the user’s Player. If the CDN doesn’t have them yet, it asks MediaPackage for them.
It would have been great if that were the whole story, but alas, it’s not so simple – I just simplified it.
As you can imagine, with live streaming, like childbirth, things can – and do – go wrong. That’s why it’s imperative that there’s a stable internet connection and a team on standby ready to fix things on the fly, even if it means catering to obscene timezone differences between X and Y. It’s the things we do for love.
And as with a pregnancy, if it all comes together, glitch free, iflix goes live.