android - How to record video with AR effects and save it in flutter?

深蓝 · Answer 1 · 2021-01-29T04:52:35+0000

Not The Easiest Question To Answer, But...

I'll give it a go and see how things turn out. First of all, you should add more detail to the statements in your question, especially this one:

I know there are packages like AR core and flutter_camera_ml_vision but these are not helping me.

How did you approach the problem, and what makes you say these packages didn't help?

In the Beginning...

First, let's get some basics out of the way, to better understand your current situation and your level in the prerequisite areas of knowledge:

Do you have any experience using Computer Vision & Machine Learning frameworks in other languages / in other apps?
Do you have the required math skills needed to use this technology?
As you're using Flutter, my guess is that cross-platform compatibility is a high priority. Have you done much Flutter programming before, and which devices are your main targets?

So, what is required to create a Snapchat-like filter for live video recording?

Well, quite a lot of work happens behind the scenes when you apply a filter to live video using any app that implements this in a decent way.

Snapchat uses in-house software built up over years. It combines their own work with technology from several multi-million dollar acquisitions, often of established companies that specialized in Computer Vision and AR, and it has grown steadily more impressive over the last 5-6 years in particular.

This isn't something you can throw together by yourself in an all-nighter and expect good results. There are tools that ease the general learning curve, but they also require a firm understanding of the underlying concepts and technologies, and quite a lot of math.

The Technical Detour

OK, I know I may have gone a bit overboard here, but these are fundamental building blocks. Not many people are aware of how much computation goes into seemingly "basic" functionality, so TL;DR or not, please bear with me.

To create a good filter for live capture with the camera on something like an iPhone or Android device, you could, and most probably would, use AR, as you mentioned. But realize that AR builds on the broad field of Computer Vision (CV), which uses various algorithms from Artificial Intelligence (AI) and Machine Learning (ML) for these main tasks:

1. Face Detection (often loosely called "facial recognition"): Given frames of video from the live camera, find the area containing a human face (some implementations also work with animals, but let's keep it as simple as possible) and output a bounding rectangle (x, y, width, height) to use as a starting point.

The analysis phase alone requires a rather complex combination of algorithms and techniques from different parts of the AI universe. And since this is video, not a single static image, the result must be updated continuously as the person or camera moves, so it has to run in close to real time: at 30 frames per second, that is a budget of about 33 milliseconds per frame.

I believe implementations combining HOG (Histogram of Oriented Gradients) from Computer Vision with SVMs (Support Vector Machines) from Machine Learning are still pretty common.

2. Detection of Facial Landmarks: Also called "facial keypoint detection", "facial feature detection" and other variants in the literature on the subject. This determines how well a given effect / filter adapts to different facial features and copes with accessories like glasses, hats etc.

3. Head Pose Estimation: Once you know a few landmark points, you can also estimate the pose of the head. This is an important part of effects like "face swap", where one face has to be re-aligned with another in an acceptable manner. A toolkit like OpenFace (Uses Python, OpenCV, OpenBLAS, Dlib ++) contains a lot of useful functionality: it is capable of facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation, and delivers pretty decent results.

4. Compositing of Effects into the Video Frames: After the work above is done, the rest involves applying the target filter (dog ears, rabbit teeth, whatever) to the video frames using compositing techniques. As this answer is starting to look more like an article, I'll leave it to you to dig into the details of this part of the process.
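To make the last two steps a little more concrete, here is a minimal sketch in plain Python (no OpenCV, no Flutter). It is my own illustration, not how Snapchat or any particular library does it. It assumes the landmark step has already given you two eye positions, estimates only the in-plane rotation (roll) of the head from them, and alpha-blends a small RGBA "sticker" onto an RGB frame. A full head pose also needs yaw and pitch, which this deliberately ignores. All function names and the `reference_eye_distance` parameter are hypothetical.

```python
import math


def roll_and_scale(left_eye, right_eye, reference_eye_distance):
    """Estimate head roll (degrees) and a size factor from two eye landmarks.

    Landmarks are (x, y) pixel coordinates, with y growing downward.
    reference_eye_distance is a hypothetical parameter: the eye distance
    (in pixels) that the overlay artwork was drawn for.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    roll_degrees = math.degrees(math.atan2(dy, dx))
    scale = math.hypot(dx, dy) / reference_eye_distance
    return roll_degrees, scale


def alpha_blend(background, overlay, alpha):
    """Blend one RGB overlay pixel onto one RGB background pixel.

    alpha is in [0, 1]: 0 keeps the background, 1 keeps the overlay.
    """
    return tuple(
        round(alpha * o + (1.0 - alpha) * b)
        for o, b in zip(overlay, background)
    )


def composite(frame, sticker, top, left):
    """Return a copy of frame with an RGBA sticker pasted at (top, left).

    frame is a list of rows of (r, g, b) tuples; sticker is a list of rows
    of (r, g, b, a) tuples with a in 0..255. Sticker pixels that fall
    outside the frame are skipped.
    """
    out = [list(row) for row in frame]
    for sy, sticker_row in enumerate(sticker):
        for sx, (r, g, b, a) in enumerate(sticker_row):
            y, x = top + sy, left + sx
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = alpha_blend(out[y][x], (r, g, b), a / 255.0)
    return out
```

In a real filter you would rotate and resize the sticker by the estimated roll and scale before compositing it, and you would do all of this for every frame inside the time budget mentioned above, which is why production apps run these steps in optimized native or GPU code rather than in per-pixel loops like these.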

Hey, Dude. I asked for AR in Flutter, remember?

Yep. I know, I can get a bit carried away. Well, my point is that it takes a lot more than one would usually imagine to create something like you ask for.

BUT. My best advice, if Flutter is your tool of choice, would be to learn how to use Google's ML services: Firebase Machine Learning (cloud-based) and ML Kit (on-device APIs, including face detection).

Add to this some AR specific plugins, like the ARCore Plugin, and I'm sure you'll be able to get the pieces together if you have the right background and attitude, plus a good amount of aptitude for learning.

Hope this didn't digress too far from your core question, but I don't know of any shortcuts that cut more corners than the ones I've already mentioned.
