Vision Framework for Face Landmarks Detection Using Xamarin.iOS

Mobile devices are getting better and better at solving sophisticated tasks. This is not only because of better hardware, but also thanks to modern trends towards AI: tasks such as face detection, barcode recognition, rectangle detection, and text recognition are now supported at the operating system level, making it really simple to solve them in your app. Here I am going to show how to detect face landmarks in real time using the Vision framework. The demo app that we’re going to build here is also available on GitHub.


The first thing to do is to configure an instance of AVCaptureSession to capture the video stream from the front camera. We’re going to direct the stream to

  1. AVCaptureVideoPreviewLayer to preview it on the screen
  2. AVCaptureVideoDataOutput to perform the face landmarks detection

Let’s start with a small helper property to get the front camera AVCaptureDevice. We’re using an AVCaptureDeviceDiscoverySession, specifying that we’re interested in the front camera.
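Such a helper might look like the following sketch (the exact `AVCaptureDeviceDiscoverySession.Create` overload and media-type constant vary slightly between Xamarin.iOS versions, so treat the details as assumptions):

```csharp
using System.Linq;
using AVFoundation;

// Returns the first front-facing wide-angle camera, or null if none is available.
AVCaptureDevice FrontCamera =>
    AVCaptureDeviceDiscoverySession
        .Create(
            new[] { AVCaptureDeviceType.BuiltInWideAngleCamera },
            AVMediaType.Video,
            AVCaptureDevicePosition.Front)
        .Devices
        .FirstOrDefault();
```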

Now the AVCaptureSession itself.
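A sketch of how that configuration might look, assuming the front-camera helper property from the previous step (named FrontCamera here) and field names of my own choosing:

```csharp
using AVFoundation;
using CoreFoundation;

AVCaptureSession captureSession;
OutputRecorder outputRecorder;
DispatchQueue queue;

void SetupCaptureSession()
{
    captureSession = new AVCaptureSession();

    // Feed the session from the front camera.
    captureSession.AddInput(AVCaptureDeviceInput.FromDevice(FrontCamera));

    // Drop frames that arrive while the delegate is still busy with the previous one.
    var output = new AVCaptureVideoDataOutput
    {
        AlwaysDiscardsLateVideoFrames = true
    };
    outputRecorder = new OutputRecorder();
    queue = new DispatchQueue("video-frames");
    output.SetSampleBufferDelegateQueue(outputRecorder, queue);
    captureSession.AddOutput(output);

    // Preview the stream on the screen.
    var previewLayer = new AVCaptureVideoPreviewLayer(captureSession)
    {
        Frame = View.Bounds,
        VideoGravity = AVLayerVideoGravity.ResizeAspectFill
    };
    View.Layer.AddSublayer(previewLayer);

    captureSession.StartRunning();
}
```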

Here we’re initializing the capture session by adding instances of the AVCaptureDeviceInput and AVCaptureVideoDataOutput classes. We’re setting AlwaysDiscardsLateVideoFrames to true to save some memory (it’s true by default, but let’s make it explicit). Also important here is the OutputRecorder – our implementation of IAVCaptureVideoDataOutputSampleBufferDelegate, which will do the face landmarks detection.

VNSequenceRequestHandler and VNDetectFaceLandmarksRequest

At this point, we have the configured AVCaptureSession and we’re ready to process the output to detect face landmarks. To do this, let’s override the DidOutputSampleBuffer method.
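A minimal sketch of the OutputRecorder with that override (error handling omitted; the pixel-buffer cast is an assumption about the stream format):

```csharp
using AVFoundation;
using CoreImage;
using CoreMedia;
using CoreVideo;

public class OutputRecorder : AVCaptureVideoDataOutputSampleBufferDelegate
{
    public override void DidOutputSampleBuffer(AVCaptureOutput captureOutput,
        CMSampleBuffer sampleBuffer, AVCaptureConnection connection)
    {
        try
        {
            using (var imageBuffer = sampleBuffer.GetImageBuffer())
            using (var ciImage = CIImage.FromImageBuffer(imageBuffer as CVPixelBuffer))
            {
                // DetectFaceLandmarks is covered in the next section.
                DetectFaceLandmarks(ciImage);
            }
        }
        finally
        {
            // Sample buffers come from a small pool; not disposing
            // them stalls the capture session very quickly.
            sampleBuffer.Dispose();
        }
    }
}
```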

The method is called every time new frames are captured. We’re creating a CIImage and passing it to the DetectFaceLandmarks method, which will use the Vision framework to detect face landmarks and draw them on the overlay layer. Note that we need to properly dispose of all objects, otherwise the app becomes unresponsive very quickly.
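DetectFaceLandmarks might be sketched like this; the set of landmark regions drawn here and the DrawLandmark signature are illustrative assumptions:

```csharp
using CoreFoundation;
using CoreImage;
using Foundation;
using Vision;

readonly VNSequenceRequestHandler sequenceRequestHandler = new VNSequenceRequestHandler();

void DetectFaceLandmarks(CIImage image)
{
    using (var request = new VNDetectFaceLandmarksRequest((req, error) =>
    {
        if (error != null)
            return;

        var observations = req.GetResults<VNFaceObservation>();

        // Drawing has to happen on the UI thread.
        DispatchQueue.MainQueue.DispatchAsync(() =>
        {
            foreach (var observation in observations)
            {
                var landmarks = observation.Landmarks;
                DrawLandmark(landmarks.FaceContour, observation.BoundingBox);
                DrawLandmark(landmarks.LeftEye, observation.BoundingBox);
                DrawLandmark(landmarks.RightEye, observation.BoundingBox);
                DrawLandmark(landmarks.OuterLips, observation.BoundingBox);
            }
        });
    }))
    {
        sequenceRequestHandler.Perform(new VNRequest[] { request }, image, out NSError _);
    }
}
```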

The method is quite simple. First, we create a new VNDetectFaceLandmarksRequest, specifying a handler which will iterate through all results and draw them (note that we’re doing the drawing on the UI thread). And second, we’re using the VNSequenceRequestHandler to perform the request on the CIImage from the previous step.

And lastly, the DrawLandmark method:
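A sketch of what it might look like; overlayLayer is an assumed CALayer placed over the preview, and the coordinate math ignores aspect-fill cropping for brevity:

```csharp
using System.Linq;
using CoreAnimation;
using CoreGraphics;
using UIKit;
using Vision;

// Hypothetical overlay layer sitting on top of the camera preview.
CALayer overlayLayer;

void DrawLandmark(VNFaceLandmarkRegion2D region, CGRect faceBoundingBox)
{
    if (region == null)
        return;

    var width = overlayLayer.Bounds.Width;
    var height = overlayLayer.Bounds.Height;

    // Landmark points are normalized to the face bounding box, and the bounding
    // box is normalized to the image (origin in the bottom-left corner), so we
    // scale everything up to the overlay layer and flip the Y axis.
    var points = region.NormalizedPoints
        .Select(p => new CGPoint(
            (faceBoundingBox.X + p.X * faceBoundingBox.Width) * width,
            (1 - (faceBoundingBox.Y + p.Y * faceBoundingBox.Height)) * height))
        .ToArray();

    var path = new UIBezierPath();
    path.MoveTo(points[0]);
    foreach (var point in points.Skip(1))
        path.AddLineTo(point);

    overlayLayer.AddSublayer(new CAShapeLayer
    {
        Path = path.CGPath,
        StrokeColor = UIColor.Yellow.CGColor,
        FillColor = UIColor.Clear.CGColor,
        LineWidth = 2
    });
}
```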

Since the Vision framework returns normalized landmark points, we’re transforming them to screen coordinates before drawing. The rest of the code is just about adding a new CAShapeLayer with the drawn line.


Here I showed you how simple it is to perform such a complex task as facial landmarks detection. If you’re creating your own app that uses this feature, don’t forget to add an NSCameraUsageDescription to your Info.plist. Also, keep in mind that the Vision framework is available on iOS 11+. Happy coding!
