LiDAR data in ARKit 3.5
July 26, 2020
One of the many interesting projects we are currently working on at SABO Mobile IT is an iOS app based on ARKit, built for our long-time client Audi in cooperation with NavVis. We gather information about the future/planned locations of real objects and render it in augmented reality, positioned as precisely as ARKit allows. We have encountered multiple technical challenges along the way, most of them more or less related to positioning. The considerable inaccuracy of ARKit has been the biggest issue, and it has required many hours of research and experiments to carry out the project.

We have been trying out a new iPad Pro with LiDAR and ARKit 3.5. We were keen to learn what kind of data ARKit can provide us and what exactly has changed since the last version. I would like to share the results with you.

ARKit 3.5

Anchors

ARKit solves the mapping of virtual objects to real-world surfaces via anchors. Every anchor carries a piece of information about its transform (position, orientation, scale) in the virtual 3D world space. Using this information, we can render our entities at a well-fitting position, orientation, and scale, so that, when rendered over the video stream coming from the camera, they look like they are really there.
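To make this concrete, here is a minimal sketch (the AnchorObserver type and the print statement are purely illustrative) of reading an anchor's transform when the session reports it; the last column of the 4x4 matrix holds the world-space translation:

{% c-block language="swift" %}
import ARKit

/* Illustrative example: observe anchors the session adds and read their transforms */
final class AnchorObserver: NSObject, ARSessionDelegate {
    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        for anchor in anchors {
            /* transform is a simd_float4x4 in world space; column 3 is the translation */
            let position = anchor.transform.columns.3
            print("Anchor \(anchor.identifier) at x: \(position.x), y: \(position.y), z: \(position.z)")
        }
    }
}
{% c-block-end %}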

The first version only allowed developers to put objects on horizontal planes. Step by step, the API has been extended, and now (in the last major release at the time of writing, ARKit 3) we have 4 types of anchors: ARPlaneAnchor (vertical and horizontal planes), ARImageAnchor (pre-trained image), ARObjectAnchor (pre-trained 3D object), and ARFaceAnchor (human face).

ARKit 3.5 introduced a new type of anchor: ARMeshAnchor. As you may have already guessed from its name, ARMeshAnchor carries more than just a transform: by collecting data from the LiDAR sensor, it also provides information about the geometry of its surroundings.
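ARMeshAnchors only show up when scene reconstruction is enabled on a LiDAR-equipped device. A minimal configuration sketch (checking for support first, since other devices will report none):

{% c-block language="swift" %}
import ARKit

func makeMeshConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    /* Only LiDAR-equipped devices support scene reconstruction */
    if ARWorldTrackingConfiguration.supportsSceneReconstruction(.meshWithClassification) {
        /* .mesh provides geometry only; .meshWithClassification also fills the classification data */
        configuration.sceneReconstruction = .meshWithClassification
    }
    return configuration
}
{% c-block-end %}

Once the session runs with this configuration, it starts delivering ARMeshAnchors alongside the other anchor types.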

Raw data

Accessing the geometry provided by ARMeshAnchor is done via the var geometry: ARMeshGeometry { get } property.
Let's now look more closely at the new structure ARMeshGeometry (https://developer.apple.com/documentation/arkit/armeshgeometry):

{% c-block language="swift" %}
/**
 A three-dimensional shape that represents the geometry of a mesh.
 */
@available(iOS 13.4, *)
open class ARMeshGeometry : NSObject, NSSecureCoding {

    /**
     The vertices of the mesh.
     */
    open var vertices: ARGeometrySource { get }

    /**
     The normals of the mesh.
     */
    open var normals: ARGeometrySource { get }

    /**
     A list of all faces in the mesh.
     */
    open var faces: ARGeometryElement { get }

    /**
     Classification for each face in the mesh.
     */
    open var classification: ARGeometrySource? { get }
}
{% c-block-end %}
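Before looking at the individual properties, here is a quick sketch of how these anchors can be reached at runtime: every mesh anchor the session currently tracks appears in the frame's anchors array, and its geometry property exposes the structure above (the function name is just illustrative):

{% c-block language="swift" %}
import ARKit

/* Illustrative helper: list the reconstructed mesh anchors and their sizes */
func dumpMeshAnchors(of session: ARSession) {
    guard let frame = session.currentFrame else { return }
    let meshAnchors = frame.anchors.compactMap { $0 as? ARMeshAnchor }
    for anchor in meshAnchors {
        let geometry = anchor.geometry
        print("Mesh \(anchor.identifier): \(geometry.vertices.count) vertices, \(geometry.faces.count) faces")
    }
}
{% c-block-end %}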

Vertices, normals, and classification are represented by a new class, ARGeometrySource. The Apple documentation describes it as mesh data in a buffer-based array, where the type of the data in the buffer is described by an MTLVertexFormat.

So ARGeometrySource points to an MTLBuffer holding an array of count vectors. Each vector in turn occupies a fixed number of bytes (stride), and its layout is described by format.

By using a cut-and-try method, I found out that for vertices and normals the format is MTLVertexFormat.float3 (i.e., 3 floats representing the X, Y, and Z components of the vertex / vector, respectively).

The format for classification is MTLVertexFormat.uchar, which represents the raw value of the ARMeshClassification enumeration.
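Under these assumptions, reading a single value boils down to pointer arithmetic with offset and stride. A sketch of two helper accessors (hypothetical names; the vertex reader assumes the float3 layout observed above, the classification reader the uchar layout):

{% c-block language="swift" %}
import ARKit

extension ARMeshGeometry {
    /* Reads one vertex out of the Metal buffer, assuming the .float3 format */
    func vertex(at index: Int) -> SIMD3<Float> {
        let source = vertices
        let pointer = source.buffer.contents()
            .advanced(by: source.offset + source.stride * index)
        let xyz = pointer.assumingMemoryBound(to: (Float, Float, Float).self).pointee
        return SIMD3<Float>(xyz.0, xyz.1, xyz.2)
    }

    /* Reads the classification of one face, assuming the .uchar format */
    func classificationOf(faceWithIndex index: Int) -> ARMeshClassification {
        guard let source = classification else { return .none }
        let pointer = source.buffer.contents()
            .advanced(by: source.offset + source.stride * index)
        let rawValue = pointer.assumingMemoryBound(to: UInt8.self).pointee
        return ARMeshClassification(rawValue: Int(rawValue)) ?? .none
    }
}
{% c-block-end %}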

What I also found out is that the number of normals is the same as the number of vertices, not faces. This neither matches the documentation, which describes the normals property as rays that define which direction is outside for each face, nor does it fit the general meaning of a normal.

The next data type is ARGeometryElement, which is used to describe faces. It also wraps a Metal buffer (MTLBuffer). The buffer contains an array of faces, each of which is an array of vertex indices: every face is represented by a fixed number of indices (indexCountPerPrimitive), and each index refers to a vertex.
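So the indices of face n start at byte offset n * indexCountPerPrimitive * bytesPerIndex. A small helper sketch (hypothetical name, assuming the 32-bit indices I observed, i.e. bytesPerIndex == 4):

{% c-block language="swift" %}
import ARKit

extension ARMeshGeometry {
    /* Returns the vertex indices of one face, assuming 32-bit indices (bytesPerIndex == 4) */
    func vertexIndices(ofFaceAt faceIndex: Int) -> [UInt32] {
        let element = faces
        let indicesPerFace = element.indexCountPerPrimitive
        let basePointer = element.buffer.contents()
        return (0..<indicesPerFace).map { (offsetInFace: Int) -> UInt32 in
            let byteOffset = (faceIndex * indicesPerFace + offsetInFace) * element.bytesPerIndex
            return basePointer.advanced(by: byteOffset)
                .assumingMemoryBound(to: UInt32.self).pointee
        }
    }
}
{% c-block-end %}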

Hands-on approach

What I found interesting to explore is how ARKit assigns ARMeshAnchors to the scanned surroundings. RealityKit provides a visualisation for debugging purposes, but it only shows a grid based on the geometries of all anchors combined. It is not possible to tell what a single anchor's geometry looks like.

Procedural meshes

Generating and presenting a virtual object based on mesh data from the LiDAR in real time would be very cool, right?

Unfortunately, procedural meshes in RealityKit are still not fully supported. You can definitely generate primitives by using MeshResource.generateBox and similar methods, but not a complex mesh. The geometry provided by ARMeshAnchor is a set of faces, and you won't be able to represent it using primitives. It is not a surprise: RealityKit is still quite new and is on its path to maturity.
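For completeness, the supported primitive route looks like this (fine for a demo box, but of no use for an arbitrary LiDAR mesh):

{% c-block language="swift" %}
import RealityKit
import UIKit

/* A 10 cm procedural cube: the kind of mesh RealityKit can generate on its own */
let boxMesh = MeshResource.generateBox(size: 0.1)
let material = SimpleMaterial(color: .systemBlue, isMetallic: false)
let boxEntity = ModelEntity(mesh: boxMesh, materials: [material])
{% c-block-end %}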

There is still one way in RealityKit, though. You can generate an MDLAsset via the Model I/O framework, export it to the usdz format, and import it again into RealityKit via ModelEntity.load(contentsOf:withName:). But you may experience high latency in real-time use due to the I/O operations with the file system.
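As a sketch of that round trip (assuming the usdz export works for the asset at hand; building the MDLAsset from ARMeshGeometry is omitted, and I use the synchronous loadModel variant here):

{% c-block language="swift" %}
import Foundation
import ModelIO
import RealityKit

/* Sketch: write an already assembled MDLAsset to a temporary usdz file and import it back */
func importIntoRealityKit(_ asset: MDLAsset) throws -> ModelEntity {
    let url = FileManager.default.temporaryDirectory
        .appendingPathComponent("reconstructed-mesh.usdz")
    try asset.export(to: url)                          /* export via Model I/O */
    return try ModelEntity.loadModel(contentsOf: url)  /* synchronous import into RealityKit */
}
{% c-block-end %}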

Where I really did succeed in runtime mesh generation was SceneKit, which allows you to dynamically create an SCNGeometry and assign it to an SCNNode.

With a few handy extensions the code can look like this:

{% c-block language="swift" %}
// MARK: - ARSCNViewDelegate

func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
    /* Create a node for a new ARMeshAnchor.
       We are only interested in anchors that provide a mesh. */
    guard let meshAnchor = anchor as? ARMeshAnchor else {
        return nil
    }

    /* Generate an SCNGeometry (explained further below) */
    let geometry = SCNGeometry(arGeometry: meshAnchor.geometry)

    /* Assign a random color to each ARMeshAnchor/SCNNode to be able to distinguish them in the demo */
    geometry.firstMaterial?.diffuse.contents = colorizer.assignColor(to: meshAnchor.identifier)

    /* Create the node & assign the geometry */
    let node = SCNNode()
    node.name = "DynamicNode-\(meshAnchor.identifier)"
    node.geometry = geometry
    return node
}

func renderer(_ renderer: SCNSceneRenderer, didUpdate node: SCNNode, for anchor: ARAnchor) {
    /* Update the node's geometry when the mesh or position changes */
    guard let meshAnchor = anchor as? ARMeshAnchor else {
        return
    }

    /* Regenerate the geometry */
    let newGeometry = SCNGeometry(arGeometry: meshAnchor.geometry)

    /* Assign the same color (the colorizer stores the id <-> color map internally) */
    newGeometry.firstMaterial?.diffuse.contents = colorizer.assignColor(to: meshAnchor.identifier)

    /* Replace the node's geometry with the new one */
    node.geometry = newGeometry
}
{% c-block-end %}

Conversion of ARMeshGeometry to SCNGeometry is pretty straightforward as the structures are very similar:

{% c-block language="swift" %}
extension SCNGeometry {
    convenience init(arGeometry: ARMeshGeometry) {
        let verticesSource = SCNGeometrySource(arGeometry.vertices, semantic: .vertex)
        let normalsSource = SCNGeometrySource(arGeometry.normals, semantic: .normal)
        let faces = SCNGeometryElement(arGeometry.faces)
        self.init(sources: [verticesSource, normalsSource], elements: [faces])
    }
}

extension SCNGeometrySource {
    convenience init(_ source: ARGeometrySource, semantic: Semantic) {
        self.init(buffer: source.buffer,
                  vertexFormat: source.format,
                  semantic: semantic,
                  vertexCount: source.count,
                  dataOffset: source.offset,
                  dataStride: source.stride)
    }
}

extension SCNGeometryElement {
    convenience init(_ source: ARGeometryElement) {
        /* Wrap the Metal buffer in a Data object without copying it */
        let pointer = source.buffer.contents()
        let byteCount = source.count * source.indexCountPerPrimitive * source.bytesPerIndex
        let data = Data(bytesNoCopy: pointer, count: byteCount, deallocator: .none)
        self.init(data: data,
                  primitiveType: .of(source.primitiveType),
                  primitiveCount: source.count,
                  bytesPerIndex: source.bytesPerIndex)
    }
}

extension SCNGeometryPrimitiveType {
    static func of(_ type: ARGeometryPrimitiveType) -> SCNGeometryPrimitiveType {
        switch type {
        case .line:
            return .line
        case .triangle:
            return .triangles
        @unknown default:
            return .triangles
        }
    }
}
{% c-block-end %}

And here is the result:

As we can see, ARKit produces "square" meshes of approximately 1 m x 1 m. Some of them may overlap with each other.

Conclusion

Similar to the early releases of RealityKit (a new framework aimed primarily at augmented reality), where many features were still missing, ARKit 3.5 arrived with a simple and quite small API that nevertheless covers a big enhancement. It adds one more anchor type and a few new classes in a computer-graphics style. I personally found it more interesting to explore these new entities in a practical way. It helped me figure out how to work with them and also revealed interesting facts which the documentation does not describe. The first finding I described can be explained as a way in which Apple simplifies a complex problem. But it is only the first part of what I discovered: stay tuned to read more.

Nikita is a senior Front-End Developer who has been keen on programming since 2006. He enjoys not only developing complex SPA web applications but also small and medium-sized iOS applications. He is a fast learner, innovative mind, fun team-mate, and allegedly the biggest geek in SABO.

Article collaborators: Pavel Zdeněk