In cloud-native environments, the use of service mesh technologies to connect different services is a growing trend. What isn't always clear, though, is how well a service mesh is actually working.
MeshMark, an open source effort jointly announced by Intel and service mesh startup Layer5 on May 18, aims to help organizations measure and quantify the performance and value of a service mesh deployment. MeshMark is part of the Service Mesh Performance project at the Cloud Native Computing Foundation (CNCF), which hosted its ServiceMeshCon EU event on May 18, co-located with KubeCon EU 2022.
"We're missing some performance characteristics," Lee Calcote, founder and CEO of Layer5 and co-chair of the CNCF Technical Advisory Group (TAG) Network, said during a ServiceMeshCon EU session. "We have a need for a clear and concise way of conveying the characters and the performance of an environment."
MeshMark Brings Metrics to Cloud-Native Service Mesh
The Service Mesh Performance project at its core is an effort to define specifications for capturing the details of a cloud-native environment in a uniform and consistent way, according to Calcote.
The Service Mesh Performance project captures infrastructure and service mesh configuration details and then provides a means to characterize running workloads. The project also aims to present those details consistently, so that organizations can establish a baseline for an environment and benchmark against it in a repeatable way.
Simply being aware of what's running in a service mesh isn't quite enough, though. There is also a need for a performance metric, which is where the new MeshMark effort comes into play. Mrittika Ganguli, principal engineer and network architect at Intel, explained that MeshMark is a cloud-native value measurement.
"With MeshMark, you're essentially trying to measure if the performance of your infrastructure matches what kind of business value you want to get from your deployment," Ganguli said.
A business value could be defined with key performance indicators on, for example, how well video gets loaded on a particular webpage, she said.
"If you click on something, you may often see the text get rendered first and then the video," Ganguli said. "The load latency of the video traffic is what impacts what you see visually."
MeshMark aims to provide metrics that an organization can use to determine how resources are being used. Ganguli said that in a cloud-native environment an organization uses different kinds of resources. The utilization classes include, for example, compute, network, or any other type of resource.
MeshMark provides an efficiency metric called Mesh Utilization Efficiency (MUE) that can help determine a score for a given resource utilization and the level of optimization.
The MeshMark score will be able to help identify what the load latency is for a given workload, given the available resources.
Overall, the goal with MeshMark is to take a number of different signals coming from a service mesh and combine them into an approach that can help organizations understand how well a deployment is, or isn't, working.
The need to better understand the performance of service meshes was further underscored by a report released on May 17 by the CNCF. The report found that 60% of surveyed organizations are using a service mesh in production today, and that the key challenges for deploying service meshes are a lack of guidance, blueprints, and best practices.