Measure Node.js server response time with N|Solid

As software developers, we constantly face new challenges in an ever-changing ecosystem. However, we must always remember the importance of addressing performance and security concerns, which remain at the top of our priority list.

To ensure that our Node.js applications can meet our performance and scalability needs without compromising security or incurring costly infrastructure changes, we must understand the importance of network optimization in Node.js.

The Impact of Latency/Ping Time on the Performance and Speed of Your Node.js Application

IMG – Ping Cats – via GIPHY

Have you ever wondered how long it takes for your application to communicate with the server? This communication time, known as network ping time or latency, is a crucial factor that impacts the performance and speed of your application. Knowing how to measure network ping time between the browser and the server is essential for developers who want to optimize their applications and provide a better user experience.

Network Optimization in Node.js

To ensure the optimal performance and scalability of our Node.js applications, we must accurately measure our HTTP server’s connection and response time. Doing so enables us to identify and address potential bottlenecks without compromising security or incurring unnecessary infrastructure changes.

Before delving deeper into measuring connection and response time, let’s explore fundamental concepts and critical differentiators in the network landscape.

HTTP vs. WebSocket:

HTTP and WebSocket are communication protocols used in web development but serve different purposes. HTTP is a stateless protocol commonly used for client-server communication, while WebSocket enables full-duplex communication between clients and servers, allowing real-time data exchange.

Types of Connections and Versions:

When creating APIs, keep in mind that HTTP as a protocol and standard has different versions, such as HTTP/1.1 and HTTP/2. Additionally, APIs may use alternative protocols like gRPC, which offer different features and capabilities. Understanding these options empowers developers to choose the most suitable tools for their web servers.
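For instance, Node.js ships an http2 core module. Here is a minimal sketch of an HTTP/2 server (unencrypted h2c, which browsers won’t use but which is handy for local testing; port 3001 is an arbitrary choice):

import http2 from "node:http2";

// Minimal HTTP/2 server using Node's core http2 module (unencrypted h2c).
const server = http2.createServer();

server.on("stream", (stream, headers) => {
  // Each request arrives as a stream; the status goes in the :status pseudo-header.
  stream.respond({
    ":status": 200,
    "content-type": "application/json",
  });
  stream.end(JSON.stringify({ path: headers[":path"], protocol: "HTTP/2" }));
});

server.listen(3001);

You can exercise it with curl --http2-prior-knowledge http://localhost:3001/.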

TCP/IP Basics:

The Transmission Control Protocol (TCP) and Internet Protocol (IP) are fundamental protocols that form the backbone of computer networks. Among TCP’s critical processes is the three-way handshake, which plays a vital role in establishing a secure and dependable connection between two endpoints. This handshake ensures the orderly and reliable transmission of data. TLS/SSL encryption enhances security, adding an extra layer of protection to the communication between the client and the server.

HTTP vs. HTTPS:

HTTP operates over plain text, which exposes the data being transmitted to potential eavesdropping and tampering.
HTTPS, on the other hand, secures communication through the use of SSL/TLS encryption, providing confidentiality and integrity.
Understanding the trade-offs between HTTP and HTTPS is crucial to making informed data security decisions.
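As a quick illustration, here is a minimal sketch of serving the same response over TLS with Node’s built-in https module; the key and certificate paths are hypothetical placeholders you would generate yourself (for example, with openssl):

import https from "node:https";
import fs from "node:fs";

// Hypothetical paths: point these at your own TLS key and certificate.
const options = {
  key: fs.readFileSync("./certs/server-key.pem"),
  cert: fs.readFileSync("./certs/server-cert.pem"),
};

https
  .createServer(options, (req, res) => {
    res.writeHead(200, { "content-type": "text/plain" });
    res.end("Hello over TLS\n");
  })
  .listen(3443, () => console.log("HTTPS server listening on https://localhost:3443"));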

Building a Solid Foundation: Understanding the Three-Way Handshake for Reliable Connections

To evaluate the performance of our HTTP server, we need to differentiate between connection latency and server response time. Connection latency refers to the time it takes for the initial three-way handshake process to complete before data transmission can occur. On the other hand, server response time measures the duration from when the server receives a request to when it generates and sends the response back to the client.

The three-way handshake is a fundamental process for establishing a TCP (Transmission Control Protocol) connection between a client and a server over a network. It involves three steps, hence the name, and establishes a reliable and ordered communication channel between the two endpoints.

Here’s a breakdown of the three steps involved in the three-way handshake:

__SYN (Synchronize)__: The client initiates the connection by sending a SYN (synchronize) packet to the server. This packet contains a randomly generated sequence number to initiate the communication.
__SYN-ACK (Synchronize-Acknowledge)__: Upon receiving the SYN packet, the server acknowledges the request by sending a SYN-ACK packet back to the client. The SYN-ACK packet includes the server’s own randomly generated sequence number and an acknowledgment number equal to the client’s sequence number plus one.
__ACK (Acknowledge)__: Finally, the client sends an ACK (acknowledge) packet to the server, confirming receipt of the SYN-ACK packet. This packet contains an acknowledgment number equal to the server’s sequence number plus one.

Once this three-way handshake process is completed, the client and the server have agreed upon initial sequence numbers, and a reliable connection is established between them. This connection allows for data transmission with proper sequencing and error detection mechanisms, ensuring that the information sent between the client and server is reliable and accurate.

The three-way handshake is essential to establishing TCP connections and is performed before any data transmission can occur. It plays a critical role in ensuring the integrity and reliability of the communication channel, providing a solid foundation for subsequent data exchange between the client and server.
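To see these two measurements separately from a Node.js client, here is a minimal sketch; it assumes a local HTTP server on port 3000 (like the one we build below) and times the TCP connect event (the end of the three-way handshake) separately from the full response:

import http from "node:http";
import { performance } from "node:perf_hooks";

const start = performance.now();

// agent: false forces a fresh TCP connection, so the handshake is actually measured.
const req = http.get({ host: "localhost", port: 3000, path: "/", agent: false }, (res) => {
  res.once("data", () => {
    console.log(`Time to first byte: ${(performance.now() - start).toFixed(1)} ms`);
  });
  res.on("end", () => {
    console.log(`Total response time: ${(performance.now() - start).toFixed(1)} ms`);
  });
  res.resume(); // consume the body so 'end' fires
});

req.on("socket", (socket) => {
  socket.once("connect", () => {
    // 'connect' fires once the TCP three-way handshake has completed.
    console.log(`Connection latency (handshake): ${(performance.now() - start).toFixed(1)} ms`);
  });
});

req.on("error", (err) => console.error(err.message));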

Create a self-serve diagnostic tool for a server-rendered page in Node.js.

The idea is to share an easy-to-follow recipe that will help you create your tool, so let’s start with the ingredients and end with the steps to create a self-serve diagnostic tool for a server-rendered page in Node.js.

Ingredients:

Node.js & NPM installation – https://nodejs.org/

Fastify.js – https://www.fastify.io/

Instructions:

1. Set up a Node.js Project
Use NPM to create your Node project:

$ mkdir diagnostic-tool-nodejs
$ cd diagnostic-tool-nodejs
$ npm init -y

2. Install your NPM packages.
We have Fastify in our recipe, so we must install it first:

$ npm i fastify

3. Create the index.mjs
Create an index.mjs file in the project’s root directory and paste this Fastify HTTP server sample code:

import Fastify from "fastify";

const fastify = Fastify({
  logger: true,
});

// Create a random delay between 100 ms and roughly `time` ms
function timer(time) {
  return new Promise((resolve) => {
    const ms = Math.floor(Math.random() * time) + 100;
    setTimeout(() => {
      resolve(ms);
    }, ms);
  });
}

// Declare the root route and delay the response randomly
fastify.get("/", async function (request, reply) {
  const wait = await timer(5000);
  return { delayTime: wait };
});

// Run the server!
fastify.listen({ port: 3000 }, function (err, address) {
  if (err) {
    fastify.log.error(err);
    process.exit(1);
  }
});

This will start the server on port 3000, which you can access by going to http://localhost:3000 in your web browser.
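If you prefer the terminal, curl’s timing variables give a quick view of connection latency versus total response time for the server we just started:

curl -s -o /dev/null -w "connect: %{time_connect}s  ttfb: %{time_starttransfer}s  total: %{time_total}s\n" http://localhost:3000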

Integrate with N|Solid Console

Be sure you already have N|Solid installed and running on your environment; otherwise, go to https://downloads.nodesource.com and get the installer.

Alternatively, you can run the console using Docker instead of a local installation:

docker run -d -p 6753:6753 -p 9001:9001 -p 9002:9002 -p 9003:9003 nodesource/nsolid-console:hydrogen-alpine-latest

With the application already initialized with npm, Fastify installed, and our index.mjs in place, we can connect our process with N|Solid.

Run the HTTP server with the N|Solid Runtime, following the instructions on the console’s main page.

IMG – Connect N|Solid

In this case, we ran the process by passing the config via environment variables and running a local installation of the N|Solid Console:

NSOLID_APPNAME="NSOLID_RESPONSE_TIME_APP" NSOLID_COMMAND="127.0.0.1:9001" nsolid index.mjs

If you use our SaaS console instead, you need to set the __NSOLID_SAAS__ environment variable instead of __NSOLID_COMMAND__:

NSOLID_APPNAME="NSOLID_RESPONSE_TIME_APP" NSOLID_SAAS="XYZ.prod.proxy.saas.nodesource.io:9001" nsolid index.mjs

After completing those steps, you should be able to watch the app and process connected to the console.

IMG – Connect N|Solid Process

GIF 1 – Connect N|Solid Process

Go to the application process and add the HTTP(S) Server 99th Percentile Duration metric to see the HTTP server’s latency/response time in near-real time; we also have the HTTP(S) Request Median Duration metric.

GIF 2 – Monitor Process Metrics

After this, we should be able to generate some traffic and see how the response times behave with the sample code provided, which randomizes response times from 100 ms up to roughly 5 seconds.

To generate the traffic, we can use autocannon

npx autocannon -d 120 -R 60 localhost:3000

After running autocannon for a few minutes, we can see the P99 metric of the HTTP server and the median, and compare them.

IMG – http-latency-response-time-metrics

IMG – http-request-median-duration

IMG – p99-metric

To fully utilize the metrics provided by N|Solid, it is crucial to understand their significance. Two critical metrics offered by N|Solid are the 99th Percentile and the HTTP Median. These metrics play a vital role in assessing the performance of Node.js applications in production environments. By digging deeper into their practical application and importance, we can unlock the true value of these metrics in N|Solid and make informed decisions to optimize our production systems. Let’s explore this further.

The 99th Percentile metric

The 99th percentile is a statistical measure commonly used to analyze and understand response time or latency in a system.

Imagine you have a web application that handles incoming requests. To understand how fast the server responds, you measure the time each request takes and gather that data. Sorting that data lets you find the 99th percentile response time.

For example, suppose __the 99th percentile response time is 500 milliseconds__.
This means that only 1% of the requests took longer than 500 milliseconds to get a response. In simpler terms, 99% of the requests were handled in 500 milliseconds or less, which is fast.

Monitoring the 99th percentile response time helps you spot outliers: slow requests or performance bottlenecks that affect only a small fraction of requests but can still have a significant impact on user experience or system stability.
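To make the definition concrete, here is a small sketch (a hypothetical helper, not an N|Solid API) that computes a percentile from an array of recorded response times using the nearest-rank method:

// Hypothetical helper: p-th percentile (0-100) of an array of numbers, nearest-rank method.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const index = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, index)];
}

// Response times collected in milliseconds (sample data).
const responseTimes = [120, 180, 210, 250, 300, 320, 410, 450, 480, 500];
console.log(percentile(responseTimes, 99)); // 500 -> 99% of requests finished at or below this value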

The HTTP median metric

The median represents the middle value of a dataset when its values are sorted in ascending or descending order.

To illustrate the difference between the 99th percentile and the median, let’s consider an example. Suppose you have a dataset of response times for a web application consisting of 10 values:
[100ms, 150ms, 200ms, 250ms, __500ms__, __600ms__, 700ms, 800ms, 900ms, 1000ms].

Once the dataset is sorted, the middle of the distribution falls between the 5th and 6th values (500ms and 600ms); because the count is even, the median is their average, 550ms. This means that half of the requests had a response time faster than the median and the other half had a slower one.
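A quick sketch of the same calculation over that dataset (again, a hypothetical helper, not an N|Solid API):

// Hypothetical helper: median of a numeric array (average of the two middle values when the count is even).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

const responseTimes = [100, 150, 200, 250, 500, 600, 700, 800, 900, 1000];
console.log(median(responseTimes)); // 550 -> half of the requests were faster, half were slower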

Connect with NodeSource

If you have any questions, please contact us at [email protected] or through this form.

Experience the Benefits of N|Solid’s Integrated Features
Sign up for a Free Trial Today

To get the best out of Node.js and experience the benefits of N|Solid’s integrated features, including OpenTelemetry support, SBOM integration, and machine learning capabilities, sign up for a free trial and see how N|Solid can help you achieve your development and operations goals. #KnowyourNode

Enhance Observability with OpenTelemetry tracing – Part 1

Recently, conversations have been increasing around OpenTelemetry; it is gaining more and more momentum in Node.js development circles, but what is it? How can we take advantage of the key concepts and implement them in our projects?

Of note, NodeSource is a supporter of OpenTelemetry, and we have recently implemented full support of the open-source standard in our product N|Solid. It allows us to make our powerful Node.js insights accessible via the protocol.

OpenTelemetry is a relatively recent, vendor-agnostic standard that began in 2019, when OpenCensus and OpenTracing merged to form OpenTelemetry, seeking to provide a single, well-supported integration surface for end-to-end distributed tracing telemetry. In 2021, the project released v1.0.0, offering stability guarantees for the approach.

Most important, OpenTelemetry is an open-source observability project/framework from the Cloud Native Computing Foundation (CNCF), with a collection of software development kits (SDKs), APIs, and tools for instrumentation.

W3C Trace Context is the standard format for OpenTelemetry. Cloud providers are expected to adopt this standard, providing a vendor-neutral way to propagate trace IDs through their services. Organizations use OpenTelemetry to send collected telemetry data to a third-party system for analysis.

But to break down its history a bit, we think it’s important to understand the concept of __Observability__.

At NodeSource, as you likely know, we work on Observability daily, focusing exclusively on the Node.js runtime, and starting with version 4.8.0, N|Solid supports some OpenTelemetry features. But before getting deeper into OTel, it is important to understand Observability and try to answer this important question: what is Observability?

Setting the foundations to talk about OpenTelemetry

It’s important to understand that when we talk about Observability, we need first to know what questions we seek to answer or clarify when detailing a system.

The first question often asked is __why__ my application behaves in a specific way. To answer this and other questions, we must first instrument our system so that our application can emit signals, that is, traces, metrics, and logs. When we do this correctly, we have the information we need.

Observability is the ability to measure the internal states of a system by examining its outputs. – Splunk

Detailing a system through Data Collection: Telemetry Data

Your systems and apps need proper tooling to collect the appropriate telemetry data to achieve Observability.
But what is the telemetry data that we need?

The three key concepts are:

Metrics

Logs

Traces

Ok, let’s define each of these concepts:

Metrics

__Metrics__: aggregations, over a period of time, of numeric data about your infrastructure or application. Examples include system error rate, CPU utilization, and request rate for a given service.

As quoted by isitobservable.io, OpenTelemetry has three metric instruments:

__Counter__: a value that is summed over time (similar to the Prometheus counter)
__Measure__: a value that is aggregated over time (a value over some defined range)
__Observer__: captures a current set of values at a given time (like a gauge in Prometheus)

The context is still very important, along with metric information like name, description, unit, kind (counter, observer, measure), label, aggregation, and time.

Logs

__Logs__: A Log is a timestamped message emitted by services or other components. They are not necessarily associated with any particular user request or transaction, but they become more valuable when they are.

Logic would suggest jumping straight to traces, since they are the third key concept. But before defining what a trace is, we must zoom in on the concept of a __Span__.

Span

__Span__: A Span represents a unit of work or operation. It tracks specific operations that a request makes, painting a picture of what happened during the time in which that operation was executed.

A span is the building block of a trace and is a named, timed operation representing a piece of the workflow in the distributed system. All traces are composed of Spans.

Traces

__Traces__: A Trace records the paths taken by requests (made by an application or end-user) as they propagate through multi-service architectures, like microservice and serverless applications. It is also known as Distributed Trace. A trace is almost always an assessment of end-to-end performance.

Without tracing, it is challenging to pinpoint the cause of performance problems in a distributed system.

You may have noticed that we stepped away from the three pillars of Observability when introducing the concept of the Span. Nonetheless, the three pillars plus the Span make up what is known as __Telemetry Data__: simple __signals emitted from applications and resources about their internal state__.

The core concept of Context Propagation

When we want to correlate events across our services’ boundaries, we look for a context that helps us identify the current trace and Span. But context is not the only thing we need; we also need __propagation__.

If you have been following the article carefully, you will notice that the definition of a trace uses the word __‘propagation’__. You might wonder what this means.

Propagation is how context is bundled and transferred in and across services, often via HTTP headers. With these concepts clear, we can now turn to __Context Propagation__.

A critical piece of functionality required to implement Distributed Tracing is Context Propagation. We can define it as a mechanism for storing state and accessing data across the lifespan of a distributed transaction, either across execution contexts inside a process or across the boundaries of the services that make up our system.

For __in-process propagation__, we typically use something like the __AsyncLocalStorage__ class from the async_hooks module.

For propagation __across processes__, it depends on the IPC protocol used. For HTTP, there’s the Trace Context specification from the __W3C__, which defines the __traceparent__ and __tracestate__ headers to propagate tracing info.
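As a rough illustration (a hypothetical middleware sketch, not a complete tracing implementation), the code below keeps a per-request context with AsyncLocalStorage and forwards the incoming traceparent header when calling a downstream service on port 4000:

import { AsyncLocalStorage } from "node:async_hooks";
import http from "node:http";

const requestContext = new AsyncLocalStorage();

// In-process propagation: every callback/promise triggered by this request
// can read the same store without passing it around explicitly.
const server = http.createServer((req, res) => {
  const store = { traceparent: req.headers["traceparent"] };
  requestContext.run(store, async () => {
    await callDownstream();
    res.end("ok");
  });
});

// Across processes: re-attach the traceparent header so the next service
// can link its spans to the same trace.
function callDownstream() {
  const { traceparent } = requestContext.getStore() ?? {};
  return new Promise((resolve) => {
    const downstream = http.get(
      { host: "localhost", port: 4000, path: "/work", headers: traceparent ? { traceparent } : {} },
      (res) => {
        res.resume();
        res.on("end", resolve);
      }
    );
    downstream.on("error", resolve); // keep the sketch resilient if no downstream is running
  });
}

server.listen(3000);

In practice, the OpenTelemetry SDK handles both of these steps for you through its propagators.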

Getting into a Distributed Application

Let’s say we have a distributed application like the one in the picture. It has four Node.js services (API, Auth, Service1, and Service2) and one database.

Imagine we’re having intermittent performance issues. They could come from several points:
– Database access
– Network link status
– DNS request latency, etc.

Finding exactly where can become a very hard and time-consuming task, and the more complex the system, the harder it gets.

Distributed tracing will help us A LOT with that, as we’ll generate tracing information on every point of the distributed system (A, B, C, D, and E). Not only that, but while the request goes through all the services, thanks to Context Propagation, some ‘tracing state’ will be passed along so all the tracing info can be linked to the very same request.

Instrument your system

To get visibility into the performance and behaviors of the different microservices, we need to instrument the code with OpenTelemetry to generate traces. But first, let’s define what Instrumentation is…

Automatic Instrumentation

With __Automatic Instrumentation__, our instrumentation libraries will automatically take the configuration provided (through code or environment variables) and do most of the work.

The following example, using the OpenTelemetry SDK, shows how we can automatically generate spans for every HTTP transaction handled by the Node.js HTTP core module.
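Here is a minimal sketch of what that setup looks like (it assumes the @opentelemetry/sdk-node, @opentelemetry/sdk-trace-node, and @opentelemetry/instrumentation-http packages are installed; exact package names and options may vary between SDK versions):

// tracing.mjs: import this file at the very top of your application, before anything else.
import { NodeSDK } from "@opentelemetry/sdk-node";
import { ConsoleSpanExporter } from "@opentelemetry/sdk-trace-node";
import { HttpInstrumentation } from "@opentelemetry/instrumentation-http";

const sdk = new NodeSDK({
  // Print spans to stdout; swap in an OTLP exporter to send them to a backend.
  traceExporter: new ConsoleSpanExporter(),
  // Automatic instrumentation: a span is created for every request handled by
  // the Node.js core http/https modules, with no changes to application code.
  instrumentations: [new HttpInstrumentation()],
});

sdk.start();

With this file loaded first, every incoming and outgoing HTTP call in the process produces a span automatically.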

Manual Instrumentation

__Manual instrumentation__, on the other hand, while requiring more work on the user/developer side, enables far more options for customization, from naming various components within OpenTelemetry (for example, spans and traces) to adding your own attributes, specific exception handling, and more. The following example shows how to manually generate a Span using the OpenTelemetry SDK.
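Here is a hedged sketch using the @opentelemetry/api package (the function, tracer name, and attribute below are hypothetical):

import { trace } from "@opentelemetry/api";

// The tracer name is arbitrary; it usually identifies your library or service.
const tracer = trace.getTracer("checkout-service");

async function processOrder(orderId) {
  // Manually create a span around the unit of work we care about.
  const span = tracer.startSpan("process-order");
  span.setAttribute("order.id", orderId); // custom attribute
  try {
    // ...the actual business logic goes here...
  } catch (err) {
    span.recordException(err); // specific exception handling
    throw err;
  } finally {
    span.end(); // always close the span
  }
}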

How to Implement OpenTelemetry in my project?

The way we historically would implement a typical observability pipeline is shown in the following picture.

In this case, having all that data at your disposal is great and can give us a valuable overview of our system, but unless we can somehow correlate the observability signals (metrics, logs, and traces), we won’t get the best out of it.

OpenTelemetry solves this problem by correlating these signals: the same concept of Context Propagation used for Traces is applied to Metrics and Logs, so identifiers such as the trace_id and the span_id are associated with those signals.

OpenTelemetry spanId and traceId can correlate Logs and Metrics with a specific Span in a Trace.
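For example, a log line can be enriched with the identifiers of the active span (a sketch using @opentelemetry/api; the log shape itself is up to you):

import { trace } from "@opentelemetry/api";

// Attach the current trace_id / span_id (if any) to every log entry,
// so the log can be correlated with the span that produced it.
function logWithTraceContext(message) {
  const activeSpan = trace.getActiveSpan();
  const ctx = activeSpan ? activeSpan.spanContext() : undefined;
  console.log(JSON.stringify({ message, trace_id: ctx?.traceId, span_id: ctx?.spanId }));
}

logWithTraceContext("payment processed");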

OpenTelemetry Components

OpenTelemetry is much more… Before finishing this article, it is important to describe the different components of OpenTelemetry.

Note: For more detail, read the specification overview of the OpenTelemetry project.

OpenTelemetry API: defines the data types and operations for generating and correlating tracing, metrics, and logging data.

💚 From N|Solid 4.8.0, we provide an implementation of the OpenTelemetry Trace API, allowing users to instrument their own code using the de facto standard API.

OpenTelemetry SDK: provides language-specific implementations of the API.

OpenTelemetry OTLP: A protocol to transport the Telemetry Data.

💚 With N|Solid 4.8.0, we support many instrumentation modules available in the OpenTelemetry ecosystem, as well as exporting traces using the OpenTelemetry Protocol (OTLP) over HTTP.

OpenTelemetry Collector: To receive, process, and export Telemetry data.

💚 In N|Solid 4.8.0, it is now possible to send N|Solid Runtime monitoring information (metrics and traces) to backends that support the OpenTelemetry standard, such as multiple APMs (Dynatrace, Datadog, New Relic).

OpenTelemetry Semantic Conventions: well-defined names for the attributes associated with the signals (service.name, http.port, etc.).

We know there are other key concepts to explore around OpenTelemetry, so we invite you to visit the project’s website or the GitHub repo directly.

This introductory article lays the groundwork for a demo we prepared for NodeConf.EU, where we apply open-source tools to implement OpenTelemetry in your project. We invite you to stay tuned for our next blog post. 😉 Wait for the second part!

Conclusions

Traces are really useful for understanding modern distributed systems.
We build better software when we get the most out of our traces.
With OTel (OpenTelemetry), we can maximize insights and answer future questions without having to make code changes.
#OTel provides interoperability with observability tools.
Collecting and correlating telemetry data is easy if you follow the OpenTelemetry framework.
As far as we know, the OpenTelemetry community is working hard to develop support for metrics and logs. We expect news soon! 🤞
Note: If you want to learn more about OpenTelemetry in JavaScript, click HERE

To start getting more value out of your traces and metrics, you can use OpenTelemetry with the N|Solid back end.

Achieve Your Performance Goals With N|Solid

We know you want to get the best out of your application, and to do it professionally, you will surely need a great ally to help you with various tools without affecting your performance. We do not want to give you a ‘marketing speech’ telling you that we are the best… you can 👀 check it directly here with this open-source tool that also includes OTel results.

We’d love to hear more from you! 💚
– Feel free to TRY N|Solid and get in touch with us on Twitter at @nodesource.