What is SOAP: Formats, Protocols, Message Structure, and How SOAP is Different from REST
Every time you log in to a website with your Facebook account or drag a drop-off pin across a Google map in a ride-hailing app, the application you use communicates with Google or Facebook via a web API. An API, or application programming interface, is a form of agreement between web services on how they are going to exchange data, e.g. retrieve a map or your account credentials. The data itself is structured in messages that systems send to each other. Once you open, say, the Uber app, your phone sends a request message to Google Maps, and Google returns the map itself.
And if you’ve ever dealt with web services, you probably know that there’s more than one way to build a web API. The modern web is ruled by APIs that use the REST pattern, which offers lightweight and efficient data exchange. But sometimes you’ll come across another approach – the SOAP protocol. It doesn’t brag about its simplicity and it’s not as fast as REST. But you’d be surprised how common it is in corporate data exchange, because SOAP has its merits.
In this article, we’ll figure out how SOAP works, why it’s so common across corporate users, and how it differs from REST.
Time for some words of caution: This article uses tech terms like server, client, protocols, etc. Even though most of them are explained, if you’re still uncertain, have a look at our beginner-friendly article about web architecture. It’s a handy jumping-off point for those of you just sticking your tech toes in the water.
What is SOAP?
SOAP, or Simple Object Access Protocol, is a web communication protocol designed for Microsoft back in 1998. Today, it’s mostly used to expose web services and transmit data over HTTP/HTTPS, but it’s not limited to them. Unlike the REST pattern, SOAP supports the XML data format only and strictly follows preset standards such as a messaging structure, a set of encoding rules, and a convention for representing procedure requests and responses.
The built-in functionality to create web-based services allows SOAP to handle communications and make responses language- and platform-independent.
While most web data exchange happens over REST, SOAP isn’t disappearing anytime soon, because it’s highly standardized, allows for automation in certain cases, and supports message-level security on top of transport encryption. Let’s have a look at the main SOAP features.
SOAP works with XML only
Web-transmitted data is usually structured in some way. The two most popular data formats are XML and JSON.
XML (or Extensible Markup Language) is a text format that establishes a set of rules to structure messages as both human- and machine-readable records. But XML is verbose as it aims at creating a web document with all its formality. JSON, on the other hand, has a loose structure that focuses on the data itself. Have a look at the image below to get the idea.
You see that numerous ending tags in XML make it much longer. Thanks PCMag for the image.
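To make the comparison concrete, here is a minimal Python sketch that serializes the same record both ways using only the standard library. The record and its field names are invented for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record; the field names are invented for this example.
record = {"id": "42", "name": "Ada Lovelace", "email": "ada@example.com"}

# JSON: keys and values only.
as_json = json.dumps({"user": record})

# XML: every field needs both an opening and a closing tag.
user = ET.Element("user")
for key, value in record.items():
    ET.SubElement(user, key).text = value
as_xml = ET.tostring(user, encoding="unicode")

print(as_json)
print(as_xml)
print(len(as_json), len(as_xml))  # character counts of the two encodings
```

On a record this small the gap is modest. It grows with nesting and with the namespaces and wrappers that real SOAP messages add.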
As we’ve mentioned, SOAP requires XML for both the request and response messages exchanged between systems. A SOAP API receives XML requests and sends back XML responses only.
Besides the data format, SOAP has another level of standardization – its message structure.
SOAP message structure
XML isn’t the only reason SOAP is considered verbose and heavy compared to REST. It’s also the way SOAP messages are composed. Standard SOAP API requests and responses appear as an enveloped message that consists of four elements with specific functions for each one.
Headers and fault elements are optional
Envelope is the core and essential element of every message. Its opening and closing tags wrap the whole message, hence the name.
Header (optional) determines the specifics, extra requirements for the message, e.g. authentication.
Body includes the request or response.
Fault (optional) carries information about any errors that occur while the request is processed. When present, it travels inside the Body.
An example of the SOAP message. Image source: IBM
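As a rough sketch of this structure, the snippet below assembles an Envelope with a Header and a Body using Python’s standard library. The envelope namespace is the SOAP 1.1 one. The service namespace, the RequestId header, and the GetFlightStatus operation are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/flights"  # hypothetical service namespace

ET.register_namespace("soap", SOAP_NS)


def build_envelope(flight_number: str) -> str:
    """Return a SOAP message with an optional Header and a Body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # Header (optional): extra requirements for the message.
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{APP_NS}}}RequestId").text = "req-001"

    # Body: the actual request.
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetFlightStatus")
    ET.SubElement(request, f"{{{APP_NS}}}FlightNumber").text = flight_number

    return ET.tostring(envelope, encoding="unicode")


message = build_envelope("LH400")
print(message)
```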
SOAP extensibility with WS standard protocols
That said, SOAP itself provides basic structural elements of the message. But it doesn’t dictate what goes into headers and bodies. Basically, you can customize these contents as appropriate.
But as web applications generally solve common sets of problems, the main protocol has been augmented since its release by numerous standard protocols that specify how you do things. All these protocols are usually marked WS-(protocol name), e.g. WS-Security, WS-ReliableMessaging. They were contributed by different organizations, including Microsoft, IBM, OASIS, and others.
Standard protocols cover multiple areas and facets of SOAP use:
- Metadata, etc.
The cool thing about these protocols is that you can choose which of them to use. This is usually described as SOAP extensibility. For instance, if you need your financial transactions to be reliable, you can apply WS-Atomic Transaction, which makes them ACID-compliant.
ACID stands for Atomicity, Consistency, Isolation, and Durability, which is an enterprise-grade transaction quality and one of the reasons why SOAP is still used when exchanging sensitive information in enterprise architectures.
ACID compliance means that transactions meet the following requirements:
Atomicity. Multiple connected transactions either work as a single unit or don’t work at all. Sometimes this is called an all-or-none approach. This set of transactions is compared to an atom, which consists of multiple tightly connected elements.
Consistency. A transaction moves the system from one valid state to another. If some part of it fails, the system rolls back to its initial state.
Isolation. Transactions are independent of each other.
Durability. Even if the system fails, completed transactions remain.
WS-Atomic Transaction is the standard protocol that lets SOAP exchanges meet these requirements.
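The all-or-none behavior is easier to see in a small example. The sketch below is not WS-Atomic Transaction, which coordinates distributed services. It only illustrates atomicity and rollback with a local in-memory SQLite database, and the table and account names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(amount: int) -> bool:
    """Move money from alice to bob; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'bob'",
                (amount,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'alice'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit to bob was rolled back too.
        return False


def balances() -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


print(transfer(30), balances())   # a transfer alice can afford
print(transfer(500), balances())  # overdraft attempt: nothing changes
```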
Web Service Description Language (WSDL) document
One of the major features of SOAP APIs is that they almost always use a WSDL document.
Simply put, a WSDL document is an XML description of a web service. It acts as a guideline of how to communicate with a web service, defining the endpoints and describing all processes that could be performed by the exposed applications. These may include data types being used inside the SOAP messages and all actions available via the web service. Thus, a WSDL file serves as a signed contract between a client and a server.
This is how a WSDL document may look. Image source: Researchgate.net
The cool thing about WSDL is that it allows you to generate client-side code in various languages and start messaging the server right away. While not all SOAP APIs leverage WSDL documents, their use is so popular because it helps different programming languages and IDEs quickly set up the communication.
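Because a WSDL document is plain XML, tools can read it mechanically, and this is what code generators build on. Here is a minimal sketch that lists the operations a service exposes. The WSDL fragment is hypothetical and heavily trimmed (a real document also defines types, messages, bindings, and the service endpoint).

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}  # WSDL 1.1 namespace

# Hypothetical, heavily trimmed WSDL 1.1 fragment for illustration only.
SAMPLE_WSDL = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://example.com/flights"
             name="FlightService"
             targetNamespace="http://example.com/flights">
  <portType name="FlightPort">
    <operation name="GetFlightStatus">
      <input message="tns:GetFlightStatusRequest"/>
      <output message="tns:GetFlightStatusResponse"/>
    </operation>
    <operation name="BookSeat">
      <input message="tns:BookSeatRequest"/>
      <output message="tns:BookSeatResponse"/>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_text: str) -> list:
    """Return the names of all operations declared in the portType elements."""
    root = ET.fromstring(wsdl_text)
    operations = root.findall("wsdl:portType/wsdl:operation", WSDL_NS)
    return [operation.get("name") for operation in operations]


print(list_operations(SAMPLE_WSDL))
```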
More on technical documentation in our dedicated article.
Transfer protocols: HTTP, TCP, SMTP, FTP, and more
In layman’s terms, a transfer protocol is a set of rules and commands used to transfer data over the internet. There are low-level protocols like IPv4, which simply delivers data packets from one point to another. There are higher-level transport protocols, like TCP, which ensures that data is indeed delivered. And, finally, there are application-level protocols, like HTTP, that are used by web browsers to communicate with web servers, but don’t take charge of the connection itself.
SOAP supports a variety of transfer protocols, both high- and low-level ones. For instance, SOAP allows for messaging via TCP (Transmission Control Protocol), a low-level data exchange method that works between ports via an IP network. You can go for the SMTP (Simple Mail Transfer Protocol) option, which is a communication protocol for electronic mail transmission, FTP (File Transfer Protocol), and any other transfer method that supports text data exchange.
Does it make any sense to send data using protocols other than HTTP/HTTPS? In most cases, it doesn’t. SOAP was primarily designed to work with HTTP. But there may be scenarios, such as security constraints, server requirements, solution architectures, or simply speed, that will benefit from this SOAP versatility.
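For the common HTTP case, a SOAP call is an ordinary POST request with the envelope as its body. The sketch below prepares such a request with Python’s standard library, following the SOAP 1.1 convention of a text/xml content type and a SOAPAction header. The endpoint, the action URI, and the operation are hypothetical, so the request is built but not sent.

```python
import urllib.request

ENDPOINT = "https://example.com/soap/flights"  # hypothetical endpoint
SOAP_ACTION = "http://example.com/flights/GetFlightStatus"  # hypothetical action URI

envelope = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetFlightStatus xmlns="http://example.com/flights">
      <FlightNumber>LH400</FlightNumber>
    </GetFlightStatus>
  </soap:Body>
</soap:Envelope>"""

request = urllib.request.Request(
    ENDPOINT,
    data=envelope.encode("utf-8"),
    headers={
        "Content-Type": "text/xml; charset=utf-8",  # SOAP 1.1 over HTTP
        "SOAPAction": f'"{SOAP_ACTION}"',
    },
    method="POST",
)

# urllib.request.urlopen(request) would send it; omitted because the
# endpoint above is fictional.
print(request.get_method(), request.full_url)
```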
SOAP is appreciated for its ability to integrate the WS-Security feature. This set of protocols determines how to implement security inside the transactions and provides for data privacy and integrity. Also, it allows for encryption and cryptographic signing.
WS-Security allows your messages to be encrypted not only at the transport level by HTTPS, but also at the message level, with authentication data carried in the header element. This matters because HTTPS protection ends once the message reaches the server, and some data preprocessing may happen on the server side before the message gets to its designated process. With message-level encryption, the data can be read only by the correct process inside the server, not by everything that runs on the correct server.
That’s how WS-Security works with the message structure
Vittorio Bertocci, a Microsoft principal program manager, explained how WS-Security works using a naked motorcyclist metaphor.
Imagine your message as a naked motorcyclist. To reach the destination, he can drive through a transparent tunnel and hope that nobody sees him (HTTP). Or he can drive through an opaque tunnel. In this case, while nobody sees him when he’s inside the tunnel, to reach the final destination, he still must ride across some streets (HTTPS is an opaque tunnel, obviously). And finally, he can just wear clothes and a helmet to feel completely secure (WS-Security).
This message-level security is why financial organizations and other corporate users opt for SOAP.
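To show where this security data sits, here is a minimal sketch that puts a WS-Security UsernameToken into the SOAP Header. The namespace is the OASIS WS-Security 1.0 one, while the credentials are placeholders. A plain-text password is used only to keep the example short. Real deployments would typically rely on digests, signatures, or encryption, which this sketch doesn’t cover.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
WSSE_NS = (
    "http://docs.oasis-open.org/wss/2004/01/"
    "oasis-200401-wss-wssecurity-secext-1.0.xsd"
)

ET.register_namespace("soap", SOAP_NS)
ET.register_namespace("wsse", WSSE_NS)


def build_secured_envelope(username: str, password: str) -> str:
    """Return an envelope whose Header carries a WS-Security UsernameToken."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")

    # Authentication data lives in the header, separate from the payload.
    security = ET.SubElement(header, f"{{{WSSE_NS}}}Security")
    token = ET.SubElement(security, f"{{{WSSE_NS}}}UsernameToken")
    ET.SubElement(token, f"{{{WSSE_NS}}}Username").text = username
    ET.SubElement(token, f"{{{WSSE_NS}}}Password").text = password  # illustration only

    ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")  # request payload would go here
    return ET.tostring(envelope, encoding="unicode")


print(build_secured_envelope("alice", "placeholder-password"))
```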
SOAP stateful and stateless messaging
The beginning of the 21st century is remembered for the internet boom. Thousands of internet-driven companies were emerging and millions of users were accessing the web every day. Now imagine that a single server starts receiving thousands of requests from users (clients) simultaneously. And if this resource does something more complex than show walls of text, things can get slow. For instance, if users check the upcoming flights schedule and must drill down to each flight detail, the server must be aware of what’s happening with the client, right?
It appears that you can handle this situation in two ways: using stateful and stateless operations. And SOAP supports both.
Stateful means that the server keeps the information that it receives from a client across multiple requests. For instance, first it memorizes the flight dates that you’re looking for and then provides information on the pricing after the second request. This allows you to chain messages together, making the server aware of the previous requests. Stateful messaging may be crucial in operations involving multiple parties and complex transactions, e.g. bank operations or flight booking. But still, it’s really heavy on the server.
Stateless communication means that each message contains enough information about the state of the client so that a server doesn’t have to be bothered with it. Once the server returns requested data, it forgets about the client. Each request is isolated from the previous one. Stateless operations help reduce server load and increase the speed of communication.
Stateful operations are one of the reasons SOAP is used for bank transactions and other data exchange that requires chaining messages. More on SOAP use cases below.
When building a SOAP API, developers can integrate Successful/Retry logic. To put it simply, if something goes wrong, the requesting party gets an XML message with an error code and its explanation. So a client-side developer understands the reason behind the failure and can tweak the request to get a successful response. This feature adds some confidence to the development process since you don’t have to search for the problem manually. The SOAP specification defines a standard format for these error responses.
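The difference between the two styles can be sketched without any networking at all. In the toy example below, the class, the function, and the field names are all hypothetical. The stateful server remembers the date from an earlier request, while the stateless handler expects every request to carry everything it needs.

```python
class StatefulFlightServer:
    """Keeps per-client state between requests."""

    def __init__(self):
        self.sessions = {}  # client_id -> data remembered from earlier requests

    def search(self, client_id: str, date: str) -> str:
        self.sessions[client_id] = {"date": date}
        return f"flights on {date}"

    def price(self, client_id: str, flight: str) -> str:
        # Works only if the same client called search() before.
        date = self.sessions[client_id]["date"]
        return f"price for {flight} on {date}"


def stateless_price(request: dict) -> str:
    """Each request carries the full context; the server remembers nothing."""
    return f"price for {request['flight']} on {request['date']}"


server = StatefulFlightServer()
server.search("client-1", "2024-05-01")
print(server.price("client-1", "LH400"))
print(stateless_price({"flight": "LH400", "date": "2024-05-01"}))
```

Both calls produce the same answer. What differs is who has to hold the date between requests: the server’s memory in the first case, the message itself in the second.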
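On the client side, reading such an error comes down to looking for a Fault element inside the Body. The sketch below assumes a SOAP 1.1 response, where the fault carries faultcode and faultstring children. The response text itself is invented for the example.

```python
import xml.etree.ElementTree as ET

SOAP_NS = {"soap": "http://schemas.xmlsoap.org/soap/envelope/"}

# Hypothetical SOAP 1.1 error response.
FAULT_RESPONSE = """\
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Client</faultcode>
      <faultstring>Unknown flight number</faultstring>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>
"""


def extract_fault(response_text: str):
    """Return (code, explanation) if the response is a fault, else None."""
    root = ET.fromstring(response_text)
    fault = root.find("soap:Body/soap:Fault", SOAP_NS)
    if fault is None:
        return None
    # In SOAP 1.1 these child elements are not namespace-qualified.
    return fault.findtext("faultcode"), fault.findtext("faultstring")


print(extract_fault(FAULT_RESPONSE))
```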
SOAP is versatile, powerful, and very standardized. But the thing is, sometimes you don’t need the interface to be that rich. And SOAP has several disadvantages that easily tip the scale in favor of REST for the majority of engineers and their organizations.
Some SOAP disadvantages to consider
Resource-consuming. Due to the larger size of an XML file and the payload created by the massive structure of a message, a SOAP API requires more bandwidth. Sometimes, this trade-off isn’t worth dealing with. Simply put, it’s slow to process these strings of tags that XML messages abound with.
Steep learning curve. Because SOAP is protocol-based, building SOAP API servers requires knowledge and understanding of all the protocols you may use with it. Developers building these types of APIs have to dive deep into the processes inside the protocol and its highly restrictive rules.
Lacks flexibility. We’ve mentioned that a WSDL document serves as a strict contract between a client and a server. Because of this rigid schema, adding or removing message properties requires additional effort on both sides of the communication, the server and the client. It makes updating requests and responses a tedious process and slows down adoption.
Getting started with SOAP: key sources
If you’re just embarking on SOAP engineering, here are the main links you should check:
SOAP Documentation – the key source of truth for those beginning work with SOAP
SOAP versions – as there were multiple iterations of the protocol, check these versions of SOAP
WSDL – how to use Web Services Description Language and create WSDL documents
WS-Addressing – how to add routing information to SOAP headers
WS-ReliableMessaging – the extension to make sure that the messages arrive at their destinations. It also helps with making chains of messages
WS-Coordination – coordinating actions of distributed applications
WS-Security – how to enable message-level protection
WS-Atomic Transaction – how to make messages ACID-compliant
How SOAP compares to REST
When describing SOAP, we must mention its main alternative – REST.
REST or representational state transfer is an architectural style, rather than a protocol. What this means is that REST provides much more flexibility in terms of how you structure your message, which format you use, and how the client and the server scale. SOAP, on the other hand, requires tight coupling between client and server. If either side changes something, things go wrong, hence its protocol nature.
REST was introduced in 2000 – it’s not much younger than SOAP – with an idea of making servers care less about what’s happening on the client.
And here’s where one of the main differences between REST and SOAP begins.
As most engineers will tell you, SOAP and REST can’t be directly compared, but since both approaches deal with solving a similar set of problems, here’s a short breakdown:
Stateful and stateless operations. REST is designed to be stateless; SOAP supports both approaches.
Message structure. While the SOAP message is an “envelope,” the REST message is a “postcard”: It has no extra wrappings or headers, or anything else that would alter its lightweight nature.
Logic exposure. In contrast to SOAP that keeps its logic in the WSDL document, REST has its alternative – a WADL document (or Web Application Description Language doc). It’s not as common as WSDL, but sometimes it’s useful if you operate in a corporate environment and you can’t easily contact people from the service side, requiring you to have some formal conventions at hand.
Transfer protocols. SOAP is flexible in terms of transfer protocols to accommodate multiple scenarios. REST is solely focused on HTTP/HTTPS exchange. There may be some exceptions if you map HTTP methods of exchange (GET, POST, PUT, DELETE, etc.) to, say, FTP methods. But REST was designed with HTTP in mind.
Caching. Caching means storing some information on the client-side to avoid additional load on the server. For instance, you may cache non-dynamic content like images to load them faster on the client-side and avoid requesting a server to do it every time you visit a resource. REST allows you to cache data on the HTTP level. If you want to implement SOAP-caching, you have to configure an additional cache module. Generally, REST is more cache-friendly.
Message size. Plain JSON lacks the overhead text and tags of the bulky XML used in SOAP, which results in a substantial size reduction. That is to say, a RESTful API’s modest JSON payload is easier and faster to process and transfer.
Learning curve. RESTful architecture is straightforward and simple to attain. SOAP requires a much deeper understanding of standards and additional WS protocols. On top of that, the engineering community that deals with REST is larger. So, you may expect to find answers to problems much faster.
Error-handling. The SOAP specification allows for returning an XML message with an error code and its explanation, while a REST API leaves less room for transparency. REST mainly provides two options. By default, the answer may contain only the error code without any explanation. Alternatively, developers can manually define an error object that is returned along with the code.
Security. A REST API typically relies on HTTPS, i.e. HTTP secured with SSL/TLS, so encryption happens at the transport level only. The HTTPS connection shields data in transit and verifies the server that REST API calls are sent to. With SOAP you can also use SSL/TLS, including for TCP messaging, and add message-level security on top of it.
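For contrast with the SOAP envelope, here is a rough sketch of what the REST side of this breakdown may look like: a bare GET request with no wrapper, and a manually defined JSON error object. The URL and the error fields are hypothetical, since REST doesn’t prescribe them. The request is prepared but not sent.

```python
import json
import urllib.request

# Hypothetical REST endpoint: the resource is identified by the URL itself,
# so the request needs no body and no envelope.
request = urllib.request.Request(
    "https://example.com/api/flights/LH400/status",
    headers={"Accept": "application/json"},
    method="GET",
)

# Hypothetical error payload that a service might return with a 404 status.
# The shape of this object is entirely up to the API designer.
error_body = '{"error": {"code": "FLIGHT_NOT_FOUND", "message": "Unknown flight number"}}'
error = json.loads(error_body)["error"]

print(request.get_method(), request.full_url)
print(error["code"], "-", error["message"])
```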
SOAP Use Cases
Considering these differences, it becomes clear why web messaging is mostly done with REST. To wrap things up, let’s define the cases when SOAP is still the major technology.
Highly standardized operations: billing, navigation, facilities. All use cases where you have to eliminate any kind of misinterpretation are a good fit for SOAP communication. Usually, these systems have strict contracts with clearly defined logic that can be described with a WSDL document.
Bank transactions and payment systems. When you need your transactions to always be reliable and inaccessible to third parties, SOAP has multiple benefits to consider. First, there’s the level of security and reliability that comes with WS-Security and ACID compliance. Additionally, this set of use cases usually requires stateful messaging, i.e. using chained transactions that aren’t isolated from one another. Since payment systems may have multiple parties involved in a single operation, SOAP allows for better coordination of their behavior.
Flight booking systems. Since flight booking usually involves multiple parties, some providers from this industry still rely on SOAP to handle stateful and chained messaging.
Non-HTTP messaging and legacy environments. If the server requirements and existing systems leverage communication protocols besides HTTP, SOAP is the first option to look at.
This article is part of a series that covers various approaches to digital communication systems and standards.