Ockam and Redpanda partner to launch the world's first zero-trust streaming data platform.

Matthew Gregory CEO
Published 2024-05-30

Ockam teamed up with Redpanda to launch Redpanda Connect with Ockam: the first zero-trust streaming data platform. This is a natural partnership because both companies share the same ethos: to enable every developer to build distributed systems, at scale, with simple tools. Both companies' products are based on popular open source projects and are built by veteran, high-performing teams.

Redpanda Connect with Ockam is the only enterprise-scale, zero-trust, streaming data platform that's simultaneously easy to deploy, secure-by-design, and highly performant.

“Enterprise executives tell me that their high value streaming applications take a long time to build and to ship,” said Matthew Gregory, Ockam's CEO. “Their biggest obstacle is the construction of secure streaming data connections across their siloed enterprise, and between partner companies. Secure data streaming is now simple to set up, trivial to maintain, and almost impossible to mess up. We've unlocked data so application development can move fast.”

Enterprise Kafka product vendors defer zero-trust implementation details to their customers. It's difficult, even for sophisticated engineering organizations, to pull together a team that can choose and assemble the dozens of parts required to build a secure-by-design streaming data platform. Redpanda Connect with Ockam empowers a single developer to easily create end-to-end encrypted streaming pipelines that can connect hundreds of applications across clouds and hybrid infrastructure. It builds trust in data streams, eliminates sources of risk to critical data, and removes the operational complexities inherent in managing keys at scale.

Redpanda provides a declarative data streaming service that enables developers and organizations to build simple, chained, and stateless processing steps for their data pipelines. Our partnership with Redpanda brings a truly zero-trust capability to these pipelines. Ockam's cryptographic protocols and data integrity guarantees remove the risk of man-in-the-middle (MITM) and supply chain attacks that could otherwise tamper with data in flight and lead to data poisoning, which is highlighted as one of the OWASP Top 10 risks for companies that leverage AI and Large Language Model (LLM) systems. The mutual authentication and encrypted data protocols that Ockam provides also mean organizations can transmit highly sensitive data with cryptographic guarantees that only the intended recipients will be able to decrypt and process it. This will unlock new high-value applications that businesses may have been unwilling to explore due to privacy and governance risk, or complexity.

“We help the world's largest companies build and scale streaming applications,” said Alex Gallego, Redpanda's CEO. “Redpanda Connect with Ockam was conceived to empower our customers to build streaming data applications with ease. Redpanda Connect comes with over 200 pre-built integrations that will quickly connect to and unlock your data silos. Ockam provides the mutual authentication and end-to-end encryption that zero-trust streaming data applications require. Together with Ockam, we've abstracted away all of the complex pieces that cause engineering teams to stumble.”

Confluent's “2023 Data Streaming Report: Moving Up the Maturity Curve” recently surveyed 2,250 IT and engineering leaders from organizations that use streaming data.

  • 74% reported that their streaming data applications are blocked because of fragmented projects, uncoordinated teams, and constrained engineering budgets.
  • 32% claim that they successfully “reduced security risks” after implementing data streaming services across their applications. However, 94% have high security expectations for these same applications.

These are alarming numbers! Despite the obvious benefits that a real-time streaming data architecture provides, companies struggle to ship data pipelines to their applications. Often, they're blocked for several quarters. Once they get to production, they realize that their data doesn't have the security guarantees that they need or expect from their streaming data infrastructure.

Redpanda Connect with Ockam aims to move these numbers to 0% and 100%, respectively. Let's go!

Connect, secure, and stream—all in one simple platform

[Figure: Redpanda Connect with Ockam example flow]

Connect with Redpanda Connect (formerly known as Benthos)

Redpanda Connect empowers engineers to build connectors and integrations to over 200 applications, including SQL, Mongo, Snowflake, Cassandra, Redis, Splunk, AWS Lambda, Azure CosmosDB, and GCP Cloud Storage. Redpanda Connect can process streaming data with transforms, mapping, filtering, hydration, and enrichment capabilities.

Secure with Ockam

Zero-trust in your data streaming infrastructure allows you to have absolute trust in your data and applications. Ockam adds the guarantees of data confidentiality, integrity, and authenticity, so you can enforce data governance and privacy policies at scale.

Each producer creates its own unique cryptographically provable identity and encryption keys. The producers use their keys to establish trusted secure channels through Redpanda Cloud and to the consumers across the streaming platform. All the data that moves through your streaming data platform is encrypted in motion, even when it's passing through a broker. Keys, enrollments, and credentials are safely created, stored, rotated, and revoked automagically so there's almost nothing to manage.

Ockam makes it simple to Build Trust across your entire data layer, in a way that's almost impossible to mess up.

Stream with Redpanda Cloud

Redpanda Cloud is a complete streaming data platform delivered as a fully managed service with automated upgrades and patching, data and partition balancing, and 24×7 support. It continuously monitors and maintains your clusters along with the underlying infrastructure to meet strict performance, availability, reliability, and security requirements.

Give it a try

The example below shows you how to use Ockam to connect any input source (e.g., MongoDB, PostgreSQL, or a Kafka stream) to any output (e.g., Snowflake, S3, or Splunk), automatically encrypting all your data in motion. You can run this complete example locally in less than 2 minutes by copying the relevant sections below. To understand each configuration and command in more detail, and how to extend this for other uses, read on.

First, copy the code block below and save it to a file named consumer.yaml:


# consumer.yaml

input:
  ockam_kafka:
    kafka:
      seed_brokers: [rp-node-0:9092]
      topics: [topic_A]
      consumer_group: example_group
    allow_producer: producer
    relay: consumer_relay
    enrollment_ticket: ${OCKAM_ENROLLMENT_TICKET}

pipeline:
  processors:
    - bloblang: |
        root = this
        root.data.message = this.data.message.uppercase()

output:
  stdout: {}
  # snowflake_put:
  #   account: acme
  #   ...

Next, save the code block below to a file named producer.yaml:


# producer.yaml

input:
  generate:
    count: 1000
    interval: "@every 1s"
    mapping: |
      root = {
        "_producer": hostname(),
        "data": { "email": fake("email"), "message": fake("sentence") }
      }
  # mongodb:
  #   url: mongodb://localhost
  #   database: orders
  #   ...

output:
  ockam_kafka:
    kafka:
      seed_brokers: [rp-node-0:9092]
      topic: topic_A
    route_to_consumer: /project/default/service/forward_to_consumer_relay/secure/api
    allow_consumer: consumer
    enrollment_ticket: ${OCKAM_ENROLLMENT_TICKET}

All that's left is to run the commands below to start the services, and data will start flowing. These commands use Homebrew to install tools and Docker to run containers, so before you begin, please set up Homebrew and Docker on your machine.
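Before starting, it can help to confirm that the required tools are actually on your PATH. The following is a minimal sketch, not part of the Redpanda or Ockam tooling; the function name missing_tools is a hypothetical helper introduced here for illustration.

```shell
# Hypothetical preflight helper (not part of Redpanda or Ockam tooling):
# print the name of every command in the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

Running missing_tools brew docker prints nothing when both tools are installed, and otherwise prints the name of each tool that still needs to be set up.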


# Setup Redpanda
brew install redpanda-data/tap/redpanda
rpk container start

# Setup Ockam
brew install build-trust/ockam/ockam
ockam enroll

# Setup Consumer
ockam project ticket --usage-count 1 --expires-in 10m \
  --attribute consumer --relay consumer_relay > ticket

docker run --rm --network redpanda --name consumer \
  -v "$(pwd)/consumer.yaml:/connect.yaml" \
  -e OCKAM_ENROLLMENT_TICKET="$(cat ticket)" docker.redpanda.com/redpandadata/connect

# The previous command will block and print the output so you can see
# the messages being consumed by the consumer.
#
# Run the following commands in a new terminal window.

# Setup Producers
for i in $(seq 1 3); do
  ockam project ticket --usage-count 1 --expires-in 10m \
    --attribute producer > ticket

  docker run --rm -d --network redpanda --name "producer$i" \
    -v "$(pwd)/producer.yaml:/connect.yaml" \
    -e OCKAM_ENROLLMENT_TICKET="$(cat ./ticket)" docker.redpanda.com/redpandadata/connect
done

# It can take a minute or so for the messages to start flowing.
# Observe that the above consumer can decrypt and transform messages.

# However, if you look at the messages inside redpanda console at
# localhost:8080 or using the below rpk command, you'll notice that
# the messages are encrypted.
rpk topic consume topic_A

# Cleanup
docker stop $(docker ps -q --filter "name=producer" --filter "name=consumer")
rpk container purge

Detailed explanation of each step

Walking through the commands one-by-one will explain what each step achieves. Starting with getting Redpanda running locally:


brew install redpanda-data/tap/redpanda
rpk container start

These two commands install Redpanda and start a container. Now there is a broker running in a container that we will use to stream and process data with Redpanda Connect.


brew install build-trust/ockam/ockam
ockam enroll

The brew command installs Ockam Command in your local environment. ockam enroll opens a browser window and guides you through signing up for, or signing in to, Ockam Orchestrator. Ockam Orchestrator lets you securely manage the enrollment of thousands of processing nodes, or more, without manually pushing identities, keys, or credentials to each node. This initial enrollment establishes your local machine as an Administrator within the project that has been set up for you in Ockam Orchestrator.


ockam project ticket --usage-count 1 --expires-in 10m \
  --attribute consumer --relay consumer_relay > ticket

The ockam project ticket command is how you generate a ticket that will admit another node into your project. The arguments here specify that:

  • The ticket can be used only once. This ensures that the ticket serves only its intended purpose; if it's inadvertently leaked after use, an attacker cannot re-use it to gain access to your project.
  • The ticket has a time-to-live (TTL), or expiry period, of 10 minutes. This also reduces the risk posed by a leaked ticket; you can adjust it to any time window that's appropriate for your use case.
  • The ticket carries an attribute of consumer. This adds the attribute to any node that enrolls with this ticket, and policy definitions can then use those attributes to apply Attribute Based Access Controls and restrict which nodes can communicate with each other.
  • Any node that enrolls with this ticket may set up a relay named consumer_relay. This provides a named route for other nodes to find this node, even if the nodes are on different networks.

The output of that command, the enrollment ticket, is then redirected out to a file named ticket.
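Because the ticket is written through a shell redirect, a failed ockam project ticket call still leaves a file behind, only an empty one, and the container would then start with an empty OCKAM_ENROLLMENT_TICKET. A small guard along these lines can catch that early. This is a sketch; ticket_is_usable is a hypothetical helper name, not an Ockam command.

```shell
# Hypothetical helper (not an Ockam command): succeed only when the
# ticket file exists and is not empty.
ticket_is_usable() {
  [ -s "$1" ]
}

# Usage, before `docker run`:
#   ticket_is_usable ticket || echo "ticket is missing or empty" >&2
```

The check only confirms that something was written to the file. It does not validate the ticket itself, which still happens when the node enrolls.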


docker run --rm --network redpanda --name consumer \
  -v "$(pwd)/consumer.yaml:/connect.yaml" \
  -e OCKAM_ENROLLMENT_TICKET="$(cat ticket)" docker.redpanda.com/redpandadata/connect

This is where you start your consumer with Redpanda Connect. It uses Docker to start a new container named consumer, on an isolated network named redpanda where the broker is available. The consumer.yaml file is mounted into the container, and the contents of the ticket file are passed in as the OCKAM_ENROLLMENT_TICKET environment variable.

The consumer.yaml file defines how to receive the data:


input:
  ockam_kafka:
    kafka:
      seed_brokers: [rp-node-0:9092]
      topics: [topic_A]
      consumer_group: example_group
    allow_producer: producer
    relay: consumer_relay
    enrollment_ticket: ${OCKAM_ENROLLMENT_TICKET}

This input block shows the standard Kafka configuration options under the kafka element; these specify the broker, topics, and consumer group to use. The following three lines are the Ockam-specific additions that allow this input source to receive and process end-to-end encrypted data streams:

  • allow_producer: the attribute that the other node must hold before the two nodes can establish a trusted channel with each other.
  • relay: the name of the relay to connect this node to.
  • enrollment_ticket: the enrollment ticket to use to admit this node into the Ockam Orchestrator project.

pipeline:
  processors:
    - bloblang: |
        root = this
        root.data.message = this.data.message.uppercase()

The pipeline and processors block applies a basic transformation to the received data, converting the message field to upper case. While not a particularly useful transformation in production, it lets you independently verify that the consumer was able to decrypt and process the data.
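If you want to check a consumed message by script rather than by eye, a plain-shell test like the one below can help. It is only a sketch: it is not Bloblang, it works on a message string you have already extracted from the consumer output, and the name message_is_uppercase is a hypothetical helper. Note that tr follows your locale's character classes, which may differ from Bloblang's handling of non-ASCII text.

```shell
# Hypothetical check (plain shell, not Bloblang): succeed when the text
# contains no lower-case letters, i.e. upper-casing it changes nothing.
message_is_uppercase() {
  local text="$1"
  [ "$text" = "$(printf '%s' "$text" | tr '[:lower:]' '[:upper:]')" ]
}
```

A message that passed through the consumer's pipeline should satisfy this check, while the original sentence generated by a producer normally would not.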


output:
  stdout: {}
  # snowflake_put:
  #   account: acme
  #   ...

The last part of the consumer.yaml configuration is the output block. This example implements a simple stdout output. As the commented-out Snowflake block highlights, other destinations can receive your final processed data stream by providing their configuration settings.

The code snippet for starting data producers shows how Ockam makes enrolling any number of nodes painless:


for i in $(seq 1 3); do
  ockam project ticket --usage-count 1 --expires-in 10m \
    --attribute producer > ticket

  docker run --rm -d --network redpanda --name "producer$i" \
    -v "$(pwd)/producer.yaml:/connect.yaml" \
    -e OCKAM_ENROLLMENT_TICKET="$(cat ./ticket)" docker.redpanda.com/redpandadata/connect
done

The for i in $(seq 1 3); do line starts a loop that creates 3 separate producers. Each producer will generate a unique identity, enroll into the Ockam Orchestrator project, and establish a secure channel with the consumer node. If you feel ambitious you can increase that 3 to a higher number. Just be aware that the script runs a new Docker container for each producer, so you could quickly exhaust your local system resources.

The ockam project ticket command generates an enrollment ticket, just as it did in the earlier section that explained the setup of the consumer. The primary difference with this ticket is that the attribute is set to producer and there is no need to set up a relay for a producer.

The docker run command runs the same image as before, on the same network, giving each container a unique name with a prefix of producer, and passing in the producer.yaml file and the ticket for each node.
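If you turn the producer count into a variable, a small guard can stop an accidental typo from launching far more containers than your machine can handle. The sketch below is illustrative only: valid_producer_count is a hypothetical helper, and the default cap of 10 is an arbitrary assumption, not a limit imposed by Ockam or Redpanda.

```shell
# Hypothetical guard: accept only a whole number between 1 and a cap
# (default 10, an arbitrary choice for a laptop-sized demo).
valid_producer_count() {
  local count="$1" max="${2:-10}"
  case "$count" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$count" -ge 1 ] && [ "$count" -le "$max" ]
}
```

For example, valid_producer_count 3 succeeds, while valid_producer_count 50 fails unless you raise the cap with a second argument such as valid_producer_count 50 100.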

The producer.yaml configuration file has two blocks of importance:


input:
  generate:
    count: 1000
    interval: "@every 1s"
    mapping: |
      root = {
        "_producer": hostname(),
        "data": { "email": fake("email"), "message": fake("sentence") }
      }
  # mongodb:
  #   url: mongodb://localhost
  #   database: orders
  #   ...

The input block here generates random data, which is useful for a self-contained example but is something you would replace in a real-world use case. Every second, a new message is sent to the output. It includes the hostname (allowing you to verify that there are 3 separate nodes producing data) and fake data for the email and message fields in the payload. As explained earlier, the consumer later converts the message field to upper case before writing it to stdout.

For a real-world use case, the commented-out block gives an example of how to use a Mongo database as the input source for a producer.


output:
  ockam_kafka:
    kafka:
      seed_brokers: [rp-node-0:9092]
      topic: topic_A
    route_to_consumer: /project/default/service/forward_to_consumer_relay/secure/api
    allow_consumer: consumer
    enrollment_ticket: ${OCKAM_ENROLLMENT_TICKET}

The output block in producer.yaml contains the standard Kafka configuration, along with three Ockam-specific options:

  • route_to_consumer: the full route to the relay that was set up for the consumer.
  • allow_consumer: the attribute that the other node must hold before the two nodes can establish a trusted channel with each other.
  • enrollment_ticket: the enrollment ticket to use to admit this node into the Ockam Orchestrator project.
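The route_to_consumer value in this example embeds the relay name chosen for the consumer (consumer_relay). If you rename the relay, the route needs to change with it. The sketch below builds the route from a relay name, assuming the pattern visible in this single example holds in general and that the project is named default; route_for_relay is a hypothetical helper, so check the Ockam documentation before relying on it for other setups.

```shell
# Hypothetical helper: build the route for a relay name, following the
# pattern visible in the example (project name "default" assumed).
route_for_relay() {
  local relay="$1" project="${2:-default}"
  printf '/project/%s/service/forward_to_%s/secure/api\n' "$project" "$relay"
}
```

Calling route_for_relay consumer_relay reproduces the route used in producer.yaml above.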

Now everything is set up and running. You'll see in the log output that the consumer is receiving data from different hosts, and that the message in each payload is in upper case. If you open the Redpanda Console or run rpk topic consume topic_A, you'll see that the messages in topic_A are encrypted and not readable by the broker or by anybody else who might have access to the topic.
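One rough way to confirm this from a script is to look for a plaintext field name in the records read from the topic. The sketch below assumes you have saved some output of rpk topic consume topic_A to a file or are piping text into the function; shows_plaintext_field is a hypothetical helper, and the check is a heuristic based on the _producer field name from producer.yaml, not a proof of encryption.

```shell
# Hypothetical check: read records on stdin and succeed if any line still
# shows the plaintext "_producer" field name from producer.yaml.
shows_plaintext_field() {
  grep -q '_producer'
}
```

With the setup described here, the records stored in the topic should not trigger this check, while the decrypted output printed by the consumer should.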

By adding or changing a few lines in the consumer.yaml and producer.yaml files it's possible to:

  • Use an existing self-hosted Redpanda broker or Redpanda Cloud instance.
  • Add more inputs to producer.yaml and/or outputs to consumer.yaml to connect, secure, and stream data between any of the hundreds of different integrations supported by Redpanda Connect.
