Table of Contents
- Advanced Properties for Kafka
- Kerberos Configuration for Kafka Example
- Troubleshooting Using A Java Program
- Kafka Versions
- Authentication and Authorization Are Different Things
This article is a quick write-up to show how to configure the Kafka Consumer Adapter for Spotfire Streaming.
The instructions also apply to Spotfire Streaming, Spotfire Live Datamart, and Spotfire Data Streams, as all these products use the same Kafka adapters. In this article, I'll use the term Spotfire Streaming as the name of the product. As long as the underlying technology is based on Spotfire Streaming (as all of these are), the product name doesn't matter in terms of how we actually use the product.
These instructions also apply to the other Kafka-related adapters for these products, specifically the Kafka Consumer Commit Adapter, Kafka Producer Adapter, and Kafka Admin Adapter.
Advanced Properties for Kafka
The heart of configuring the Kafka adapters to use Kerberos authentication is the adapter's Advanced Options > Advanced Properties grid: that is where the Kerberos-related settings are supplied.
Another article on Spotfire Community, How To Connect to Microsoft Azure Event Hubs using the TIBCO StreamBase® Input Adapter for Apache Kafka Consumer, shows how to configure the Kafka Consumer Adapter for a remote, secured Kafka broker. It is worth reading, skimming over the Azure-specific material in the likely event your Kerberized Kafka broker is not on Azure. In particular, the Part 2: Configure the EventFlow modules for Azure Event Hubs section shows a little about how to externalize environment-sensitive configuration information as Advanced Properties to the adapter. In addition, the Studio project available on that page would be a good starting point to Kerberize your Kafka authentication: it's a matter of editing the .conf files and the adapter properties to suit your own infrastructure, even though that article about the Azure Event Hubs Kafka interface doesn't use Kerberos at all. Connecting to any Kafka environment raises similar issues. You don't have to externalize your parameters using a .conf file this way just to establish connectivity, but once you realize that you will eventually be moving your application from dev to qa to sit to test to prod environments, it's going to seem like a really good idea.
For those coming at this task from the angle of having generated a LiveView fragment to subscribe to a Kafka topic with the Streaming Studio Connectivity Wizard as described, for example, in How to Create a Data Stream from Apache Kafka® - SFDS 103, you'll be locating the Kafka Consumer Adapter instance in the generated MyProject_liveviewfragment project and changing some properties there, and perhaps adding an engine configuration file to that project as well. Look in MyProject_liveviewfragment > src/main/eventflow > com.tibco.tibco-streaming.kafkacwexample > myTopicProvider.sbapp for the ReadFromBus component, then bring up its Properties view and focus on the Advanced Options tab and scroll down to the Advanced Properties grid. You'll be entering things there for the Kerberos configuration.
Kerberos Configuration for Kafka Example
In the Kerberized Kafka world, there's no absolute standard way of setting up the connection properties to authenticate via Kerberos. People do things differently in different places. There is, however, a sort of canonical or exemplar way to specify the Kerberos-related properties for anything based on the Kafka Java API, so I'll show that way here. To summarize a whole lot of knowledge we don't have time to talk about here, you'd start with something like this:
sasl.jaas.config: com.sun.security.auth.module.Krb5LoginModule required useTicketCache=true doNotPrompt=true renewTicket=true serviceName=kafka useKeyTab=true keyTab="/path/to/keytab/file" principal="serviceAccount@FQDN";
security.protocol: SASL_PLAINTEXT
sasl.mechanism: GSSAPI
sasl.kerberos.kinit.cmd: /path/to/kinit/in/jdk/bin
sasl.kerberos.service.name: kafka (could be something else)
with each line of the snippet above being a separate Advanced Property for the adapter instance.
We're still figuring this all out, but sometimes we need to use single quotes instead of double quotes for the keytab and principal parts of the sasl.jaas.config property. You may have to experiment. Similarly, on Windows, you may have to use double backslashes instead of forward slashes in path names, and it might be different for the keytab than it is for the kinit.cmd, since different code interprets those paths. Apologies, but there are a lot of layers of abstraction here and not all of them are entirely within our control, at least if we're going to expose the generic Kafka property way of doing things, which works for lots of different kinds of connectivity environments.
We've seen this canonical way work for some people, but it depends on how your organization has set up Kerberos (and Kafka) in your environment. The properties needed can vary a lot; we've seen that, too. We've also seen some environments that need Java system properties or environment variables set in order to connect successfully to Kafka, which is all quite doable, but there are a lot of possible combinations. The point here is: you have to know what properties and files your environment needs for any Java client to connect successfully to that Kerberized Kafka broker. The important thing to remember is that Spotfire Streaming's Kafka adapters all use the plain vanilla Apache Kafka core client APIs for Java to do what they do, so we have to tell Spotfire Streaming all the same things we'd have to tell any Java program.
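To make the "same things we'd tell any Java program" point concrete, here is a minimal sketch of the example properties assembled as a java.util.Properties object, which is the form the Apache Kafka Java client constructors accept. The class name KerberosKafkaProps and its build method are hypothetical names for illustration, the keytab path and principal are the same placeholders used in the snippet above, and the sasl.kerberos.kinit.cmd property is left out for brevity. Whether your environment needs exactly this set of properties is an assumption you would have to verify.

```java
import java.util.Properties;

// Hypothetical helper: assembles the example Kerberos properties for a Kafka Java client.
public class KerberosKafkaProps {

    // keytabPath, principal and serviceName are placeholders; substitute your own values.
    static Properties build(String keytabPath, String principal, String serviceName) {
        Properties props = new Properties();
        props.setProperty("security.protocol", "SASL_PLAINTEXT");
        props.setProperty("sasl.mechanism", "GSSAPI");
        props.setProperty("sasl.kerberos.service.name", serviceName);
        // Mirrors the sasl.jaas.config value from the example, including the trailing semicolon.
        props.setProperty("sasl.jaas.config",
                "com.sun.security.auth.module.Krb5LoginModule required"
                + " useTicketCache=true doNotPrompt=true renewTicket=true"
                + " serviceName=" + serviceName
                + " useKeyTab=true"
                + " keyTab=\"" + keytabPath + "\""
                + " principal=\"" + principal + "\";");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("/path/to/keytab/file", "serviceAccount@FQDN", "kafka");
        // Print one property per line, the same shape as the adapter's Advanced Properties grid.
        props.stringPropertyNames().stream().sorted()
             .forEach(key -> System.out.println(key + ": " + props.getProperty(key)));
    }
}
```

In a standalone test program you would add the usual bootstrap.servers, group.id, and deserializer settings and pass the resulting Properties to a KafkaConsumer; in the adapter, each key and value becomes one row in the Advanced Properties grid.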
Troubleshooting Using A Java Program
If you already have a working Java program that successfully authenticates to your broker, by all means, mimic what that's doing. On the Azure Event Hubs Kafka wiki page I've linked to above, the example project has some Kafka Java programs you can experiment with if you don't already have something suitable at your disposal. It can be a very useful tactic to first debug and establish connectivity with a little standalone Java program running on the same machine where you're going to be running Spotfire Streaming, and then transfer the working property settings to the StreamBase .sbapp. This tactic is helpful for several reasons. You don't have to explain what Spotfire Streaming is to your Kafka admins, who probably already know how to talk to Java developers. It might be easier to see debug output or stack traces from the little Java program than it is when you stand up a Spotfire Streaming cluster under Studio control. And it's probably faster to iterate on various configuration options as well. (This advice really goes for any kind of connectivity establishment work that's at all fiddly, not just Kerberized Kafka. It's just basic divide and conquer.)
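In the same divide-and-conquer spirit, one optional step before involving Kafka at all is to check that the keytab and principal can log in to Kerberos from plain Java. The sketch below does that with the JDK's own JAAS classes, so it needs no Kafka libraries. The class name KerberosLoginCheck and its helper methods are hypothetical, and the option set is only the keytab-related subset of the example sasl.jaas.config; your environment may need more.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

// Hypothetical standalone check: can this keytab and principal log in to Kerberos at all?
public class KerberosLoginCheck {

    // Keytab-based Krb5LoginModule options, a subset of the sasl.jaas.config example.
    static Map<String, String> loginOptions(String keytabPath, String principal) {
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");
        options.put("keyTab", keytabPath);
        options.put("principal", principal);
        options.put("doNotPrompt", "true");
        return options;
    }

    // In-memory JAAS configuration, so no external jaas.conf file is needed.
    static Configuration jaasConfig(Map<String, String> options) {
        AppConfigurationEntry entry = new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                options);
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                return new AppConfigurationEntry[] { entry };
            }
        };
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: KerberosLoginCheck <keytab-path> <principal>");
            return;
        }
        // Ask the JDK Kerberos implementation to print its protocol-level debug output.
        System.setProperty("sun.security.krb5.debug", "true");
        try {
            LoginContext login = new LoginContext(
                    "KafkaClient", null, null, jaasConfig(loginOptions(args[0], args[1])));
            login.login();
            System.out.println("Kerberos login succeeded: " + login.getSubject().getPrincipals());
            login.logout();
        } catch (LoginException e) {
            System.err.println("Kerberos login failed: " + e.getMessage());
        }
    }
}
```

If this login fails, the problem is in the Kerberos setup (keytab, principal, krb5 configuration) rather than in any Kafka property; if it succeeds, move on to a standalone Kafka client with the full property set.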
Kafka Versions
Another issue that comes up when trying to establish connections to any arbitrary Kafka broker out there in the world is Kafka versions. Spotfire Streaming's Kafka adapters reference a specific version of the Kafka Java client, and do so via Maven. The version of the Kafka client library is typically updated from time to time with new releases of Spotfire Streaming. We document the version on our Supported Configurations documentation page; just look for Apache Kafka components. However, it's possible that page is out of sync with what's actually shipped (as it is in TIBCO Streaming 10.5.0, sorry!). The prudent way to find out what Kafka client version we're using is to check the Maven artifacts themselves, once you've at least loaded the product's Kafka sample into Studio so that the Spotfire Streaming Maven artifacts are populated into your local Maven repository. However you like to browse your Maven repository, go look for the artifact with groupId com.tibco.ep.sb.adapter and artifactId kafka. The artifact version will match the version of Spotfire Streaming, such as 10.5.0. Look for its dependency on org.apache.kafka kafka-clients and note the version. That's the version of the Kafka client. You need to know that this version of the Kafka client will work with your Kafka broker. You should be fine, in general, if the Kafka broker is 0.10.0 or above, but any given version of the Kafka broker may not support some specific feature of Kafka you need, so it's important information to talk over with your Kafka administrators if there is trouble establishing connectivity. Also, if debugging your connectivity with a standalone Java program as advised above, it's a really, really, really good idea to use the same version of the Kafka client that Spotfire Streaming is using.
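If you'd rather not read the POM by eye, the lookup can be scripted. The sketch below is one possible way to do it using only the JDK's XML parser; the class name KafkaClientVersionFinder is hypothetical. It assumes the adapter's POM declares the kafka-clients dependency with a literal version element. If the version is a property reference or is managed in a parent POM, the result will be that reference or null, and you'd need to follow it manually. With a default Maven setup the POM would typically sit under ~/.m2/repository/com/tibco/ep/sb/adapter/kafka/ in a directory named for the product version, but that location is also an assumption about your installation.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: reads a POM and reports the declared org.apache.kafka:kafka-clients version.
public class KafkaClientVersionFinder {

    // Returns the declared version, or null if the dependency or its version element is absent.
    static String findVersion(InputStream pom) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(pom);
            NodeList dependencies = doc.getElementsByTagName("dependency");
            for (int i = 0; i < dependencies.getLength(); i++) {
                Element dependency = (Element) dependencies.item(i);
                if ("org.apache.kafka".equals(childText(dependency, "groupId"))
                        && "kafka-clients".equals(childText(dependency, "artifactId"))) {
                    return childText(dependency, "version");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse POM", e);
        }
    }

    // Text of the first descendant element with the given tag name, or null if there is none.
    static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: KafkaClientVersionFinder <path-to-adapter-pom>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println("kafka-clients version: " + findVersion(in));
        }
    }
}
```

Whatever version this reports is the one to pin in the build of any standalone Java test program.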
Authentication and Authorization Are Different Things
It's easy to lose sight of this when trying to establish connectivity, but Kerberos is just an authentication mechanism and being able to successfully authenticate your principal to your broker doesn't help you at all when it comes to having authorization to use the underlying resources. For example, if you want to consume Kafka messages from a Kafka topic named myTopic, somewhere someone will have had to grant your principal read access, at least, to that topic. How that authorization happens is another one of those things that is highly variable depending on your Kafka infrastructure. Your Kafka administrators will know, but it's too big a topic to cover here. Some examples might be to use the kafka-acls.sh command line tool, or via Ambari or Ranger.
But the reason I bring this up here is: read your error messages closely. Authentication and authorization failures yield different messages from the Kafka client and broker (and even Spotfire Streaming). You might even be talking to different admin teams for these different kinds of security configuration issues. It's easy to confuse one kind of failure for the other, especially as security error messages are sometimes deliberately obscure.
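When debugging with a standalone Java program, the exception type can help tell the two kinds of failure apart. Recent versions of the Kafka Java client define org.apache.kafka.common.errors.AuthenticationException and org.apache.kafka.common.errors.AuthorizationException as separate base classes; older client versions may surface these failures differently, so treat this as a sketch rather than a guarantee. The class name KafkaSecurityErrorKind is hypothetical, and the classifier compares class names as strings so that it compiles without the Kafka client on the classpath.

```java
// Hypothetical helper: labels a failure as authentication, authorization, or something else.
public class KafkaSecurityErrorKind {

    static final String AUTHENTICATION_CLASS = "org.apache.kafka.common.errors.AuthenticationException";
    static final String AUTHORIZATION_CLASS = "org.apache.kafka.common.errors.AuthorizationException";

    static String classify(Throwable error) {
        // Walk the cause chain (capped at 20 levels to guard against odd cycles).
        int depth = 0;
        for (Throwable t = error; t != null && depth < 20; t = t.getCause(), depth++) {
            // Walk the class hierarchy so subclasses such as topic or group errors match too.
            for (Class<?> c = t.getClass(); c != null; c = c.getSuperclass()) {
                if (AUTHENTICATION_CLASS.equals(c.getName())) {
                    return "AUTHENTICATION";
                }
                if (AUTHORIZATION_CLASS.equals(c.getName())) {
                    return "AUTHORIZATION";
                }
            }
        }
        return "OTHER";
    }

    public static void main(String[] args) {
        // A plain exception with no Kafka security error in its cause chain.
        System.out.println(classify(new RuntimeException("connection refused")));
    }
}
```

In a real test program you would call classify from the catch block around the consumer's poll loop and route the result to the right admin team: authentication problems to whoever runs Kerberos, authorization problems to whoever manages the Kafka ACLs.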