    Connecting Spotfire® to a Kerberized Data Source



    Introduction

    Spotfire® connects to a variety of data sources.  Sometimes those data sources require Kerberos authentication in order to get access to the data.  This article discusses connecting to Kerberized data sources from Spotfire®.  The various use cases are discussed with a focus on the use case in which one is using a service account to connect from Spotfire® Server to a Kerberized data source.  This use case is commonly seen with Hadoop data sources in which the Hadoop cluster is Kerberized by a different Kerberos server than the user environment.

    Spotfire® can connect to data sources that require Kerberos authentication.  Kerberos is an authentication protocol developed by MIT; one can read more details about the protocol at the MIT Kerberos website.  This document will not attempt to explain Kerberos in depth, but its main feature is the exchange of secure tickets between users, the Kerberos Key Distribution Center (KDC), and Kerberos-secured applications.  Strong cryptography is used to encrypt and secure communications between components.

    Kerberos is a very secure protocol, and authentication succeeds only when every piece is configured and set up correctly.  Because all the pieces must be properly configured to exchange tickets, it is easy to miss or misconfigure a component, which can lead to errors and issues.

    This document focuses on a specific use case in which the Spotfire® environment is connecting to a Kerberos-enabled data source via JDBC from Spotfire® Server.  While this document uses a Hive database environment on Hortonworks 2.5 as the Kerberos data source, the same steps on the Spotfire® Server can be adapted to work with other JDBC data sources.


    Figure 1 Spotfire® Connectivity to a Kerberized Hadoop Cluster

    As seen in the diagram, there are two main ways that users can get data from external data sources into Spotfire®:

    1. Via Spotfire® Connectors (pink connection in the diagram) - Spotfire® Connectors are database-specific connectors that connect directly from the client (Analyst, Web Player, Automation Services) to the data source.  The Spotfire® Connectors use native database connectivity to connect to the database.  Most of the Spotfire® Connectors allow one to "Keep data external" or "Import data."  Keeping the data external runs the queries on the database and only returns the minimum amount of data needed for visualization and analysis.
    2. Via Spotfire® Information Services (red connection in the diagram) - Spotfire® Information Services uses JDBC connections from the Spotfire® Server to data sources that have JDBC drivers.  For Spotfire® Information Services, the data connections are from the Spotfire® Servers and, typically, use connection pooling.  The data is streamed through the Spotfire Server to the clients.

    In order to connect to a data source with Kerberos authentication using a Spotfire® Connector, the Spotfire® Connector must support Kerberos as an option.  For Spotfire® Information Services, the JDBC driver for the data source needs to support Kerberos.  The use case in this document focuses on connectivity between Spotfire® Information Services and a Kerberized data source in which the Spotfire® environment is not running Kerberos or the connectivity to the data source is dependent on a service account rather than the current user.

    Use Cases for Kerberized Data Sources

    Spotfire® supports an almost unlimited set of possible configurations.  For the purposes of this document, we have focused on the following major use cases:

    I want to... / Available tools:

    1. Connect to a data source from Spotfire® Analyst and the Web Player using a Spotfire® Connector and pass the current user credentials without the user having to provide them.
       Available tools: The Spotfire® environment must be configured with delegated Kerberos authentication.  This will allow the users' Kerberos tokens to be used on the Web Player nodes.  The Spotfire® Data Connector for the data source must also support Kerberos authentication to the data source.

    2. Connect to a data source using JDBC from Spotfire® Server and pass the current user credentials without the user having to provide them.
       Available tools: The Spotfire® environment must be configured with delegated Kerberos authentication.  This will allow the users' Kerberos tokens to be communicated throughout the environment and used on Spotfire® Server to authenticate with the JDBC data source.

    3. Connect to a data source with Kerberos authentication using JDBC from Spotfire® Server and Information Services and use a service account instead of the current user credentials.
       Available tools: The Spotfire® environment can use any type of authentication since the Kerberos credentials to the data source will be coming from a service account and originate on the Spotfire® Server.

    The use cases assume that the data source supports single sign-on via Kerberos and, for the first use case, that the Spotfire® Data Connector for the data source supports Kerberos connectivity to the data source.

    The first two use cases are covered fairly well in the Spotfire® Server documentation, which describes how to configure Kerberos authentication for the Spotfire® environment.  This document focuses on the last use case, in which service account credentials are used, with only a brief discussion of the first two.

    Connecting to a Kerberized Data Source from a Spotfire® Client

    Users can use the Spotfire® Data Connectors to connect directly to a data source from Spotfire® Analyst, Spotfire® Web Player (Spotfire® Consumer and Spotfire® Business Author), and Spotfire® Automation Services.  Support for Kerberos authentication depends on the Spotfire® Data Connector version and the driver used by the connector.  The Spotfire® Data Connectors System Requirements page lists which authentication methods are supported with which versions for all the Spotfire® Data Connectors.

    Assuming one is using the currently logged-in user, connecting to a Kerberized data source from Spotfire® Analyst with a Spotfire® Connector is usually straightforward.  One just needs to enter the information the connector requires for Kerberos, e.g. the Kerberos realm, the host's fully qualified domain name (FQDN), and the service name.
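
    For example, using the values from the HDP sandbox environment described later in this article, the connector fields would look something like the following (the values are illustrative):

     Kerberos realm:    EXAMPLE.COM
     Host (FQDN):       sandbox.hortonworks.com
     Service name:      hive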

    One can use a service account to connect from the Spotfire® Analyst to a Kerberized Data Source.  This use case is not discussed in detail here.  One would need to acquire a local Kerberos ticket for the service account and then connect to the data source.
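
    If one does want to experiment with this, a minimal sketch is to acquire a ticket for the service account with the Kerberos client tools before opening the connection (this assumes MIT Kerberos client tools are available on the Analyst machine and uses the testuser service account from the examples below):

     REM Acquire a Kerberos ticket for the service account, then verify it
     REM (illustrative sketch; run before connecting from Spotfire® Analyst)
     kinit testuser@EXAMPLE.COM
     klist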

    To connect from the Spotfire® Web Player to a Kerberized data source using the Spotfire® Connectors and the current user identity, the Spotfire® environment has to be configured for Kerberos so that the users' Kerberos credentials can be delegated through the environment to the data source.  The steps to configure the Spotfire® environment for Kerberos authentication are discussed in the Spotfire® Server Installation and Administration Manual - Setting up Kerberos authentication section.

    Connecting to a Kerberized Data Source from Spotfire® Server

    Spotfire® Server connects to data sources via JDBC and the Spotfire® Server Information Services component.  As discussed in the use cases table, the two main use cases in which one connects from Spotfire® Server to Kerberized data sources are when the current user credentials need to be passed to the Kerberized data source and when a service account is used.

    Using current user credentials

    In order to pass the current user credentials through the Spotfire® environment and support single sign-on, the user's Kerberos credentials must be able to be delegated from the user to the JDBC data source.  This requires that the Spotfire® environment be configured to support Kerberos authentication.  As mentioned above, this is discussed in the Spotfire® Server Installation and Administration Manual - Setting up Kerberos authentication section.

    Once the environment supports Kerberos authentication, the data source must be configured to support it as well.  The Data Source Templates section of the Spotfire® Server manual discusses some of the changes required to the data source templates.  Some of the information in the next section will be helpful, but the user credentials will come via the Kerberos configuration in the Spotfire® stack instead of from a service account.

    This section will likely be built out in the future, but for now, the focus is on the next use case.

    Using a service account

    This use case focuses on how to connect to a Hortonworks (HDP) 2.5 Hive database but can be broadened to work with other JDBC data sources.   This information assumes that one is running Spotfire® Server version 10.3 or greater.  The Spotfire Community Wiki article Spotfire® JDBC Data Access Connectivity Details discusses the Spotfire® Data Source Template needed and the specific jar files from Hortonworks for a non-Kerberized cluster.

    Now to connect to the Kerberized cluster.  Again, this use case assumes that one is using a service account to connect to the Hortonworks cluster.  Another wrinkle that often comes into play with a Kerberized Hadoop cluster is that the cluster is secured with a different KDC than the environment (possibly Windows) where the Spotfire® Server is running.  This means that the Spotfire® Server cannot run as a Windows service account whose credentials would then be used to connect to the Kerberized cluster.   This example does use an MIT Kerberos environment on the HDP cluster.

    The solution I will describe uses a service account from Spotfire® Server to the Kerberized HDP cluster.  This is the case when one does not need to pass the current user credentials to the HDP cluster but just wants to connect from the Spotfire® Server to the Kerberized HDP cluster using a service account.

    There are several tasks to do and configure on the Spotfire® Server:

    1. Copy jar files from the HDP cluster to Spotfire® Server (starting in Spotfire® Server 10.3, jars should be copied to <install directory>/tomcat/custom-ext)
    2. Configure the krb5.conf file in <install directory>/tomcat/spotfire-config (prior to Spotfire® Server 10.3, the location is <install directory>/jdk/jre/lib/security/krb5.conf)
    3. Create a keytab or credentials cache for the service account, or use a username and password in the Information Services data source to dynamically obtain a Kerberos ticket
    4. Add a Data Source Template to work with Kerberos
    5. Configure the JAAS Configuration for the environment
    6. Save the configuration and restart the Spotfire® Server

    These steps and use cases were tested and completed on the HDP 2.5 sandbox environment.  The details for each step are below.

    In order to test these steps, I created an MIT KDC in the HDP 2.5 environment.  I used the instructions on the Hortonworks website to set up the MIT KDC and configure Kerberos in Hortonworks using the automated wizard.  It helps to have someone familiar with configuring Kerberos in the customer environment, as these instructions gloss over some of the Kerberos details to keep the focus on a specific use case.

    For example, I set up a user called 'testuser' in the MIT KDC on the HDP 2.5 environment. 

    Copy jar files from HDP to Spotfire® Server

    The jar files were copied from the HDP environment to Spotfire® Server - <install directory>/tomcat/webapps/spotfire/WEB-INF/lib (starting in Spotfire® Server 10.3, the directory is <install directory>/tomcat/custom-ext).  The jar files may differ between versions of HDP and Spotfire®.  The Spotfire Community Wiki article Spotfire® JDBC Data Access Connectivity Details may be more up to date regarding the specific jar files needed.  For this test and example, the following files were copied:

    • Copied from /usr/hdp/2.5.0.0-1245/hive/lib
      • hive-exec.jar
      • hive-jdbc-<version>.jar - (specifically hive-jdbc-2.1.0.2.5.0.0-1245.jar)
      • hive-service.jar
      • hive-metastore.jar
      • zookeeper-3.4.6.2.5.0.0-1245.jar  (This jar is needed when one is connecting to HiveServer2 through a ZooKeeper quorum, e.g. using Service Discovery.)
    • Copied from /usr/hdp/2.5.0.0-1245/hadoop
      • hadoop-common.jar
      • hadoop-auth.jar
    • Copied from /usr/hdp/2.5.0.0-1245/hadoop/lib
      • commons-configuration-1.6.jar

    The above jars are all that is needed if one is using the binary transport mode (the default).  If one is using the http transport mode, then the following additional jar files are needed:

    • Copied from /usr/hdp/2.5.0.0-1245/hive/lib
      • curator-client-2.7.1.jar
      • curator-framework-2.7.1.jar

    In previous versions of Spotfire® Server, the commons-collections-<version>.jar and slf4j-api-<version>.jar needed to be copied.  These jars are now included with Spotfire® Server so one does not need to copy them.

    Note that the hive-jdbc jar cannot be the hive-jdbc-standalone.jar, as that will conflict with jar files that Spotfire® Server provides.  Across tests connecting various Spotfire® versions to Hive via JDBC, the required jar files have changed slightly at times, so one needs to test which jars are needed.  One does not, and typically should not, copy all jar files to Spotfire® Server; some of the jars already exist there, since Spotfire® Server runs on Tomcat as a Java-based web application.

    Edit the krb5.conf file

    The krb5.conf file on the Spotfire® Server (<install directory>/tomcat/spotfire-config) needs to be edited to provide the Kerberos realm and configuration information.  Here is an example; replace EXAMPLE.COM and the kdc and admin_server machine names with the correct values for your environment:

    [libdefaults]
      default_realm = EXAMPLE.COM
      default_tkt_enctypes = rc4-hmac
      default_tgs_enctypes = rc4-hmac
    
    [realms]
      EXAMPLE.COM = {
        admin_server = sandbox.hortonworks.com
        kdc = sandbox.hortonworks.com
      }
     

    If one needs to test Kerberos using the Java commands kinit, klist, etc., then the krb5.conf file either needs to be placed in the jdk security directory of the Java installation used for testing, or its location needs to be indicated explicitly.
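
    For example, a minimal sketch of making the same configuration visible to the bundled JDK tools by copying the file (paths are illustrative for a 10.3 or later installation):

     REM Copy the server's krb5.conf into the JDK security directory so that
     REM kinit/klist pick it up automatically (illustrative paths)
     copy "<install directory>\tomcat\spotfire-config\krb5.conf" "<install directory>\jdk\jre\lib\security\krb5.conf"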

    Create Keytab file or Credentials Cache for service account

    In order for the service account to log into the HDP cluster from the Spotfire® Server machine, the Spotfire® Server needs to have credentials either in a keytab file or created in a credentials cache or provided in the username and password of the Information Services data source.  If one is using username and password in the Information Services data source, then this section can be skipped. 

    NOTE: In testing, if one is using the http transport mode, then one MUST use a keytab file or a credentials cache to connect.  The username/password method will not work.

    The credentials cache usually expires, so one would need to set up a scheduled job to renew the credentials cache before the Kerberos ticket expires; the expiration time is typically 24 hours but is configurable in the Kerberos system.  To use a keytab, one needs to create the keytab file for the service account and copy it to the Spotfire® Server.  For this example, I used the kadmin tool in my test MIT Kerberos environment to create a keytab for testuser with this command:

     ktadd -k /etc/testuser.keytab testuser@EXAMPLE.COM
     

    I then copied the testuser.keytab file to my Spotfire® Server machine into the <install directory>\jdk\jre\lib\security directory.  Note that the keytab file can go anywhere that is accessible by the Spotfire® Server service.  Starting in Spotfire® Server 10.3, it is recommended that the keytab file be copied to the <install directory>\tomcat\spotfire-config directory.  In this example, once I ran ktadd in the kadmin tool to create the keytab file, the testuser password was changed in the Kerberos database.  This is by design, to provide extra security.
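
    If changing the service account password is undesirable, MIT's kadmin.local typically supports a -norandkey option that exports the existing key without randomizing the password; this is an assumption to verify against your MIT Kerberos version:

     # Assumption: -norandkey is available in kadmin.local in this MIT release
     ktadd -norandkey -k /etc/testuser.keytab testuser@EXAMPLE.COM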

    If one wants to use a credentials cache rather than a keytab file, then one can use the kinit tool that is bundled with the Spotfire® Server JDK.  Open a command-line window and change directory to <install directory>\jdk\bin.  Then run the following, which will prompt for the testuser password:

     .\kinit -c c:/temp/krb5java_ccache testuser@EXAMPLE.COM
     

    This command must be run with the same JDK that has the krb5.conf edits as it uses the information from the krb5.conf file to find the KDC. 
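
    Because the credentials cache expires (typically after 24 hours, as noted above), one approach is a scheduled task that recreates it.  Here is a hypothetical Windows sketch, assuming a keytab is available so the renewal can run non-interactively; the task name, schedule, and paths are illustrative:

     REM Recreate the credentials cache daily from a keytab so that a valid
     REM ticket is always available to Spotfire® Server (illustrative sketch)
     schtasks /create /tn "Renew Spotfire Kerberos ccache" /sc daily /st 06:00 ^
       /tr "<install directory>\jdk\bin\kinit.exe -c c:/temp/krb5java_ccache -k -t c:\secure\testuser.keytab testuser@EXAMPLE.COM"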

    One can use klist to see what is in the keytab file or the credentials cache (assuming one is in the <install directory>\jdk\bin):

    For listing keys in the keytab file:

     .\klist -k c:\tibco\tss\10.3.2\jdk\jre\lib\security\testuser.keytab
     

    For listing keys in the credentials cache:

     .\klist -c -e c:/temp/krb5java_ccache
     

    The klist command can be used to help debug issues by making sure what is in these files is what is expected.

    Modify/Create the Hortonworks template with Kerberos property

    Spotfire® Server Information Services uses data source templates to describe the properties of the database, e.g. JDBC driver class, URL connection string, naming patterns, etc.  The data source template below needs to be added to the Data Source Templates for the Spotfire® Server configuration.  This can be done via the command line 'config add-ds-template' or via the Spotfire® Server Configuration Tool.  This is a screenshot from the Spotfire® Server Configuration Tool with the Hortonworks_Kerberos data source template added and enabled.


    Figure 2 Data Source Template for Kerberos connection to Hortonworks

    One of the properties to set in the template when using Kerberos is spotfire.kerberos.login.context.  This property tells Spotfire® Server which JAAS (Java Authentication and Authorization Service) configuration to use for the Kerberos properties.  Here is the entire data source template with the added connection property; note that the value of the spotfire.kerberos.login.context property is HDPKerberos:

    <jdbc-type-settings>
      <type-name>hortonworks_hive_kerberos</type-name>
      <driver>org.apache.hive.jdbc.HiveDriver</driver>
      <connection-url-pattern>jdbc:hive2://&lt;host&gt;:&lt;port10000&gt;/&lt;database&gt;</connection-url-pattern>
      <supports-catalogs>false</supports-catalogs>
      <supports-schemas>true</supports-schemas>
      <supports-procedures>false</supports-procedures>
      <ping-command>SHOW TABLES</ping-command>
        <column-name-pattern>`$$name$$`</column-name-pattern>
        <table-name-pattern>`$$name$$`</table-name-pattern>
        <schema-name-pattern>`$$name$$`</schema-name-pattern>
        <catalog-name-pattern>`$$name$$`</catalog-name-pattern>
        <procedure-name-pattern>`$$name$$`</procedure-name-pattern>
        <column-alias-pattern>`$$name$$`</column-alias-pattern>
      <connection-properties>
        <connection-property>
          <key>spotfire.kerberos.login.context</key>
          <value>HDPKerberos</value>
        </connection-property>
      </connection-properties>
    </jdbc-type-settings>

    To add the data source template as seen in the screenshot above, click the 'New' button at the bottom of the Data Source Templates screen, copy in the text above, enter the name (e.g. Hortonworks_Kerberos), click OK to save the new template, and then check the "Enabled" check box to enable it.
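
    Alternatively, here is a sketch of adding the same template from the command line, assuming the XML above has been saved to a file named hortonworks_kerberos.xml (the exact arguments for your version can be checked with the config tool's help):

     config add-ds-template Hortonworks_Kerberos hortonworks_kerberos.xml
     config import-config --comment "Added Hortonworks_Kerberos data source template"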

    NOTE: If the ticket has a lifetime, you will need to add the following connection property to the data source template (the default value is false):

    <connection-property>
      <key>spotfire.kerberos.refresh.tgt</key>
      <value>true</value>
    </connection-property>
    Make sure to place this extra connection property between the <connection-properties> start tag and </connection-properties> end tag.
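
    With both properties in place, the connection-properties section of the template reads:

    <connection-properties>
      <connection-property>
        <key>spotfire.kerberos.login.context</key>
        <value>HDPKerberos</value>
      </connection-property>
      <connection-property>
        <key>spotfire.kerberos.refresh.tgt</key>
        <value>true</value>
      </connection-property>
    </connection-properties>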

    Setup the JAAS Conf that links to the HDPKerberos property

    In the Spotfire® Server Configuration Tool, select Custom JAAS, then create and configure the HDPKerberos JAAS entry.  Note that the name of the JAAS entry has to match the spotfire.kerberos.login.context value in the data source template.  Following are example JAAS configurations, depending on whether one is using a keytab file, a Kerberos ticket cache, or a username and password.

    JAAS configuration when using a keytab file:

    HDPKerberos {
        com.sun.security.auth.module.Krb5LoginModule required
        useTicketCache=false
        debug=true
        useKeyTab=true
        keyTab=C:/tibco/tss/7.7.0/jdk/jre/lib/security/testuser.keytab
    }
     
    JAAS configuration when using a ticket cache:

    HDPKerberos {
        com.sun.security.auth.module.Krb5LoginModule required
        ticketCache=c:/temp/krb5java_ccache
        useTicketCache=true
        debug=true
        useKeyTab=false
    }
     

    JAAS Configuration when specifying username and password during data source creation in Information Designer:

    HDPKerberos {
        com.sun.security.auth.module.Krb5LoginModule required
        useTicketCache=false
        debug=true
        useKeyTab=false
    }
     

    Paths in the JAAS configurations should be modified for your environment, and will differ if using Spotfire® Server version 10.3 or later.  The following screenshots show how one can add a new custom JAAS configuration.


    Figure 3 Adding new Custom JAAS configuration with name HDPKerberos


    Figure 4 Created JAAS Configuration that uses a keytab file

    After making all the changes in the Spotfire® Server configuration, save the configuration and restart the Spotfire® Server.

    Additional Spotfire® Server Configuration Needed for HTTP Transport Mode

    If one is using the http transport mode, then additional Spotfire® Server configuration is needed.  Make a backup before editing tomcat/bin/service.bat.  One needs to add the following JVM option to the JvmOptions line in tomcat/bin/service.bat (on Linux, add it to the JAVA_OPTS line in tomcat/bin/setenv.sh):

     -Djavax.security.auth.useSubjectCredsOnly=false
     

    The above setting should be added to the end of the JvmOptions as seen in this example:

     --JvmOptions "-Dcatalina.home=%CATALINA_HOME%;-Dcatalina.base=%CATALINA_BASE%;-D%ENDORSED_PROP%=%CATALINA_HOME%\endorsed;-Djava.io.tmpdir=%CATALINA_BASE%\temp;-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager;-Djava.util.logging.config.file=%CATALINA_BASE%\conf\logging.properties;-XX:+AlwaysPreTouch;-XX:+UseG1GC;-XX:+ScavengeBeforeFullGC;-XX:+DisableExplicitGC;-Dcom.sun.management.jmxremote;-Dorg.apache.catalina.session.StandardSession.ACTIVITY_CHECK=true;-DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector;-Djavax.security.auth.useSubjectCredsOnly=false" ^ 
     

    On Windows, after this has been done, one needs to update the Windows service if Spotfire® Server is running as a service.  The entire procedure would be:

    1. Stop the Spotfire® Server service.
    2. Open a Windows Command line window as an Administrator, and go to the <installation dir>/tomcat/bin directory.
    3. Enter the following command: service.bat remove.
    4. Edit tomcat/bin/service.bat.
    5. Add the additional JVM Option as noted above and save and close the file.
    6. Enter the following command: service.bat install.
    7. Start the Spotfire® Server service.

    This will modify the configuration setting for the Spotfire® Server which is needed if one is using JDBC to connect to a Hive database using the http transport mode.

    Creating the Data Source in Information Designer

    The Hortonworks_Kerberos data source should be available from the Information Designer Data Source screen after the Spotfire® Server is restarted.  The last piece is to get the data source configured correctly in Information Designer.  Start Spotfire® Analyst and log in with a user that has access to the Information Designer.  Choose Tools > Information Designer, then Setup Data Source.

    Here is an example URL that has worked for me and my customers:

     jdbc:hive2://sandbox.hortonworks.com:10000/default;principal=hive/sandbox.hortonworks.com@EXAMPLE.COM;auth=kerberos;kerberosAuthType=fromSubject
     

    The other item to note is that the username for the data source in Spotfire® needs to match the username of the user in the keytab or ticket cache.  When using the keytab or ticket cache, the password is ignored, so any text can be entered for the password.  If one is using the JAAS configuration with useKeyTab and useTicketCache set to false, then the username and password for the service account must be specified when creating the data source in Information Designer.  Spotfire® Server will then request a Kerberos ticket for the service account using the username and password instead of the keytab file.


    Figure 5 Created Kerberos data source with example connection URL

    In some customer implementations (non-sandbox), other parameters may be needed based on how the HDP cluster is configured.  Some other common parameters on the connection URL are transportMode, ssl, and httpPath (if transportMode is http).  The example connection URL above contains everything that has been critical for Kerberos.  If SSL is used, then the SSL certificate may need to be imported into the Java cacerts file in <install directory>/jdk/jre/lib/security.  Instructions for importing a certificate into the cacerts file are documented in the Spotfire® Server Manual in the Configuring LDAPS section.
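
    As a sketch, importing a certificate with the JDK keytool would look something like this (the alias, certificate file name, and default changeit keystore password are illustrative; see the Configuring LDAPS section for the documented procedure):

     REM Import the Hive server's SSL certificate into the trust store of the
     REM JDK used by Spotfire® Server (illustrative values)
     "<install directory>\jdk\bin\keytool" -importcert -alias hiveserver2 -file c:\temp\hiveserver2.cer -keystore "<install directory>\jdk\jre\lib\security\cacerts" -storepass changeit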

    Other connection URL examples:

    Note that for the HTTP transport mode connection URLs, the auth=kerberos and kerberosAuthType=fromSubject settings are not needed.

    • Kerberos using transport mode binary:
      • Hive direct: 
         jdbc:hive2://sandbox.hortonworks.com:10000/default;principal=hive/sandbox.hortonworks.com@EXAMPLE.COM;auth=kerberos;kerberosAuthType=fromSubject
         
      • Hive via ZooKeeper discovery:
         jdbc:hive2://sandbox.hortonworks.com:2181/default;principal=hive/sandbox.hortonworks.com@EXAMPLE.COM;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;auth=kerberos;kerberosAuthType=fromSubject
         
    • Kerberos using transport mode http:
      • Hive direct: 
         jdbc:hive2://sandbox.hortonworks.com:10001/default;principal=hive/sandbox.hortonworks.com@EXAMPLE.COM;transportMode=http;httpPath=cliservice;
         
      • Hive via ZooKeeper discovery: 
         jdbc:hive2://sandbox.hortonworks.com:2181/default;principal=hive/sandbox.hortonworks.com@EXAMPLE.COM;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;transportMode=http;httpPath=cliservice;
         

    Debugging any issues with Kerberos and Spotfire® Server

    Due to its high level of security, it can be hard to get all the Kerberos components correctly configured.  Usually, if one can find the right debug information, troubleshooting becomes a lot easier.  There are a few places to set debug flags for Kerberos:

    1. The first debug setting was set in the JAAS configuration above where the property debug was set to true.
    2. In setenv.bat/setenv.sh, add: set HADOOP_OPTS=-Dsun.security.krb5.debug=true
    3. In service.bat, add to the JvmOptions line: -Dsun.security.krb5.debug=true (for Linux, add this to the setenv.sh JAVA_OPTS line)

    These can give very low-level debugging information and should only be turned on when troubleshooting an issue.

    Other Troubleshooting Information

    When troubleshooting issues with Kerberos and database connectivity, the key is to break the problem down to help determine where the issue is.  I recommend starting from the simplest environment and method, if possible.  In my testing, I was using a sandbox environment, which allowed me access to the Hadoop machine; this may not always be possible.  Even so, my first recommendation is to get the Hive beeline command-line tool working.  This can be used as a smoke test to verify that connectivity is possible.

    With the help of the Internet, I was able to determine what jars and command-line options were needed from Windows to get beeline to work.  After copying over all the jars listed in the classpath, I used the following command line from a Windows machine in the directory with all the jar files:

     <path to java>\java -Xmx1024m -classpath apache-log4j-extras-1.2.17.jar;avatica-1.8.0.2.5.0.0-1245.jar;calcite-core-1.2.0.2.5.0.0-1245.jar;calcite-linq4j-1.2.0.2.5.0.0-1245.jar;commons-cli-1.2.jar;commons-codec-1.4.jar;commons-collections-3.2.2.jar;commons-configuration-1.6.jar;commons-lang-2.6.jar;commons-logging-1.1.3.jar;curator-client-2.6.0.jar;curator-framework-2.6.0.jar;derby-10.10.2.0.jar;guava-14.0.1.jar;hadoop-annotations-2.7.3.2.5.0.0-1245.jar;hadoop-auth-2.7.3.2.5.0.0-1245.jar;hadoop-common-2.7.3.2.5.0.0-1245.jar;hadoop-mapreduce-client-core-2.7.3.2.5.0.0-1245.jar;hive-beeline-1.2.1000.2.5.0.0-1245.jar;hive-exec-1.2.1000.2.5.0.0-1245.jar;hive-jdbc-1.2.1000.2.5.0.0-1245.jar;hive-jdbc-1.2.1000.2.5.0.0-1245-standalone.jar;jline-2.12.jar;log4j-1.2.17.jar;slf4j-log4j12-1.7.10.jar;super-csv-2.2.0.jar;xercesImpl-2.9.1.jar -Dhdp.version=2.5.0.0-1245 -Djava.net.preferIPv4Stack=true -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=<path to jaas conf>/beeline_jaas.conf org.apache.hive.beeline.BeeLine
     

    For java, I used the Spotfire® Server JDK, e.g. tibco/tss/10.3.2/jdk/bin/java.  Beeline works better with a credentials cache than with a keytab; if using a keytab, beeline asks multiple times for the keytab username.  This is the beeline_jaas.conf file referenced in the code example:

    com.sun.security.jgss.krb5.initiate {
        com.sun.security.auth.module.Krb5LoginModule required
        ticketCache="c:/temp/krb5java_ccache"
        useTicketCache=true
        debug=true
        useKeyTab=false;
    };
     

    As one can see, the credentials cache is used in this JAAS configuration.  The information for creating the credentials cache is in the "Create Keytab file or Credentials Cache for service account" section above.

    One also needs to be aware of which krb5.conf file will be used.  By default, it will be the one in the jdk/jre/lib/security directory of the JDK used to run beeline, and it should match the krb5.conf file used with Spotfire® Server.  To guarantee which krb5.conf file is used, one can add this to the command line above:

     -Djava.security.krb5.conf=<path to krb5.conf file>/krb5.conf
     

    If one needs additional debugging information, then the following java options can be included on the command line:

     -Dsun.security.krb5.debug=true -Djava.security.debug=gssloginconfig,configfile,configparser,logincontext
     

    After connectivity is established using beeline, I recommend trying a third-party JDBC tool from the Spotfire® Server machine (or a machine close enough to it to simulate any network issues).  I did run into some issues with various tools, so one may need to try several before finding one that exposes the correct settings and doesn't run into issues with jar files.  One will need to be able to configure settings similar to those one can set in Spotfire®.  Using a third-party tool helps one determine that the connection is possible and what settings are needed.

    In debugging, I also found a network analyzer tool useful since it allowed me to see the network traffic, especially regarding Kerberos and what was different between a successful connection and an unsuccessful one.  Oftentimes, the errors one receives from JDBC are not helpful and are too generic.



