Configuring the adapter consists of specifying connection and authentication information for each of the connections you want to establish. The adapter supports HiveServer2, which is available in Hive 0.11.0 and later. HiveServer2 is strongly recommended for improved throughput, functionality, and security.
You can configure the adapter from either the Web Console or the Data Management Console.
From the Web Console, open the Adapters page, or, from the Data Management Console, expand the Adapters folder.
In the DMC, the Adapters folder opens. In the Web Console, the Adapters page opens showing your configured adapters.
| Driver | JDBC Driver Name |
|---|---|
| Apache Hive | org.apache.hive.jdbc.HiveDriver |
| Cloudera/Simba | com.cloudera.hive.jdbc41.HS2Driver |
The Adapter for Hadoop/Hive is under the SQL group folder.
Connection name: Logical name used to identify this particular set of connection attributes. The default is CON01.
URL: Is the URL to the location of the data source. The URL used depends on the type of server you are connecting to. See the table below for examples. For more details, see the JDBC section in the online documentation for Apache Hive, or the Cloudera JDBC Driver for Apache Hive documentation, which is included with the JDBC driver download.
| Server | URL |
|---|---|
| Hiveserver2 | jdbc:hive2://server:10000/default |
| Kerberos Hiveserver2 (static) | jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM |
| Kerberos Hiveserver2 | jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM;auth=kerberos;kerberosAuthType=fromSubject |
| Cloudera/Simba (no security) | jdbc:hive2://server:10000;AuthMech=3;transportMode=binary |
| Kerberos Cloudera/Simba | jdbc:hive2://server:10000;AuthMech=1;KrbRealm=REALM.COM;KrbHostFQDN=server.example.com;KrbServiceName=hive |
where:
server: Is the DNS name or IP address of the system where the Hive server is running. If it is on the same system, localhost can be used.
default: Is the name of the default database to connect to.
10000: Is the default port number that HiveServer2 listens on if no port is specified when the Hive server is started.
REALM.COM: For a Kerberos-enabled Hive server, is the name of your realm.
Driver name: Is the name of the JDBC driver, for example, org.apache.hive.jdbc.HiveDriver or com.cloudera.hive.jdbc41.HS2Driver.
IBI_CLASSPATH: Defines the additional Java Class directories or full-path jar names which will be available for Java Services. The value may be set by editing the communications file or in the Web Console. Using the Web Console, you can enter one reference per line in the input field. When the file is saved, the entries are converted to a single string using colon (:) delimiters on all platforms. OpenVMS platforms must use UNIX-style conventions for values (for example, /mydisk/myhome/myclasses.jar, rather than mydisk:[myhome]myclasses.jar) when setting values. When editing the file manually, you must maintain the colon delimiter.
Security: There are three methods by which a user can be authenticated when connecting to a database server: Explicit, Password Passthru, and Trusted.
User: Primary authorization ID by which you are known to the data source.
Password: Password associated with the primary authorization ID.
Select profile: Select a profile from the drop-down menu to indicate the level of profile in which to store the CONNECTION_ATTRIBUTES command. The global profile, edasprof.prf, is the default.
If you wish to create a new profile, either a user profile (user.prf) or a group profile if available on your platform (using the appropriate naming convention), choose New Profile from the drop-down menu and enter a name in the Profile Name field (the extension is added automatically).
Store the connection attributes in the server profile (edasprof).
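The adapter connects through the standard JDBC API, so the parameters above map directly onto a JDBC connection. The following minimal Java sketch illustrates this mapping; the host, port, and credentials are placeholders, and the driver jar from the table above must be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveConnectionTest {
    public static void main(String[] args) throws Exception {
        // URL and credentials are placeholders; substitute your HiveServer2 values.
        String url = "jdbc:hive2://server:10000/default";

        // Driver name, as configured above (requires the driver jar on the classpath).
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```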
Connections to a Hive server with Kerberos enabled can be run in several ways: with static credentials obtained by kinit before the server starts (Trusted), with explicit Kerberos credentials for each connected user, with Password Passthru, or with single sign-on.
The static mode is useful for testing, but is not recommended for production deployment.
To set up connections to a Kerberos-enabled Hive instance:
In this configuration, all connections to the Hive instance are made with the same Kerberos user ID, derived from the Kerberos ticket that is created before the server starts. For example, before starting the server:
kinit kerbid01
where:
kerbid01: Is a Kerberos ID.
jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM
Security: Set to Trusted.
-Djavax.security.auth.useSubjectCredsOnly=false
Once these steps are completed, the adapter can be used to access a Kerberos-enabled Hive instance.
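As a sketch of what this mode does: with a ticket from kinit in the cache and useSubjectCredsOnly set to false, a plain JDBC connection to the Kerberos URL succeeds without any credentials in the code. The host, principal, and realm below are placeholders.

```java
import java.sql.Connection;
import java.sql.DriverManager;

// Run with: java -Djavax.security.auth.useSubjectCredsOnly=false StaticTicketTest
// after a ticket has been created with kinit, as described above.
public class StaticTicketTest {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM";
        try (Connection conn = DriverManager.getConnection(url)) {
            // The connection is authenticated as the kinit user from the ticket cache.
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}
```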
In this configuration, each connected user has a Hive Adapter connection with Kerberos credentials in the user profile. Add the following command to the server profile:
ENGINE SQLHIV SET ENABLE_KERBEROS ON
If you are running the server on a Windows system under the Local System account, specify your realm name and Kerberos Key Distribution Center. To do so, add the following Java options:
-Djava.security.krb5.realm=REALM.COM -Djava.security.krb5.kdc=kdc.realm.com
where:
REALM.COM: Is the realm name of your organization.
kdc.realm.com: Is the Key Distribution Center of your organization.
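For reference, these two -D options are equivalent to setting the corresponding Java system properties. A quick check, with placeholder values:

```java
public class KrbDefaults {
    public static void main(String[] args) {
        // Equivalent to -Djava.security.krb5.realm=... -Djava.security.krb5.kdc=...
        System.setProperty("java.security.krb5.realm", "REALM.COM");
        System.setProperty("java.security.krb5.kdc", "kdc.realm.com");
        System.out.println(System.getProperty("java.security.krb5.realm"));
        System.out.println(System.getProperty("java.security.krb5.kdc"));
    }
}
```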
URL: Is the URL to the location of the data source.
| Server | URL |
|---|---|
| Kerberos Hiveserver2 (User) | jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM;auth=kerberos;kerberosAuthType=fromSubject |
| Kerberos Cloudera/Simba | jdbc:hive2://server:10000;AuthMech=1;KrbRealm=REALM.COM;KrbHostFQDN=server.example.com;KrbServiceName=hive |
Security: Set to Explicit.
Enter your Kerberos user ID and password. The server will use those credentials to create a Kerberos ticket and connect to a Kerberos-enabled Hive instance.
Note: The user ID that you use to connect to the server does not have to be the same as the Kerberos ID you use to connect to a Kerberos-enabled Hive instance.
Select your profile, or enter a new profile name consisting of the security provider, an underscore, and the user ID. For example, ldap01_pgmxxx.
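To illustrate what kerberosAuthType=fromSubject means: the driver picks up Kerberos credentials from the JAAS Subject of the calling thread, which is essentially what the server does with the user's explicit credentials. A minimal standalone sketch follows, assuming a hypothetical jaas.conf entry named KrbLogin that uses com.sun.security.auth.module.Krb5LoginModule; the entry name, user ID, password, and URL are placeholders.

```java
import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;
import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.login.LoginContext;

public class FromSubjectTest {
    public static void main(String[] args) throws Exception {
        // Authenticate to Kerberos with an explicit user ID and password.
        // "KrbLogin" is an assumed jaas.conf entry using Krb5LoginModule.
        LoginContext lc = new LoginContext("KrbLogin", callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName("kerbid01");
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword("password".toCharArray());
                }
            }
        });
        lc.login();

        String url = "jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM;"
                + "auth=kerberos;kerberosAuthType=fromSubject";

        // Connect inside the authenticated Subject so the driver can find the credentials.
        try (Connection conn = Subject.doAs(lc.getSubject(),
                (PrivilegedExceptionAction<Connection>) () -> DriverManager.getConnection(url))) {
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}
```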
This procedure shows you how to create a connection to a Kerberos-enabled Hadoop cluster using the Adapter for Apache Hive with the Password Passthru security mode.
URL: Is the URL to the location of the data source.
jdbc:hive2://server1:10000/default;principal=hive/server1@REALM.COM;auth=kerberos;kerberosAuthType=fromSubject
where:
server1: Is the name of a node on your Hadoop cluster where a HiveServer2 instance is running.
Security: Set to Password Passthru.
Select the server profile, edasprof.
Driver name: Is the name of the JDBC driver, for example, org.apache.hive.jdbc.HiveDriver.
IBI_CLASSPATH: Defines the additional Java Class directories or full-path jar names which will be available for Java Services (if not included in your system CLASSPATH). For example:
/path/to/hive-jdbc-<version>-standalone.jar
You also need to configure Hive as a DBMS Security Provider.
The DBMS Security Provider Configuration page opens.
Enter a name for the provider, as desired. For example, hive.
Select SQLHIV - Apache Hive.
The connection name you specified, for example, server.
The following sections show you how to configure a DataMigrator or WebFOCUS Reporting Server for access to a Kerberos-enabled Hadoop cluster through Hive, using the Adapter for Apache Hive with single sign-on from the desktop browser.
URL: Is the URL to the location of the data source.
jdbc:hive2://server1:10000/default;principal=hive/server1@IBI.COM;auth=kerberos;kerberosAuthType=fromSubject
where:
server1: Is the name of a node on your Hadoop cluster where a HiveServer2 instance is running.
Security: Set to Trusted.
Select the server profile, edasprof.
Driver name: Is the name of the JDBC driver, for example, org.apache.hive.jdbc.HiveDriver.
IBI_CLASSPATH: Defines the additional Java Class directories or full-path jar names which will be available for Java Services (if not included in your system CLASSPATH). For example:
/path/to/hive-jdbc-<version>-standalone.jar
A server started with security provider OPSYS can be configured for Kerberos connections.
To implement single sign-on Kerberos security:
The OPSYS Security Configuration page opens.
The edaserve.cfg file is updated with this attribute.
The odin.cfg file is updated with this attribute.
When the server is started, a user can connect to the Web Console from Internet Explorer without being prompted for a user ID and password. The Login Info shows connection type Kerberos. The connection is made using the Kerberos ticket from the browser, and the connected user ID is derived from this ticket.
Connection to the server requires that there is a local OPSYS user ID with the same name as the Kerberos user ID on the operating system running the server. This user ID is used for tscom3 process impersonation.
If a user signs off from the Kerberos connection, the user can make an explicit connection with a local UNIX user ID and password. An explicit connection with a different Kerberos user ID will not work.
Once you configure a DataMigrator or WebFOCUS Reporting Server for access to Hadoop through Hive, you must also configure Java to use the HTTP service principal.
Confirm the credentials are stored correctly, as follows:

$ klist -k /etc/krb5.keytab
Keytab name: FILE:/etc/krb5.keytab
KVNO Principal
---- --------------------------------------
   1 HTTP/server.example.com@REALM.COM
jconnection {
  com.sun.security.auth.module.Krb5LoginModule required
  debug=false
  doNotPrompt=true
  useKeyTab=true
  keyTab="/etc/krb5.keytab"
  storeKey=true
  isInitiator=false
  principal="HTTP/server.example.com@REALM.COM";
};

com.sun.security.jgss.accept {
  com.sun.security.auth.module.Krb5LoginModule required
  debug=false
  doNotPrompt=true
  useKeyTab=true
  keyTab="/etc/krb5.keytab"
  storeKey=true
  isInitiator=false
  principal="HTTP/server.example.com@REALM.COM";
};
-Djava.security.auth.login.config=/ibi/srv82/wfs/etc/jaas.conf
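As a quick sanity check that Java can read the keytab through this configuration, the com.sun.security.jgss.accept entry above can be exercised directly with a JAAS login. A sketch; run it with the -Djava.security.auth.login.config option shown above:

```java
import javax.security.auth.login.LoginContext;

public class KeytabLoginCheck {
    public static void main(String[] args) throws Exception {
        // Uses the jaas.conf entry above; doNotPrompt/useKeyTab mean no
        // credentials are requested, so this succeeds only if the keytab
        // and principal are valid and readable.
        LoginContext lc = new LoginContext("com.sun.security.jgss.accept");
        lc.login();
        System.out.println("Keytab principals: " + lc.getSubject().getPrincipals());
        lc.logout();
    }
}
```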
Here are some of the errors you may see when attempting to connect to a Hadoop cluster. These error messages will be prefixed by FOC1500.
Java listener may be down
ERROR: ncjInit failed. Java listener may be down - see edaprint.log.
This is a Java-specific issue. Confirm that your Java version meets your system requirements for bit size (32-bit vs. 64-bit), and that the value of JAVA_HOME points to the correct Java version.
No suitable driver found for jdbc: ...
(-1) [00000] JDBFOC>> makeConnection(): conn is NULL
java.sql.SQLException: No suitable driver found for jdbc:hive1://server:10000/default
The protocol name in the JDBC URL is incorrect. For example, in the above URL, the protocol is hive1 instead of hive2.
ClassNotFoundException
(-1) [00000] JDBFOC>> connectx(): java.lang.ClassNotFoundException: org.apache.hive.jdbc.HiveDriver
One of the jar files that make up the Hive JDBC driver was not found in the CLASSPATH, or the driver name is incorrect. In the above example, the driver name is org.apache.hive.jdbc.HiveDriver.
Required field 'client_protocol' is unset!
(-1) [00000] JDBFOC>> makeConnection(): conn is NULL
java.sql.SQLException: Could not establish connection to jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM;auth=kerberos;kerberosAuthType=fromSubject: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default})
This is due to a Java version mismatch. The version of Java on the client is newer than the one on the server. To resolve this error, update Java on the system where you are running the server.
Could not open client transport with JDBC Uri:
(-1) [00000] JDBFOC>> makeConnection(): conn is NULL
java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM;auth=kerberos;kerberosAuthType=fromSubject: ...
This is a generic error, and could be due to any kind of error in the URL or configuration. For more details, read the rest of the message.
Unsupported mechanism type PLAIN
(-1) [00000] JDBFOC>> makeConnection(): conn is NULL
java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://server:10000/default: Peer indicated failure: Unsupported mechanism type PLAIN
The cluster has Kerberos enabled, but a Kerberos principal to authenticate against was not specified. To resolve this error, add the following to the end of the URL, using your cluster and REALM name:
;principal=hive/server@REALM.COM
UnknownHostException: server
(-1) [00000] JDBFOC>> makeConnection(): conn is NULL
java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM;auth=kerberos;kerberosAuthType=fromSubject: java.net.UnknownHostException: name
The server you specified could not be reached. Either the system is down or the name is invalid.
Could not open client transport with JDBC Uri:... Connection refused
(-1) [00000] JDBFOC>> makeConnection(): conn is NULL
java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM;auth=kerberos;kerberosAuthType=fromSubject: java.net.ConnectException: Connection refused: connect
The hostname could be reached, but there is no Hive Thrift server running at the hostname and port number you specified.
Client not found in Kerberos database
(-1) [00000] JDBFOC>> makeConnection(): javax.security.auth.login.LoginException: Client not found in Kerberos database (6)
The user ID (User Principal Name) specified is not a valid Kerberos user ID.
Pre-authentication information was invalid
(-1) [00000] JDBFOC>> makeConnection(): javax.security.auth.login.LoginException: Pre-authentication information was invalid (24)
The user ID (User Principal Name) specified was found, but the password is missing or invalid.
GSS initiate failed
(-1) [00000] JDBFOC>> makeConnection(): conn is NULL
java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://server:10000/default;principal=hive/server@domain.COM; GSS initiate failed Caused by: org.apache.thrift.transport.TTransportException:
Kerberos was unable to authenticate against the specified Service Principal Name.
This occurs if you are using static credentials that were obtained by kinit before the server starts, but did not specify the option that allows this. To resolve this issue, from the Web Console, add the following Java option:
-Djavax.security.auth.useSubjectCredsOnly=false
This also occurs if you are using subject credentials with Explicit or Password Passthru security, but did not specify that you are doing so. Add the following to the end of the URL:
;auth=kerberos;kerberosAuthType=fromSubject
Could not create secure connection to ... Failed to open client transport
(0) [08S0] Could not create secure connection to jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM;auth=kerberos;kerberosAuthType=fromSubject: Failed to open client transport
java.sql.SQLException: Could not create secure connection to jdbc:hive2://server:10000/default;principal=hive/server@REALM.COM;auth=kerberos;kerberosAuthType=fromSubject: Failed to open client transport
The client was not able to use Kerberos to connect. This occurs if you are using subject credentials but did not enable Kerberos. To resolve this issue, add the following to the server profile:
ENGINE SQLHIV SET ENABLE_KERBEROS ON
Cannot locate default realm
(-1) [00000] JDBFOC>> makeConnection(): javax.security.auth.login.LoginException: KrbException: Cannot locate default realm
Kerberos looks for the default realm in /etc/krb5.conf (or, on Windows, krb5.ini), and either it cannot find or read the file, or the default realm is not specified.
For a server running as a service on a Windows machine, you must explicitly specify the realm for the organization and the KDC. Review the file to make sure that the default realm is specified and correct. To resolve this issue, from the Web Console, add the following Java options:
-Djava.security.krb5.realm=REALM.COM -Djava.security.krb5.kdc=kdc.domain.com
Click Save and Restart Java Services.