Apache Zeppelin — (JDBC Interpreter Configs)

Naween Banuka
7 min readFeb 17, 2020

Hi readers,

In my previous article, I was talking about how to install Apache Zeppelin and MariaDB ColoumnStore on the dockerize environment.

Before start reading this, I highly recommend reading the previous article.

In this article, I will discuss two basics things,

  1. How to config Zeppelin interpreter for MariaDB.
  2. How to generate your first chart using your data.

An interpreter is a pluggable layer for backend integration. More than 20 interpreters available in the official Zeppelin distribution package, and much more available as a 3rd party projects.

You can learn more about interpreters in this article. (Interpreter mode explained).

We only discuss here about the JDBC Interpreter.

Before doing any configs you need to login to Apache Zeppelin Web Ui. By default (not by default, configs according to your docker-compose file) Zeppelin is running on 8080 port.

localhost:8080

The MariaDB-Zeppelin repo we used to install zeppelin, comes with user authentication. All those authentication processes handled by Apache Shiro.
(Learn More About Apache Shiro)
Apache Shiro config was stored in the shiro.ini file.

/zeppelin/mariadb-columnstore-docker/columnstore_zeppelin/zeppelin/conf

Sample shiro.file is looked like this

# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the “License”); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an “AS IS” BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
[users]
# List of users with their password allowed to access Zeppelin.
# To use a different strategy (LDAP / Database / …) check the shiro doc at http://shiro.apache.org/configuration.html#Configuration-INISections
# To enable admin user, uncomment the following line and set an appropriate password.
#admin = password1, admin
#{{demo_user}} = {{demo_pass}}, readers, runners
#{{admin_user}} = {{admin_pass}}, readers, runners, writers, admin
# Sample LDAP configuration, for user Authentication, currently tested for single Realm
[main]
### A sample for configuring Active Directory Realm
#activeDirectoryRealm = org.apache.zeppelin.realm.ActiveDirectoryGroupRealm
#activeDirectoryRealm.systemUsername = userNameA
#use either systemPassword or hadoopSecurityCredentialPath, more details in http://zeppelin.apache.org/docs/latest/security/shiroauthentication.html
#activeDirectoryRealm.systemPassword = passwordA
#activeDirectoryRealm.hadoopSecurityCredentialPath = jceks://file/user/zeppelin/zeppelin.jceks
#activeDirectoryRealm.searchBase = CN=Users,DC=SOME_GROUP,DC=COMPANY,DC=COM
#activeDirectoryRealm.url = ldap://ldap.test.com:389
#activeDirectoryRealm.authorizationCachingEnabled = false
### A sample for configuring LDAP Directory Realm
ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
## search base for ldap groups (only relevant for LdapGroupRealm):
ldapRealm.contextFactory.environment[ldap.searchBase] = dc=mcmedisoft,dc=com
ldapRealm.contextFactory.url = ldap://ldap:10389
ldapRealm.userDnTemplate = cn={0},ou=people,dc=mcmedisoft,dc=com
ldapRealm.contextFactory.authenticationMechanism = simple
### A sample PAM configuration
#pamRealm=org.apache.zeppelin.realm.PamRealm
#pamRealm.service=sshd
### A sample for configuring ZeppelinHub Realm
#zeppelinHubRealm = org.apache.zeppelin.realm.ZeppelinHubRealm
## Url of ZeppelinHub
#zeppelinHubRealm.zeppelinhubUrl = https://www.zeppelinhub.com
securityManager.realms = $ldapRealm
## A same for configuring Knox SSO Realm
#knoxJwtRealm = org.apache.zeppelin.realm.jwt.KnoxJwtRealm
#knoxJwtRealm.providerUrl = https://domain.example.com/
#knoxJwtRealm.login = gateway/knoxsso/knoxauth/login.html
#knoxJwtRealm.logout = gateway/knoxssout/api/v1/webssout
#knoxJwtRealm.logoutAPI = true
#knoxJwtRealm.redirectParam = originalUrl
#knoxJwtRealm.cookieName = hadoop-jwt
#knoxJwtRealm.publicKeyPath = /etc/zeppelin/conf/knox-sso.pem
#
#knoxJwtRealm.groupPrincipalMapping = group.principal.mapping
#knoxJwtRealm.principalMapping = principal.mapping
#authc = org.apache.zeppelin.realm.jwt.KnoxAuthenticationFilter
sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager### If caching of user is required then uncomment below lines
#cacheManager = org.apache.shiro.cache.MemoryConstrainedCacheManager
#securityManager.cacheManager = $cacheManager
### Enables ‘HttpOnly’ flag in Zeppelin cookies
cookie = org.apache.shiro.web.servlet.SimpleCookie
cookie.name = JSESSIONID
cookie.httpOnly = true
### Uncomment the below line only when Zeppelin is running over HTTPS
#cookie.secure = true
sessionManager.sessionIdCookie = $cookie
securityManager.sessionManager = $sessionManager
# 86,400,000 milliseconds = 24 hour
securityManager.sessionManager.globalSessionTimeout = 86400000
shiro.loginUrl = /api/login
[roles]
readers = *
writers = *
runners = *
admin = *
[urls]
# This section is used for url-based security. For details see the shiro.ini documentation.
#
# You can secure interpreter, configuration and credential information by urls.
# Comment or uncomment the below urls that you want to hide:
# anon means the access is anonymous.
# authc means form based auth Security.
#
# IMPORTANT: Order matters: URL path expressions are evaluated against an incoming request
# in the order they are defined and the FIRST MATCH WINS.
#
# To allow anonymous access to all but the stated urls,
# uncomment the line second last line (/** = anon) and comment the last line (/** = authc)
#
/api/version = anon
# Allow all authenticated users to restart interpreters on a notebook page.
# Comment out the following line if you would like to authorize only admin users to restart interpreters.
/api/interpreter/setting/restart/** = authc
/api/interpreter/** = authc, roles[admin]
/api/configurations/** = authc, roles[admin]
/api/credential/** = authc, roles[admin]
#/** = anon
/** = authc

All required values were stored in a .env file which is located in the columnsotre_zeppelin folder.

/zeppelin/mariadb-columnstore-docker/columnstore_zeppelin/

My .env file looks like this.

 MARIADB_CS_DEBUG=
MARIADB_CONTAINER_NAME=mariadb/columnstore:1.2
MARIADB_ZEPPELIN_CONTAINER_NAME=mariadb/sandboxzeppelin
MARIADB_DATABASE_SERVICE=localhost
MARIADB_RANDOM_ROOT_PASSWORD=
MARIADB_ROOT_PASSWORD=pass
MARIADB_ALLOW_EMPTY_PASSWORD=1
MARIADB_USER=zeppelin_user
MARIADB_PASSWORD=zeppelin_pass
MARIADB_DATABASE=bookstore
MARIADB_CS_NETWORK_SPACE=10.0.6
MARIADB_UM1_BINDING=3306:3306
MARIADB_NUM_BLOCKS_PCT=1024M
MARIADB_TOTAL_UM_MEMORY=512M
ZEPPELIN_CONTAINER_NAME=mariadb/sandboxzeppelin
ZEPPELIN_WEB_HOST=0.0.0.0
ZEPPELIN_WEB_PORT=8080
ZEPPELIN_WEB_DEMO_USER=demo
ZEPPELIN_WEB_DEMO_PASS=highlyillogical
ZEPPELIN_WEB_ADMIN_USER=buff
ZEPPELIN_WEB_ADMIN_PASS=lesshighlyillogical

This is the sample setup of authentication in Zeppelin. In my next article, I’ll explain how to connect LDAP for the advance authentication process.

Back to the point, Now you have your hardcoded login details.

username : buff
password : lesshighlyillogical

Let’s login to Zeppelin using these credentials.

Now let’s try to config JDBC Interpreter and create a connection to our columnstore docker container.

Click the profile name on the right side upper corner. And then click Interpreter

First, click + Create button at the top-right corner in the interpreter setting page.

Fill Interpreter name field with whatever you want to use as the alias(e.g. MySQL, mysql2, hive, redshift, and etc..). Please note that this alias will be used as %interpreter_name to call the interpreter in the paragraph. Then select jdbc as an Interpreter group.

The default driver of JDBC interpreter is set as PostgreSQL. It means Zeppelin includes PostgreSQL driver jar in itself. So you don't need to add any dependencies(e.g. the artifact name or path for PostgreSQL driver jar) for PostgreSQL connection. The JDBC interpreter properties are defined by default like below. If you want to connect other databases such as Mysql, Redshift and Hive, you need to edit the property values. You can also use Credential for JDBC authentication

The last step is the Dependency Setting. Since Zeppelin only includes PostgreSQL driver jar by default, you need to add each driver's maven coordinates or JDBC driver's jar file path for the other databases.

So now we successfully create a connection to our coloumstore database.

Let’s start the second part of the article.

Binding JDBC interpreter to notebook & generating the first chart

To bind the interpreters created in the interpreter setting page, click the gear icon at the top-right corner.

Select(blue) or deselect(white) the interpreter buttons depending on your use cases. If you need to use more than one interpreter in the notebook, activate several buttons. Don’t forget to click Save button, or you will face Interpreter *** is not found error.

Run the paragraph with JDBC interpreter

To test whether your databases and Zeppelin are successfully connected or not, type %jdbc_interpreter_name(e.g. %mysql) at the top of the paragraph and run show databases.

%jdbc_interpreter_name
show databases

If the paragraph is FINISHED without any errors, a new paragraph will be automatically added after the previous one with %jdbc_interpreter_name. So you don't need to type this prefix in every paragraphs' header.

Now you know your interpreter work as expected.

Here are some example query & results.

You can change your filters in the settings and try out different results from your data set. When you’re playing around with those parameters and filters, you can get a proper idea of how those filters work.

So, I hopefully think this article will help you to understand zeppelin JDBC connection & configs.

In the next article, I’ll discuss how to integrate the LDAP server for authentication in a zeppelin.

If you have any questions, please comment or send a pm to my LinkedIn.
I’ll reply ASAP.

My Linkedin: https://www.linkedin.com/in/naween-banuka

Thanks for reading.

Let’s meet in the next article. Until then.. cheers…!

--

--

Naween Banuka

I’m a passionate Software Engineer having more than 5 years of experience and expertise in Java, Spring-Boot, Microservice, Docker, Apache Spark, and Kafka.