Saturday, November 24, 2012

Scripted Reports with Groovy

Groovy has become my favorite scripting language and in this blog I look at some of Groovy's features that make it particularly attractive for presenting text-based reports. The post will show how custom text-based reports of data stored in the database can be easily presented with Groovy. I will highlight several attractive features of Groovy along the way.

I use the Oracle Database 11g Express Edition (XE) for the data source in my example in this post, but any data source could be used. This example does make use of Groovy's excellent SQL/JDBC support and uses the Oracle sample schema (HR). A visual depiction of that sample schema is available in the sample schema documentation.

My example of using Groovy to write a reporting script involves retrieving data from the Oracle HR sample schema and presenting that data via a text-based report. One portion of the script needs to acquire this data from the database and Groovy adds only minimal ceremony to the SQL statement needed to do this. The following code snippet from the script shows use of Groovy's multi-line GString to specify the SQL query string in a user-friendly format and to process the results of that query.

def employeeQueryStr =
"""SELECT e.employee_id, e.first_name, e.last_name,
          e.email, e.phone_number,
          e.hire_date, e.job_id, j.job_title,
          e.salary, e.commission_pct, e.manager_id,
          e.department_id, d.department_name,
          m.first_name AS mgr_first_name, m.last_name AS mgr_last_name
     FROM employees e, departments d, jobs j, employees m
    WHERE e.department_id = d.department_id
      AND e.job_id = j.job_id
      AND e.manager_id = m.employee_id(+)"""

def employees = new TreeMap<Long, Employee>()
import groovy.sql.Sql
def sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:xe", "hr", "hr",
                          "oracle.jdbc.pool.OracleDataSource")
sql.eachRow(employeeQueryStr)
{
   def employeeId = it.employee_id as Long
   def employee = new Employee(employeeId, it.first_name, it.last_name,
                               it.email, it.phone_number,
                               it.hire_date, it.job_id, it.job_title,
                               it.salary, it.commission_pct, it.manager_id as Long,
                               it.department_id as Long, it.department_name,
                               it.mgr_first_name, it.mgr_last_name)
   employees.put(employeeId, employee)
}

The Groovy code above only adds a small amount of code on top of the Oracle SQL statement. The specified SELECT statement joins multiple tables and includes an outer join as well (outer join needed to include the President in the query results despite that position not having a manager). The vast majority of the first part of the code is the SQL statement that could be run as-is in SQL*Plus or SQL Developer. No need for verbose exception catching and result set handling with Groovy's SQL support!

There are more Groovy-specific advantages to point out in the code snippet above. Note that the import statement to import groovy.sql.Sql was allowed when needed and did not need to be at the top of the script file. The example also used Sql.newInstance and Sql.eachRow(GString,Closure). The latter method allows for easy application of a closure to the results of the query. The it special word is the default name for items being processed in the closure. In this case,it can be thought of a a row in the result set. Values in each row are accessed by the underlying database columns' names (or aliases in the case of mgr_first_name and mgr_last_name).

One of the advantages of Groovy is its seamless integration with Java. The above code snippet also demonstrated this via Groovy's use of TreeMap, which is advantageous because it means that the new Employee instances placed in the map based on data retrieved from the database will always be available in order of employee ID.

In the code above, the information retrieved from the database and processed via the closure is stored for each row in a newly instantiated Employee object. This Employee object provides another place to show off Groovy's brevity and is shown next.

Employee.groovy
@groovy.transform.Canonical
class Employee
{
   Long employeeId
   String firstName
   String lastName
   String emailAddress
   String phone_number
   Date hireDate
   String jobId
   String jobTitle
   BigDecimal salary
   BigDecimal commissionPercentage
   Long managerId
   Long departmentId
   String departmentName
   String managerFirstName
   String managerLastName
}

The code listing just shown is the entire class! Groovy's property supports makes getter/setter methods automatically available for all the defined class attributes. As I discussed in a previous blog post, the @Canonical annotation is a Groovy AST (transformation) that automatically creates several useful common methods for this class [equals(Object), hashCode(), and toString()]. There is no explicit constructor because @Canonical also handles this, providing a constructor that accepts that class's arguments in the order they are specified in their declarations. It is difficult to image a scenario in which it would be easier to easily and quickly create an object to store retrieved data values in a script.

A JDBC driver is needed for this script to retrieve this data from the Oracle Database XE and the JAR for that driver could be specified on the classpath when running the Groovy script. However, I like my scripts to be as self-contained as possible and this makes Groovy's classpath root loading mechanism attractive. This can be used within this script (rather than specifying it externally when invoking the script) as shown next:

this.class.classLoader.rootLoader.addURL(
   new URL("file:///C:/oraclexe/app/oracle/product/11.2.0/server/jdbc/lib/ojdbc6.jar"))

Side Note: Another nifty approach for accessing the appropriate dependent JAR or library is use of Groovy's Grape-provided @Grab annotation. I didn't use that here because Oracle's JDBC JAR is not available in any legitimate Maven central repositories that I am aware of. An example of using this approach when a dependency is available in the Maven public repository is shown in my blog post Easy Groovy Logger Injection and Log Guarding.

With the data retrieved from the database and placed in a collection of simple Groovy objects built for holding this data and providing easy access to it, it is almost time to start presenting this data in a text report. Some constants defined in the script are shown in the next excerpt from the script code.

int TOTAL_WIDTH = 120
String HEADER_ROW_SEPARATOR = "=".multiply(TOTAL_WIDTH)
String ROW_SEPARATOR = "-".multiply(TOTAL_WIDTH)
String COLUMN_SEPARATOR = "|"
int COLUMN_SEPARATOR_SIZE = COLUMN_SEPARATOR.size()
int COLUMN_WIDTH = 22
int TOTAL_NUM_COLUMNS = 5
int BALANCE_COLUMN_WIDTH = TOTAL_WIDTH-(TOTAL_NUM_COLUMNS-1)*COLUMN_WIDTH-COLUMN_SEPARATOR_SIZE*(TOTAL_NUM_COLUMNS-1)-2

The declaration of constants just shown exemplify more advantages of Groovy. For one, the constants are statically typed, demonstrating Groovy's flexibility to specifying types statically as well as dynamically. Another feature of Groovy worth special note in the last code snippet is the use of the String.multiply(Number) method on the literal Strings. Everything, even Strings and numerics, are objects in Groovy. The multiply method makes it easy to create a String of that number of the same repeating character.

The first part of the text output is the header. The following lines of the Groovy script write this header information to standard output.

println "\n\n${HEADER_ROW_SEPARATOR}"
println "${COLUMN_SEPARATOR}${'HR SCHEMA EMPLOYEES'.center(TOTAL_WIDTH-2*COLUMN_SEPARATOR_SIZE)}${COLUMN_SEPARATOR}"
println HEADER_ROW_SEPARATOR
print "${COLUMN_SEPARATOR}${'EMPLOYEE ID/HIRE DATE'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'EMPLOYEE NAME'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'TITLE/DEPARTMENT'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'SALARY INFO'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
println "${'CONTACT INFO'.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
println HEADER_ROW_SEPARATOR

The code above shows some more addictive features of Groovy. One of my favorite aspects of Groovy's GString support is the ability to use Ant-like ${} expressions to provide executable code inline with the String. The code above also shows off Groovy's GDK String's support for the center(Number) method that automatically centers the given String withing the specified number of characters. This is a powerful feature for easily writing attractive text output.

With the data retrieved and available in our data structure and with the constants defined, the output portion can begin. The next code snippet shows use of Groovy's standard collections each method to allow iteration over the previously populated TreeMap with a closure applied to each iteration.

employees.each
{ id, employee ->
   // first line in each output row
   def idStr = id as String
   print "${COLUMN_SEPARATOR}${idStr.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def employeeName = employee.firstName + " " + employee.lastName
   print "${employeeName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def jobTitle = employee.jobTitle.replace("Vice President", "VP").replace("Assistant", "Asst").replace("Representative", "Rep")
   print "${jobTitle.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def salary = '$' + (employee.salary as String)
   print "${salary.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println "${employee.phone_number.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"

   // second line in each output row
   print "${COLUMN_SEPARATOR}${employee.hireDate.getDateString().center(COLUMN_WIDTH)}"
   def managerName = employee.managerFirstName ? "Mgr: ${employee.managerFirstName[0]}. ${employee.managerLastName}" : "Answers to No One"
   print "${COLUMN_SEPARATOR}${managerName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   print "${employee.departmentName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   String commissionPercentage = employee.commissionPercentage ?: "No Commission"
   print "${commissionPercentage.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println "${employee.emailAddress.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println ROW_SEPARATOR
}

The last code snippet is where the data retrieved from the database is output in a relatively attractive text format. The example shows how handles in a closure can be named to be more meaningful. In this case, they are named id and employee and represent the key (Long) and value (Employee) of each entry in the TreeMap.

There are other Groovy features in the last code snippet worth special mention. The presentation of commission uses Groovy's Elvis operator (?:), which makes even Java's conditional ternary look verbose. In this example, if the employee's commission percentage meets Groovy truth standards, that percentage is used; otherwise, "No Commission" is printed.

The handling of the hire date provides another opportunity to tout Groovy's GDK benefits. In this case, Groovy GDK Date.getDateString() is used to easily access the date-only portion of the Date class (time not desired for hire date) without explicit use of a String formatter. Nice!

The last code example also demonstrates use of the as keyword to coerce (cast) variables in a more readable way and also demonstrates more leverage of Java features, in this case taking advantage of Java String's replace(CharSequence, CharSequence) method. Groovy adds some more goodness to String again in this example, however. The example demonstrates Groovy's supporting extracting the first letter only of the manager's first name using subscript (array) notation ([0]) to get only the first character out of the string.

So far in this post, I've shown snippets of the overall script as I explained the various features of Groovy that are demonstrated in each snippet. The entire script is shown next and that code listing is followed by a screen snapshot of how the output appears when the script is executed. The complete code for the Groovy Employee class was shown previously.

generateReport.groovy: The Complete Script
#!/usr/bin/env groovy

// Add JDBC driver to classpath as part of this script's bootstrapping.
// See http://marxsoftware.blogspot.com/2011/02/groovy-scripts-master-their-own.html.
// WARNING: This location needs to be adjusted for specific user environment.
this.class.classLoader.rootLoader.addURL(
   new URL("file:///C:/oraclexe/app/oracle/product/11.2.0/server/jdbc/lib/ojdbc6.jar"))


int TOTAL_WIDTH = 120
String HEADER_ROW_SEPARATOR = "=".multiply(TOTAL_WIDTH)
String ROW_SEPARATOR = "-".multiply(TOTAL_WIDTH)
String COLUMN_SEPARATOR = "|"
int COLUMN_SEPARATOR_SIZE = COLUMN_SEPARATOR.size()
int COLUMN_WIDTH = 22
int TOTAL_NUM_COLUMNS = 5
int BALANCE_COLUMN_WIDTH = TOTAL_WIDTH-(TOTAL_NUM_COLUMNS-1)*COLUMN_WIDTH-COLUMN_SEPARATOR_SIZE*(TOTAL_NUM_COLUMNS-1)-2



// Get instance of Groovy's Sql class
// See http://marxsoftware.blogspot.com/2009/05/groovysql-groovy-jdbc.html
import groovy.sql.Sql
def sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:xe", "hr", "hr",
                          "oracle.jdbc.pool.OracleDataSource")

def employeeQueryStr =
"""SELECT e.employee_id, e.first_name, e.last_name,
          e.email, e.phone_number,
          e.hire_date, e.job_id, j.job_title,
          e.salary, e.commission_pct, e.manager_id,
          e.department_id, d.department_name,
          m.first_name AS mgr_first_name, m.last_name AS mgr_last_name
     FROM employees e, departments d, jobs j, employees m
    WHERE e.department_id = d.department_id
      AND e.job_id = j.job_id
      AND e.manager_id = m.employee_id(+)"""

def employees = new TreeMap<Long, Employee>()
sql.eachRow(employeeQueryStr)
{
   def employeeId = it.employee_id as Long
   def employee = new Employee(employeeId, it.first_name, it.last_name,
                               it.email, it.phone_number,
                               it.hire_date, it.job_id, it.job_title,
                               it.salary, it.commission_pct, it.manager_id as Long,
                               it.department_id as Long, it.department_name,
                               it.mgr_first_name, it.mgr_last_name)
   employees.put(employeeId, employee)
}

println "\n\n${HEADER_ROW_SEPARATOR}"
println "${COLUMN_SEPARATOR}${'HR SCHEMA EMPLOYEES'.center(TOTAL_WIDTH-2*COLUMN_SEPARATOR_SIZE)}${COLUMN_SEPARATOR}"
println HEADER_ROW_SEPARATOR
print "${COLUMN_SEPARATOR}${'EMPLOYEE ID/HIRE DATE'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'EMPLOYEE NAME'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'TITLE/DEPARTMENT'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
print "${'SALARY INFO'.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
println "${'CONTACT INFO'.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
println HEADER_ROW_SEPARATOR

employees.each
{ id, employee ->
   // first line in each row
   def idStr = id as String
   print "${COLUMN_SEPARATOR}${idStr.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def employeeName = employee.firstName + " " + employee.lastName
   print "${employeeName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def jobTitle = employee.jobTitle.replace("Vice President", "VP").replace("Assistant", "Asst").replace("Representative", "Rep")
   print "${jobTitle.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   def salary = '$' + (employee.salary as String)
   print "${salary.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println "${employee.phone_number.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"

   // second line in each row
   print "${COLUMN_SEPARATOR}${employee.hireDate.getDateString().center(COLUMN_WIDTH)}"
   def managerName = employee.managerFirstName ? "Mgr: ${employee.managerFirstName[0]}. ${employee.managerLastName}" : "Answers to No One"
   print "${COLUMN_SEPARATOR}${managerName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   print "${employee.departmentName.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   String commissionPercentage = employee.commissionPercentage ?: "No Commission"
   print "${commissionPercentage.center(COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println "${employee.emailAddress.center(BALANCE_COLUMN_WIDTH)}${COLUMN_SEPARATOR}"
   println ROW_SEPARATOR
}

In this blog post, I've attempted to show how Groovy provides numerous features and other syntax support that make it easier to write scripts for generating readable and relatively attractive output. For more general Groovy scripts that provide text output support, see Formatting simple tabular text data. Although these are nice general solutions, an objective of my post has been to show that it is easy and does not take much time to write customized scripts for generating custom text output with Groovy. Small Groovy-isms such as easily centering a String, easily converting a Date to a String, extracting any desired character from a string based on array position notation, and easily accessing database data make Groovy a powerful tool in generating text-based reports.

No comments: