Thursday, 11 February 2016

Using DbDeploy in a large globally distributed project

Recently we successfully implemented DBDeploy for managing database migrations on a large, globally distributed project. It was a huge program with more than 200 developers working in separate teams spread across seven locations.
Database change management was done manually. For every release, developers prepared release notes listing all the SQL scripts to be executed for that release.
Managing the correct list of SQL scripts to execute was chaotic. Worse yet, given an environment and a database schema, no one ever knew which scripts had already been run against that schema to bring it to its current state. This process was clearly not scalable and was one of the biggest bottlenecks in making successful releases.
To fix this situation, we decided to introduce DBDeploy for database change management. DBDeploy is a well-known database change management tool built around ideas discussed in the book ‘Refactoring Databases’. Several tools like Flyway and Liquibase are available now, but we chose DBDeploy because it was easier to introduce (it doesn’t require writing XML the way Liquibase does) and because it can output a single consolidated SQL file from the migrations instead of executing them directly against the database. This was crucial because the DBAs were very keen to review all the changes going into the database for every release.
Tools like DBDeploy are common practice now, and the standard patterns for using them are widely known. But when you have several globally distributed teams working on the same codebase, you face a unique set of challenges. Based on our learnings over the last few months, the following things are crucial to successfully implement DBDeploy with large distributed teams making changes to the same database.


  1. Use a timestamp as the change number
DbDeploy expects each migration to be named with a unique change number. Generally a simple integer is used. But when you have globally distributed teams, a simple integer is impossible to coordinate. With several developers spread across locations, it is very hard to know which integer to use next for your script. If the last script in the version control system is, say, 001_create_customer_table.sql and you add yours as 002_create_address_table.sql, there is a high chance someone has already checked in a script with id 002. Using a timestamp in the format yyyyMMddHHmmss_<scriptname>.sql solves that problem: timestamps with second granularity are far less likely to collide.
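For illustration, here is a minimal Groovy sketch of how such a change number can be generated (using GMT, as our build task below does, keeps timestamps comparable across locations; the script name shown is hypothetical):
// Second-granularity change number in GMT, e.g. 20160211093015
def changeNumber = new Date().format('yyyyMMddHHmmss', TimeZone.getTimeZone('GMT'))
println "${changeNumber}_create_address_table.sql"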


     2. Write a build task to generate migration scripts with the correct timestamp.
Even if you expect everyone to know and follow the rules for naming migrations, people make mistakes. Unless there is a quick and easy way to generate timestamps for migrations, most developers will copy-paste one of the existing scripts and change one digit in the timestamp. So make sure you write a Gradle or Ant task to generate database migration files with the correct timestamp format.


The Gradle task we wrote is as follows:
def askTicket() {
    def console = System.console()
    if (!console) {
        throw new GradleException("Cannot get console; run this task from an interactive shell.")
    }
    def ticket
    // Keep prompting until the ticket matches the expected story/JIRA pattern, e.g. DT-002
    for (;;) {
        ticket = console.readLine('\n> Please enter your story or jira reference (e.g: DT-002): ')
        if (ticket ==~ /[A-Za-z]+-[0-9]+/) {
            break
        }
    }
    return ticket
}


// Placeholder hook: can be extended to check whether a migration with the same timestamp already exists
def timestampExists(timestamp, location) {
    return false
}


task newMigration {
    doLast {
        def sql_path = "../../db/migrations"
        def ticket = askTicket()

        // Second-granularity timestamp in GMT, so scripts from all locations sort consistently
        def timestamp = new Date().format('yyyyMMddHHmmss', TimeZone.getTimeZone('GMT'))

        def newMigrationFile = "${sql_path}/${timestamp}-${ticket}.sql"
        println ">> ${newMigrationFile}"

        File migrationFile = file(newMigrationFile)
        migrationFile.write("-- migration for '${ticket}' at '${timestamp}'")
    }
}


     3. Have a CI build just for running DBDeploy migrations.
It is very useful to have a CI build dedicated to running database migrations against a separate CI database. A separate build helps because migrations typically run very quickly and do not depend on any other build pipeline.
On large monolithic projects you tend to have long-running builds, which make it harder to get quick feedback (within seconds) after committing a bad migration.
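As a rough sketch, such a CI job can simply invoke a dedicated Gradle task that runs DBDeploy against the CI schema. The connection property names (ciJdbcUrl, ciJdbcUsername, ciJdbcPassword) are hypothetical; the DbDeploy invocation follows the same pattern used later in this post:
task migrateCiDatabase {
    doLast {
        // Runs all pending migrations against the CI database; the CI job fails fast if any migration is broken
        def dbDeploy = new com.dbdeploy.DbDeploy(driver: "${driver}",
                url: "${ciJdbcUrl}",
                userid: "${ciJdbcUsername}",
                password: "${ciJdbcPassword}",
                scriptdirectory: new File("${migrationsPath}"),
                dbms: "ora")
        dbDeploy.go()
    }
}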


     4. Add validations for migrations.
Often it's not enough to have conventions and expect people to follow them. It helps to have a build that breaks when conventions are not followed. The easiest convention to violate is the naming of scripts: there is no guarantee that every developer will use the build task that generates correctly named scripts, so there will be scripts with wrong timestamp values.
Timestamps must be in the form yyyyMMddHHmmss; one digit more or less and the order of execution changes.
There are other useful validations, for example that DDL and DML should be in separate files, or that there should generally be no ‘drop table’ statement.
We modified DBDeploy to allow adding these kinds of validations before executing migrations. It was extremely helpful to have these validations break the CI build.


Intercepting Filter in DBDeploy



In DBDeploy we added a generic intercepting filter which executes before the new migrations are applied. This allowed us to plug the required functionality into DBDeploy; executing validations or generating a report from all the new migrations then became easy to add. Here is how you can specify an intercepting filter in DBDeploy now.


def dbDeploy = new com.dbdeploy.DbDeploy(driver: "${driver}",
        url: "${jdbcUrl}",
        userid: "${jdbcUsername}",
        password: "${jdbcPassword}",
        scriptdirectory: new File("${migrations}"),
        dbms: "ora",
        changeScriptFilter: changeScriptValidatingFilter,
        undoOutputfile: new File("${migrationsPath}/undo_last_change.sql"))


dbDeploy.go()


ChangeScriptFilter has a generic interface:
public interface ChangeScriptFilter {
    void process(List<ChangeScript> changeScripts);
}


You can then implement ChangeScriptFilter as part of your build. Gradle makes it really easy to write a Groovy script that implements this filter and wire it into the build. Adding validations is just one use of this filter.

public class ChangeScriptValidatingFilter implements ChangeScriptFilter {
    public void process(List<ChangeScript> scripts) {
        for (ChangeScript script : scripts) {
            applyValidations(script);
        }
    }

    // Apply each validation, e.g. the DMLAndDDLSeparationValidator shown below
    private void applyValidations(ChangeScript script) {
        new DMLAndDDLSeparationValidator().validate(script);
    }
}


A validation can be written something like this:


public class DMLAndDDLSeparationValidator implements ChangeScriptValidator {
    public boolean validate(com.dbdeploy.scripts.ChangeScript changeScript) {
        String content = changeScript.getContent().toLowerCase();
        if (hasDDL(content) && hasDML(content)) {
            throw new com.dbdeploy.exceptions.ChangeScriptValidationFailedException(changeScript
                    + " has create or alter table and insert/update. Please put DML and DDL in separate scripts");
        }
        return true;
    }

    private boolean hasDML(String content) {
        return content.contains("insert into ") || content.contains("update ");
    }

    private boolean hasDDL(String content) {
        return content.contains("create table ") || content.contains("alter table ");
    }
}
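The naming convention itself can be validated the same way. Here is a minimal sketch of a validator that checks the change number is a valid 14-digit yyyyMMddHHmmss timestamp; it assumes ChangeScript exposes its numeric change number via getId(), as in stock DBDeploy:
public class TimestampFormatValidator implements ChangeScriptValidator {
    public boolean validate(com.dbdeploy.scripts.ChangeScript changeScript) {
        String changeNumber = String.valueOf(changeScript.getId());
        // One digit more or less than yyyyMMddHHmmss and the execution order changes
        if (changeNumber.length() != 14) {
            throw new com.dbdeploy.exceptions.ChangeScriptValidationFailedException(changeScript
                    + " does not use a yyyyMMddHHmmss timestamp as its change number");
        }
        java.text.SimpleDateFormat format = new java.text.SimpleDateFormat("yyyyMMddHHmmss");
        format.setLenient(false);
        try {
            format.parse(changeNumber);
        } catch (java.text.ParseException e) {
            throw new com.dbdeploy.exceptions.ChangeScriptValidationFailedException(changeNumber
                    + " is not a valid yyyyMMddHHmmss timestamp");
        }
        return true;
    }
}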

    5. Have a dryRun task which DBAs can run before executing DBDeploy against the production database.
It helped to have a dryRun task which shows all the changes that will be applied when the migrations are run. DBDeploy already has a feature that lets you specify an ‘outputFile’; it then consolidates all the migrations to be applied to the given schema into that file.
It also helped to write some scripts that analyse the new migrations and generate a report showing which tables are affected and whether there are inserts, alters or updates on those tables.
Some of the tables are sensitive, and DBAs like to know if the current deployment might affect those tables in any way.
The dryRun task was also wired up with a custom intercepting filter to analyse all the newly executing scripts. Here is a snapshot of the script analyser.

class ChangeScriptAnalyzer implements ChangeScriptFilter {
    def inserts = [:]
    def updates = [:]
    def creates = [:]
    def alters = [:]

    void process(List<ChangeScript> scripts) {
        scripts.each { script ->
            def tableAnalyzer = new util.TableAnalyzer(sqls: script.content.readLines())

            putInMap(inserts, tableAnalyzer.getTablesWithInserts(), script)
            putInMap(updates, tableAnalyzer.getUpdatedTables(), script)
            putInMap(creates, tableAnalyzer.getNewlyCreatedTables(), script)
            putInMap(alters, tableAnalyzer.getAlteredTables(), script)
        }
    }

    // Records, for each affected table, which scripts touch it
    private void putInMap(map, tables, script) {
        tables.each { table ->
            map.get(table, []) << script
        }
    }
}
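The collected maps can then be turned into a simple report. The printSection helper below is a hypothetical sketch; the exact string printed for each script depends on how ChangeScript renders itself:
// Hypothetical sketch: prints one section of the report, e.g. "Data is inserted in following tables"
def printSection(String title, Map tablesToScripts) {
    println "${title} (${tablesToScripts.size()})"
    tablesToScripts.each { table, scripts ->
        println table
        scripts.each { script -> println "${script}" }
    }
}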


This could then be used to print reports like the following:


Data is inserted in following tables (38)
JOB_MASTER
20151130061549-JIRA-122072.sql
20151216233622-JIRA-120792.sql
20160112210928-JIRA-120792.sql

JOB_CATEGORY
20151130061549-JIRA-122072.sql
20151216233622-JIRA-120792.sql

CONFIGURATIONS_MASTER
20151209080100-JIRA-126407.sql
20151209165500-JIRA-125508.sql
20151213045000-JIRA-122529.sql

Data is updated in following tables (21)
CONFIGURATION_MASTER
20151209165500-JIRA-125508.sql

WEBSERVICE_CFG
20151209165500-JIRA-125508.sql
20160105202700-JIRA-60012.sql

Following tables are newly created (13)

CUSTOMER_ADDRESS
20151210070136-JIRA-127509.sql


Following tables are altered (6)


CUSTOMER
20151209091755-JIRA-124737.sql
20151210070136-JIRA-127509.sql


CONTENT_MASTER
20151209180600-JIRA-126977.sql
20151215183001-JIRA-127777.sql


This helped the DBAs identify whether any sensitive table is modified by the current deployment and take a closer look at the migration modifying that table.


6. Stored procedures are not migrations.
Do not put stored procedures in migrations. Stored procedures are compiled in the database, so all of them can simply be re-applied on every deployment. Adding a migration for every change to a stored procedure makes it extremely hard to track the version history of that procedure.
So manage stored procedures as database source code outside of the migrations. Apply all procedures to the database on every deployment, and have the deployment script fail if there are compilation errors in any stored procedure.
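A minimal Groovy sketch of such a deployment step might look like the following. The procedures directory, the connection properties and the use of Oracle's user_errors view are assumptions (our database was Oracle), and each file is assumed to contain a single CREATE OR REPLACE statement without SQL*Plus terminators:
import groovy.sql.Sql

// Re-apply every stored procedure source file, then fail the build if Oracle reports compilation errors
def sql = Sql.newInstance("${jdbcUrl}", "${jdbcUsername}", "${jdbcPassword}", "${driver}")
new File("../../db/procedures").listFiles().findAll { it.name.endsWith(".sql") }.sort().each { file ->
    sql.execute(file.text)   // e.g. CREATE OR REPLACE PROCEDURE ...
}
def errors = sql.rows("SELECT name, line, text FROM user_errors WHERE type IN ('PROCEDURE', 'FUNCTION', 'PACKAGE BODY')")
if (!errors.isEmpty()) {
    errors.each { println "${it.name} line ${it.line}: ${it.text}" }
    throw new GradleException("Stored procedure compilation failed")
}
sql.close()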


7. Get a consolidated list of all exceptions at the end.
This one is controversial, but it greatly helped us in the initial days.
DBDeploy fails on the first SQL exception it gets while executing migrations. When you have globally distributed teams managing their own databases, especially with developers not used to automation and DBDeploy, they tend to execute migration scripts manually to fix issues.
This then creates problems for deployments, because the DBDeploy run fails with errors like 'unique constraint violated' or 'table already exists'.
Imagine a deployment executing 100 migrations where 10 of them fail. If DBDeploy stops at each failure, it is extremely painful to fix each issue and rerun DBDeploy.
Instead, if you get all the errors (of the 'table already exists' kind) at the end of the run, it is a lot easier to go through them at once and then either make manual entries in the changelog table or reach out to the developers who wrote those migrations to understand why they are failing.
This is not something you will need in a typical project, and it might not even be the right thing to do. But it helped us a lot. To collect all errors for specific error codes at the end, we could specify it as follows:

def dbDeploy = new com.dbdeploy.DbDeploy(driver: "${driver}",
        url: "${jdbcUrl}",
        userid: "${jdbcUsername}",
        password: "${jdbcPassword}",
        scriptdirectory: new File("${migrationPath}"),
        dbms: "ora",
        changeScriptFilter: validatorProvider,
        exceptionsToContinueExecutionOn: "ORA-00001,ORA-00955,ORA-01430,ORA-00957",
        undoOutputfile: new File("${migrationsPath}/undo_last_change.sql"))


dbDeploy.go()


This helped a lot in the initial phase of the rollout, when people were still running scripts manually: DBDeploy would report all the 'unique constraint' or 'table already exists' kinds of errors together at the end, instead of breaking on each one.

The code with all the changes mentioned above is available at https://github.com/unmeshjoshi/dbdeploy

Monday, 31 March 2014

Microservices and Agility

'Microservices' is a new buzzword our industry has found. The idea is that a system is built as a set of independently managed services, and it is catching on as people see more and more problems with monolithic applications.


  • A bit of history..
Most enterprise applications today are developed in Java on the JVM. Around early 2000, when J2EE / Java was still gaining popularity, distributed objects were the recommended way of developing enterprise applications. Books like 'J2EE Design Patterns' pushed for that approach, and people claimed that distributed objects were necessary for scaling.
In practice, though, people were struggling with the problems distributed objects created. They were considered a 'developer's nightmare'.
Rod Johnson, in his book 'Expert One-on-One J2EE Development without EJB', argued that POJO components with dependency injection were a much better and more lightweight way to develop applications, and that monoliths are easier to manage and can also be scaled well.
Monolithic web applications using Spring or similar DI frameworks became the way most web applications were developed.

  • SOA

Then came SOA, Service Oriented Architecture. The idea was to break applications up along different business capabilities and expose those as services, so that they could be consumed by outside clients. Unfortunately, SOA quickly became tied to SOAP and Enterprise Service Buses.
In the last four to five years, though, REST over HTTP has clearly been established as the de facto standard for exposing services. This is the preferred 'lightweight' approach.


  • Partitioning
Exposing services along the lines of business capabilities is just one aspect of partitioning the system. By 'system' I mean the software that enables an organization to run its business. There are complex relationships between software systems and an organization: the way the organization is partitioned impacts the way its software systems are partitioned, and vice versa. Almost always, these decisions cannot be taken by just business people or just software people. The whole point of Agile was to allow business people and software people to work together for exactly that reason, because the quality of this relationship determines the quality of the software architecture. By 'architecture' I mean how the system is partitioned into modules or components, which should closely mirror the partitioning of the domain.



  • Forces for partitioning the system (organization and architecture)
Following are the most common reasons for partitioning the system.
  1. Different rates of change in parts of the system :- Parts of the system change at different rates. For example, on a flight booking website, insurance rules might change more frequently than, say, search options. In such a scenario it might be worth splitting the system so that the insurance module can be changed and deployed independently.
  2. Autonomy of teams working on different features :- Sometimes it is worth splitting the system so that different teams can work on different parts as independently as possible. This is one of the most common reasons for splitting monoliths. (It is driven by Conway's Law, which states that the structure of a system reflects the way the teams building it are structured.)
  3. Independent domains :- Complex systems almost always span several independent business areas, which are reflected as different domains in the software system. Eric Evans calls these 'bounded contexts' in his DDD book. It is almost always beneficial to split the system along domain boundaries and have different modules (and teams) take care of these domains.
  4. Non-functional aspects :- Different parts of the system can have different non-functional requirements. Scaling, availability, performance and security are some of the most important reasons for which a system needs to be split into parts. Monoliths often suffer from the problem that the whole monolith must be scaled even when only one part needs scaling. For example, on a flight booking website, flight search is generally very expensive in terms of resources, so it is well worth splitting the application so that the search components can be managed and scaled independently. It is also better if these components map naturally to teams which can manage their life cycle independently. Key-management parts of an API might need more secure storage, monitoring and network infrastructure than the rest of the application; in that case, pulling key management out as a separate service makes a lot of sense.
These four are the most common reasons to partition the system and the organization. As we can see, the forces aren't just technical; they are largely organizational. This is why Conway's Law is mentioned in almost every write-up on microservices.
Of course, Conway's Law alone is not enough to guide how an organization should be structured and how the code should map to teams.
Jim Coplien's great book 'Organizational Patterns of Agile Software Development' has a pattern language called the 'People and Code Pattern Language'. It lists a set of patterns necessary for successfully partitioning the system: Architect Controls the Product, Architect Also Implements, Architecture Team and Stand Up Meeting, to name just a few.


  • Microservices or MicroApplications? 

As we can clearly see, the broader problem at stake here is splitting or partitioning the software system and the organization. One of the important parts of these systems is the UI, web or otherwise. If we focus literally on 'services', we can very easily fall into the trap of having a 'monolithic' web UI sitting on top of granular back-end services. Considering 'verticals' that include the UI is essential when partitioning systems. The term 'microservices' is probably misleading here; it is really about 'micro applications'.


  • It's about being Agile
So it is all about individuals (business people, architects and developers) and how they decide to implement software and build the teams which develop that software. That is what being Agile is all about.
Mapping software artifacts to teams and people, and structuring them so that teams can work as effectively as possible, is a very important aspect of software development. This is commonly known through Conway's Law.
Conway's Law was at the heart of what Agile was about. It always has been.
So when we say we need a 'microservices architecture', we are really asking 'how can we be more Agile by keeping people and software in sync with business goals?'
Unfortunately, Agile is thought of in a very narrow sense nowadays; it is generally limited to CI, CD, TDD, iterations and so on.
Most of what we are talking about with 'microservices' is really about 'Agile'. The meaning that was lost over the years is being rediscovered under a different name.

Saturday, 6 October 2012

A Case for Use Cases for Agile Software Development

Most Agile methods today use a 'product backlog' to track requirements. A product backlog is an ordered list of deliverables, and the most widely accepted form for these deliverables is the user story.
What is a 'user story'? Mike Cohn defines it as 'a simple description of a feature told from the perspective of the person who desires the new capability'.

An agile team is supposed to complete one or more user stories in an iteration.
That puts another constraint on user stories: they should be small enough to be estimated and completed by a team within two to four weeks. So a user story is not just a 'feature that an end user wants', but also a 'unit of work' which can be estimated and used for planning.

There is a tension between these two aspects, and Agile teams often fall into a trap when one is given more importance than the other. If 'user value' is always demanded, the work items can become too big, with the danger of spanning multiple iterations. If small work items are created directly, there is a danger of losing the big picture or 'context' [2]. Experienced Agile teams generally create smaller stories and use techniques like 'story maps' to fill in the context. The larger stories which the smaller 'work item' stories map to are called 'epics'.

There is another aspect of a user story: it is supposed to be a reminder to have a conversation, not just a documented requirement. The understanding of what needs to be built depends on the quality of the conversation developers have with business people. But how do you make sure that the discussions which actually produce these stories are guided correctly?
'Agile' requirements gathering needs a framework that allows business people and developers to talk. For guiding these conversations, the best approach is to discuss scenarios of the end user interacting with the system.
And that is exactly what a 'use case' is! Every use case exists because it helps some actor achieve a goal. We start with high-level user interactions with the system. For every action the system performs, we dig deeper and try to figure out why this step exists, who wants it (who is the actor) and why (what is the goal). After we break a high-level flow down into more granular goals, we have a set of use cases.
We first concentrate on the happy path of the use case. A use case happy path is no more than three to seven steps; generally, the fewer steps, the better.

Let's take an example use case for a flight booking system. At a very high level, the following is a summary of the user actions. This is called a 'summary level' use case.

User goal - Book a flight.
  1. User searches for a flight specifying source, destination airports and travel dates.
  2. Travel website displays flights matching user criteria
  3. Travel website accepts traveler information and preferences to build Itinerary
  4. Travel website stores traveler preferences and presents payment options
  5. User enters payment details
  6. Travel website processes payment and books flight
From this summary, we can come up with more detailed use cases.

Let's detail the 'Search Flights' use case.

User goal - User wants to get all possible flight options for travel
  1. User visits travel site.
  2. Travel website presents fields to enter departure, arrival airports, date of travel and travel time.
  3. User enters required information.
  4. Travel website communicates with GDS and presents list of available flights.
Now, how do we come up with product backlog items (or user stories)?
The main driving factor for product backlog items is that they should be small enough to be completed in an iteration (two weeks). Items like these help with release planning and also provide enough guidance for developers to work from.

A happy path scenario of a use case makes a good backlog item (read: user story). It is extremely important to really think about the minimal happy path scenario; 'individuals and interactions' help here.
A minimal happy path scenario makes a perfect backlog item.

Once we have happy paths, use cases naturally help us think about deviations.
Each deviation can then become a good product backlog item (user story).

So the following is the first-cut backlog we come up with:
  1. Search Flights
  2. Store traveler information
  3. Process payment
  4. Book flight with GDS.

Brainstorming deviations.

Once we have the basic happy path scenarios, we think about deviations for each use case.


User goal - User wants to get all possible flight options for travel

Main Success Scenario:
  1. User visits travel site.
  2. Travel website presents fields to enter departure, arrival airports, date of travel and travel time.
  3. User enters required information.
  4. Travel website communicates with GDS and presents list of available flights.
Deviations:

    3a1 User enters wrong origin or destination
    3a2 Travel website presents possible airport code options.
     
    4a1 No flights are available matching criteria
    4a2  Travel website presents appropriate message   

These deviations are a source of additional backlog items.

A curious case where the happy path scenario is too big a story for an iteration

To start with, we had a backlog item called 'Search Flights'. Discussing the happy path with the developers, we think that in one iteration we can do the work to display the required fields and find airport codes for the airports the user has entered. Finding these airport codes requires us to integrate with a system which has all the airport data.
So we break down our first backlog item ‘Search Flights’ into two backlog items as follows.
  1. Accept flight search criteria and build search request. [Use case step 2 and part of step 4, plus deviation 4a]
  2. Search flights with Sabre GDS. [Rest of use case step 4]
  3. Store traveler information
  4. Process payment
  5. Book flight with GDS.
  6. Provide airport options to user.
We included deviation 4a (the no-flights-found case) in the first item, as developers thought it would help them build a vertical slice through the system.

From the above example we see that use cases provide a very good thinking and brainstorming framework for discussions between business people and developers. 'Agile' teams should not ignore the wealth of wisdom already available in the use case community.

References:

Alistair Cockburn has a very nice presentation on how use cases can be used in Agile projects at http://alistair.cockburn.us/Agile+Use+Cases

Jim Coplien's book 'Lean Architecture' has a very good explanation of how use cases help in Agile and Lean development.

Martin Fowler has a nice paragraph discussing the relationship between use cases and user stories at
http://martinfowler.com/bliki/UseCasesAndStories.html

The books 'Patterns for Effective Use Cases' and 'Writing Effective Use Cases' should be must-reads for Agile BAs, along with Mike Cohn's 'User Stories Applied'.