Blockchain Now: Summary and Current Uses

What if payments could be sent instantly, at near zero cost, with an unchangeable record visible to anyone with permission?

Can patient health care records be securely and immediately shared among health care providers and insurers, and shared anonymously with researchers?

How can securities be traded with near real time settlement, with all positions visible to regulators and market participants?

Each example is live today using blockchain. Ripple, an open source payments protocol, has reached agreements with US regulators to transfer funds. In Estonia, health care records and data for many other public services are shared on blockchain. In capital markets, NASDAQ’s Linq market enables transparency for shareholders of private companies.

Blockchain, also known as distributed ledger, has moved remarkably fast from concept to adoption by sophisticated companies. Blockchain began as the foundation for bitcoin. Bitcoin and ethereum, a related cryptocurrency, are gradually moving past early scandals such as the bankruptcy of the Mt. Gox coin exchange and illicit commerce on the Silk Road market. Blockchain has evolved separately, and may generate value on a scale similar to innovations such as open source software or smart phones.

Blockchain can be viewed as a new way to store information: an unchanging, transparent, and immediately updated database replicated identically across many independent nodes. It provides a virtual notary and timestamp for data. Fundamentally, blockchain enables trust between parties who do not know each other.

Blockchain works by first grouping transactions together to propose a new block. Each transaction goes through a mathematical formula that creates a unique value called a hash. The hashes are combined with data about the prior block to form a cryptographic problem. The problem is difficult to solve but easy to confirm. All the computing nodes on the blockchain cycle through guesses at an answer, which aids security by making it theoretically impossible to know which node will solve the problem. The solution becomes the identifier for the new block. Each node then confirms the solution and uses the answer to sign the new block.

Any attempt at changing data in a blockchain is immediately evident. The number used to sign a new block is based on the number from the prior block, which in turn was derived from the preceding block. Any changes to prior data will break the chain and be readily apparent.
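The hash-chaining and guess-and-confirm cycle described above can be sketched in a few lines of Python. This is a simplified illustration, not a production consensus protocol; the transaction strings and the difficulty rule are invented for the example.

```python
import hashlib
import json

def hash_block(transactions, prev_hash, nonce):
    """Combine transactions, the prior block's hash, and a guess (nonce) into one hash."""
    payload = json.dumps({"tx": transactions, "prev": prev_hash, "nonce": nonce},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def mine_block(transactions, prev_hash, difficulty=2):
    """Cycle through nonce guesses until the hash meets the difficulty target."""
    nonce = 0
    while True:
        h = hash_block(transactions, prev_hash, nonce)
        if h.startswith("0" * difficulty):  # hard to find, easy to confirm
            return h, nonce
        nonce += 1

# Build a two-block chain; each block's identifier depends on the prior block's.
h1, n1 = mine_block(["alice->bob:5"], prev_hash="0" * 64)
h2, n2 = mine_block(["bob->carol:2"], prev_hash=h1)

# Confirmation is cheap, and any change to prior data is immediately evident.
assert hash_block(["alice->bob:5"], "0" * 64, n1) == h1
assert hash_block(["alice->bob:9"], "0" * 64, n1) != h1
```

Because block two's hash was derived from block one's, tampering with block one would invalidate every block after it, which is the property the paragraph above describes.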

Business value for blockchain centers on providing a real time, shared view of data that cannot be changed. Blockchain works best with a network of users. Current applications within a single business include:

  • IBM applies blockchain to irrefutably record data from thousands of vendors, which accelerates accounts payable and reduces its risk.
  • BHP, the mining giant, uses blockchain to record rock and fluid samples from its partners.

Business to consumer applications currently focus on lowering cost and time to validate information, and preventing fraud. These are in earlier stages of development as of early 2017. Examples:

  • Blockcerts provides an open source platform for employers to rapidly confirm job applicants’ academic credentials.
  • Everledger stores data about diamonds to limit fraud and illegal payments.

Business to business applications center on trusted and low cost transactions:

  • Global payments, such as Visa’s recent preview of their B2B Connect service.
  • Trade finance is starting production use. Barclays recently began using distributed ledger in global trade, which removed paper documents and enabled immediate verification. Maersk, the world’s leading container shipper, indicates that supporting business processes often cost more than moving the freight itself. Maersk is piloting blockchain.

Life sciences supply chain integrity is a natural fit, such as certifying chain of custody for medical devices and preventing drug counterfeiting.

Capital markets show strong progress. Distributed ledgers can reduce costs and operational risk for post-trade clearing and settlement, while also providing immediate transparency to regulators and market participants. Banks have formed multiple consortiums. For example, Hyperledger, sponsored by the Linux Foundation, seeks to standardize blockchain technology. Key contributors include Digital Asset Holdings, led by JPMorgan Chase’s former CFO; IBM; and Intel. The largest banks are creating their own coins and blockchains, including Goldman’s SETLCoin for post-trade settlement, while Citi is testing a possible “Citicoin” offering.

Distributed ledgers may ultimately produce the most value in healthcare and the public sector. In the US, a fragmented medical value chain complicates sharing data while inhibiting accountability for results. Many analysts expect blockchain to contribute to solutions. Blockchain also enables anonymously providing the large data sets necessary for precision medicine, such as tracking how environmental exposures affect genetic expression.

Public sector applications include voting (Estonia), recording title to land (Cook County in Illinois [Chicago], Georgia, Ghana, Sweden), and validating public records (Kenya, including academic credentials). Some countries, like Senegal, are considering a blockchain-based currency.

Adoption risks include rapidly evolving technology, and cautious regulators. US financial regulators recently expressed concern. The B2B marketplaces of the dot com era may offer an analogy: hundreds of startups ultimately produced only two enduring winners in Alibaba and eBay. Also, while researchers generally respect blockchain security, nation-state actors and their hacker proxies can find ingenious means to penetrate systems.

Looking forward, blockchain, and a related category called distributed applications (DApps), may evolve into open source business processes that are not controlled by any central company. Consider the market for car sharing: currently, companies like Turo or Zipcar serve as hubs showing availability, certifying quality, and enabling insurance and payment. An open source business process could disintermediate these functions by aggregating car availability, attesting to quality through user reviews (like rating Uber or Lyft drivers), and enabling payment. Insurance may need to adjust, but ultimately regulation follows real value creation.

Blockchain adoption has moved rapidly because of clear business value. Currently, uses center on certifying information about people and the provenance of valuable items, as well as simplifying supply chain communication within a single company. In the next few years, companies now performing pilots in groups and industry consortiums will shift toward using blockchain as the primary way to perform certain business processes. Most activity centers on financial services, and post-trade clearing and settlement in particular.

Further out, lessons learned in smaller countries and local government will engender broader adoption for public records at the national level. Ultimately, blockchain’s ability to enable trust across people and parties will lead to open source business processes that disintermediate many current business models without any central, for-profit company.

Blockchain’s business value is simple: transparency and trust. Blockchain and its successors are evolving to bring the same widespread interoperability we expect from email, cell phones, or web sites to performing transactions and sharing information between companies.

A Simple Guide to Machine Learning

Senior business leaders increasingly recognize data as a core asset and analytics as a necessary competency. Conversations with business leaders show a need for clarifying terms such as predictive analytics, machine learning, and deep learning.

Business analytics begin with static reports describing past events. Business intelligence explains trends and causes for previous results, often with interactive and visual drilldowns. Business intelligence remains the core of most analytics, and much value remains to be extracted at most companies.

Business intelligence pre-calculates how facts, such as sales, relate to dimensions, such as product or region. BI answers complex questions by creating “cubes” that organize, calculate, and aggregate facts across dimensions. Recent advances center on relatively easy-to-learn visual dashboards, like Tableau or Power BI, that flexibly answer business questions by holding facts in memory rather than retrieving data from slow-moving hard disk drives.
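The fact-and-dimension aggregation behind a cube can be illustrated in plain Python. The sales rows, products, and regions below are invented for the example; real BI engines do the same pre-aggregation at far larger scale.

```python
from collections import defaultdict

# Facts: (product, region, sales). Product and region are the dimensions.
facts = [
    ("widget", "EMEA", 100),
    ("widget", "APAC", 150),
    ("gadget", "EMEA", 200),
    ("gadget", "APAC", 50),
]

# Pre-aggregate every combination of dimensions, as a cube does.
cube = defaultdict(float)
for product, region, sales in facts:
    cube[(product, region)] += sales   # cell level
    cube[(product, "ALL")] += sales    # roll-up across regions
    cube[("ALL", region)] += sales     # roll-up across products
    cube[("ALL", "ALL")] += sales      # grand total

# Answering a business question is now a lookup, not a scan of raw data.
assert cube[("widget", "ALL")] == 250   # widget sales across all regions
assert cube[("ALL", "EMEA")] == 300     # all products sold in EMEA
```

Because the roll-ups are computed ahead of time, interactive drilldowns stay fast even as the underlying fact table grows.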

Trusted data from business intelligence often forms the basis for advanced analytics. In highly regulated industries, such as financial services or pharmaceuticals, requirements to share comprehensive and quality data with regulators mean compliance applications are also a strong foundation for analytics.

Advanced analytics can be viewed as a progression:

·       Predictive analytics, which often refers to static models.

·       Machine learning, which continually learns from data without programming.

·       Deep learning, a subset of machine learning simulating how our brains function.

·       Artificial intelligence, a broadly used term often meaning a business application shows consumers some form of person-like intelligence.

Predictive analytics generally refers to models that implement core data science algorithms which stay constant once programmed. Common predictive models include cross sell, “market basket” models anticipating products typically bought together, and customer churn prediction. Predictive models are often most effective when transparently embedded within business processes, such as scoring leads before they are sent to salespeople. Three main categories include:

·       Classification, assigning categories, such as customers likely to churn.

·       Clustering, finding patterns within data like customer segmentation.

·       Regression, predicting a value such as a sales forecast.
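The three categories can be illustrated with toy examples in plain Python. All data, thresholds, and segment names below are invented for illustration; real models would be trained on historical data.

```python
# Regression: predict a value, e.g. a simple one-variable sales forecast.
xs, ys = [1, 2, 3, 4], [2.1, 3.9, 6.2, 8.1]  # e.g. ad spend vs. sales
mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
forecast = mean_y + slope * (5 - mean_x)  # predicted value at x = 5

# Classification: assign a category, e.g. flag customers likely to churn.
def churn_risk(days_since_last_login):
    return "likely to churn" if days_since_last_login > 30 else "retained"

# Clustering: find groups, e.g. segment customers by annual spend.
spend = [100, 120, 110, 900, 950, 880]
centers = [min(spend), max(spend)]  # two seed centers
segments = [min(range(2), key=lambda k: abs(s - centers[k])) for s in spend]

assert churn_risk(45) == "likely to churn"
assert segments == [0, 0, 0, 1, 1, 1]  # low spenders vs. high spenders
```

Each toy mirrors its category: regression returns a number, classification returns a label, and clustering discovers groups that were never labeled in the data.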

Sourcing data and managing model lifecycles varies considerably by industry. In capital markets, obtaining standardized market data is simple (though more involved when synchronizing low latency data), concepts like cyclicality or seasonality are rarely observed, and models begin to lose predictive power rapidly. By contrast, consumer propensity models often involve extensive cleansing and linkage of disparate data sets, seasonality of behaviors is very strong around holidays, and models tend to remain effective longer.

Machine learning models are trained by sending data through an algorithm instead of programming, though programming is typically required to assemble the algorithm. Machine learning often uses similar math as predictive models but continually evolves with new data. Business value includes making better decisions by incorporating subtle nuances in data that may be less apparent using the more linear relationships identified in most predictive analytics models.

Machine learning algorithms are often classified by supervised and unsupervised learning. Supervised learning trains models with data which has a known result, often tagged by humans. For example, a model predicting offers to show on a web site can use data that includes whether consumers made a purchase. Unsupervised models use data without any prior designation of results. They explore relationships and can identify new answers and findings as data streams evolve.

Reinforcement learning is a subset of machine learning that has received less publicity, but which holds strong potential for business applications. Reinforcement learning predicts how a set of independent agents each seek to maximize their return in a given context. While people and market participants may not always act rationally, reinforcement learning predicts actions by viewing how and why we achieve goals rather than inferring likely actions based on past behaviors and attributes.

Deep learning is a sub-set of machine learning that simulates a brain’s structure. Inputs are passed through many layers (“deep”) of calculation cells that act like neurons by amplifying or reducing data before passing it on to one or more other cells in the next layer. Deep learning models excel at perceiving unstructured data such as video, voice, natural language, and handwriting.
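The layered structure described above can be sketched in plain Python. This is an untrained toy network with random weights, purely to show how cells amplify or zero out data before passing it to the next layer; no real learning happens here.

```python
import random

random.seed(0)

def layer(inputs, weights, biases):
    """One layer of 'neurons': weighted sums, then ReLU amplifies or zeroes data."""
    return [max(0.0, sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def forward(x, layers):
    """Pass the input through many layers ('deep'), each feeding the next."""
    for weights, biases in layers:
        x = layer(x, weights, biases)
    return x

def rand_layer(n_in, n_out):
    """A layer with random, untrained weights (illustration only)."""
    return ([[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)

# Tiny network: 3 inputs -> 4 hidden cells -> 2 outputs.
net = [rand_layer(3, 4), rand_layer(4, 2)]
out = forward([0.5, -0.2, 0.8], net)
assert len(out) == 2 and all(v >= 0.0 for v in out)  # ReLU outputs are non-negative
```

Training would adjust the weights so the outputs become useful predictions; the "deep" in deep learning simply means many such layers stacked between input and output.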

A key business value of deep learning is more fully understanding customers’ actions and risks such as improving merchandising by seeing how customers move about stores, sensing sentiment in call center conversations, or gleaning product feedback from social media.

However, deep learning models may not show why they reached a conclusion. Opaque decisions may not fit well in applications like credit decisions where businesses need to document decisions and demonstrate financial inclusion to regulators.

Ensemble models combine many individual models and categories of models. Many powerful business applications of machine learning use ensembles. IBM Watson is an example.

Artificial intelligence generally refers to deep learning. Academically, the term often evokes the concept of general intelligence: a synthetic intellect with broad cognitive abilities including interpersonal understanding and real world context. In practice, AI often refers to sophisticated forms of deep learning that present a human-like interface.

Five Fast Steps Toward Cybersecurity

Cybersecurity defense can bring to mind a brilliant engineer pounding away at a keyboard to heroically outwit an attacker, or expensive investments in specialized technologies. The reality is different. Preventing most attacks does not require cybersecurity investments. Consistent processes to keep employees alert, control access to applications, back up data, and keep software updated will limit most vulnerabilities for most businesses.

Businesses should take a fact-based approach to maximize cybersecurity ROI. Few will face advanced threats; most should simply take prudent steps to make an attack difficult and unlikely to pay off. Top threats for most organizations include ransomware, loss of customer and pricing information, and unauthorized payments. Businesses that rely upon web-based applications should also guard against denial of service attacks that degrade applications by sending a flood of network traffic.

Ransomware is more prevalent than many business leaders may suppose. I’ve recently met with a law firm and a pharma company that were deeply shaken when data was ransomed, but will understandably keep the incidents private. Symantec’s 2016 Internet Security Threat Report found a surge in ransomware over the last year including new attacks on mobile devices.

Even where sophisticated attackers have breached organizations with highly sensitive data, several incidents show that diligent employees and common sense discipline limit vulnerability. For example, when state-sponsored hackers from China stole data on 80 million people from Anthem, they used social engineering to trick their way into accessing five accounts, then extracted information from databases that lacked any encryption.

Similarly, the breach of the US government’s Office of Personnel Management started with social engineering to trick an employee at a subcontractor into installing a backdoor. The attackers then used the contractor’s trusted credentials to access the OPM network. They found systems far behind on patches to update software, and no basic controls for sensitive data like two-factor authentication.

Steps with the highest ROI to mitigate the most likely threats include:

  1. Train employees to be aware of broad phishing and targeted spear-phishing email.
  2. Patch both servers and desktops soon after software vendors release updates.
  3. Regularly review application permissions. Configure CRM systems to match authorization to real world needs.
  4. Back up data remotely off the corporate network where it is not vulnerable to ransomware.
  5. Write quality code. Tight code is secure code. Use code reviews and automated scans. Separate development from test, ensure QA environments nearly match production and use real world test data, and make rigorous test case management a leadership priority.

Regarding application permissions, don’t accept default settings. For example, a departing Morgan Stanley employee took data on 350,000 wealth management accounts because no access limits were in place. On Salesforce, confirm the “organization-wide sharing default”. For Dynamics, thoughtfully define roles, teams, and permissions. For custom applications, focus on impersonation tools where client service operators take on the identities of customers. For all applications, impose quarterly reviews where all managers explicitly confirm their direct reports’ permissions.

Businesses offering web-based applications also need to be aware of DDoS, or distributed denial of service, attacks, where a perpetrator floods targeted systems with network traffic. In particular, businesses providing retail consumers with web-based services need to defend against attacks by unhappy customers.

Securing against a denial of service attack requires technology-centered defense beyond vigilant employees, consistent process, and thoughtful controls. The most immediate defense is to host applications in the cloud to leverage the scale economies enjoyed by an Amazon, IBM, or Microsoft.

In contexts requiring internalized hosting, such as for low latency applications, businesses should use DDoS mitigation services such as Arbor Networks, Neustar, or Verizon, perhaps in addition to a managed security service. Even when using a DDoS mitigation service and managed security service, IT teams should train to recognize an attack, rehearse, and invest time to build relationships with and understand their security partners.

Businesses depending upon intellectual property or confidential information should go well beyond these basics. Industries like semiconductor design, capital markets trading, or health insurance need more advanced defenses requiring significant investment. However, many industries do not.

Is cybersecurity important? Of course, but the most effective steps to secure most businesses are not specifically about cybersecurity. Alert employees, updated software, precise and updated access controls, and remotely backing up data are the core steps toward reliable operations.

Scaling Legacy Applications

We recently met with several prospective clients that need to scale existing applications. This article summarizes key elements of organizing for and specific technical approaches to achieving scalability.

Our team generally comes from capital markets trading backgrounds. For many securities, achieving the best price depends on low latency to capture bids and offers available for only milliseconds. We are finding that many design patterns used in high volume, low latency transactional applications also apply to high volume analytics platforms. Techniques originating in capital markets can effectively scale many applications.

Organizing the Project

First, confirm facts. Break out logical architecture showing types of components; physical architecture showing which elements share the same process, core, CPU, or datacenter; and datacenter architecture with servers, storage, and networks. Compose a calendar of key operational events such as loading data or compiling database statistics. Ensure people get hands-on to validate actual facts. Experience shows that actual architecture often differs from what people expect, particularly in M&A and post-merger integration contexts.

Get hard facts on real users’ experience. Define specifically, with product managers and perhaps directly with customers, the priorities for which parts of the user experience to improve. Capture performance across time of day. For example, we’ve accelerated several products for Asian users to perform fast while UK and US securities markets open and close. The customers’ experience differed considerably from that of developers in Chicago measuring response times in a test environment. Importantly, set a baseline that shows performance at the start of the project.

Set audaciously achievable goals. Use an easy to remember guide, such as “all web pages will return in under 2 seconds”. Assuming some risk on setting expectations is reasonable, because most applications can be readily accelerated. Tell the team that performance reviews will include achieving the goal. Track and communicate results every day. At Citigroup Global Equities, for example, we sent latency numbers for market data to business users in a short email that showed the key number in the subject line, while including a graph for trends in the body; users could click through to a simple web page that visualized performance over time using an OLAP cube.

Try to define a time budget that lays out where an application is spending time. For example, break out average times to render in a browser, communicate to web servers, generate web pages, execute logic, and perform database queries. Individual transactions may differ, but time budgets can help a team focus their efforts.
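A time budget can be captured with simple instrumentation. This sketch times hypothetical stages of serving one request (the stage names are invented, and `time.sleep` stands in for real work) and reports each stage's share of the total.

```python
import time
from contextlib import contextmanager

budget = {}  # milliseconds spent per stage

@contextmanager
def stage(name):
    """Record wall-clock time for one stage of the request path."""
    start = time.perf_counter()
    try:
        yield
    finally:
        budget[name] = budget.get(name, 0.0) + (time.perf_counter() - start) * 1000

# Hypothetical stages of serving one page.
with stage("business logic"):
    time.sleep(0.01)
with stage("database query"):
    time.sleep(0.03)

# Report where the application spends its time, largest stage first.
total = sum(budget.values())
for name, ms in sorted(budget.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {ms:.1f} ms ({ms / total:.0%} of total)")
```

Even rough numbers like these let a team agree on where to focus: in this invented example, the database query dominates the budget.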

Finally, assess if a team has necessary skills. Focus on database development. Databases are a primary reason for slow performance. Organizations can have database administrators who know operations, and application developers who code business logic. However, database performance crosses both areas: optimizing a query requires understanding both the business intent and the database design. For applications where scalability directly drives business results, confirm that the team is strong in JVM tuning, Linux kernel parameter tuning, and, if appropriate, multi-threaded design patterns and testing.

Identify and prioritize a set of actions. A good first step is simply to ask engineers. Even where they cannot articulate why, people often have an intuitive sense of where to focus. Separate quick wins from more fundamental changes. Start quick wins immediately after capturing performance baselines, and even while confirming current state facts and overall project plans. One consistent quick win is to update software components, such as ODBC or JDBC drivers. These still need rigorous testing (Oracle tends to be riskier than Microsoft, particularly when applying multiple bulk patches to Exadata) but should be among the first changes to production.

Front End Responsiveness

Focus engineers on users’ perceptions rather than engineering concepts. For example, consider page rendering times within users’ browsers and network time to your web servers. Think through how a user uses an application, e.g. can a drop down box be populated by a smaller “Top N” query rather than returning all values? Can widgets be placed “under the fold” meaning that data can be populated while a user scrolls down? Pay particular attention to off the shelf grid controls. Experience shows that grid controls from even trusted names may need to be rewritten to be responsive with large data sets.

Web sites often respond well to pre-generating and caching expected high volume web pages. For example, with an investment management application at Thomson Reuters, we dedicated several web servers to continually pre-generate high volume web pages and cache the content on customer-facing web servers. The application identified companies with new earnings forecasts or newly published research, and pre-built the content for these pages immediately.

Check session management when working with large numbers of web servers. Web farms can slow down when coordinating how to associate specific users and data with a given web server. Consider using a central database or other technique that may be faster than the default option.

Identify third party dependencies on web sites, like payment systems, early and start action immediately. These will require more time to accelerate.

Dashboarding packages like Tableau and Qlik often work by querying a proprietary in-memory structure built from underlying data. In many cases, this is the best solution. However, software companies have invested many decades optimizing relational database performance. Test having dashboards use a pass through mode that leverages the database.

Finally, consider the overall topology of your application. Platforms with big, relatively static content like video should be distributed globally closer to users. Consider using a content delivery network. Applications with many volatile items, like transaction processing, are often fastest with big network connections to a large central infrastructure.

Business Logic

Experience leading technology for over twenty years in performance-sensitive industries can be distilled to two axioms. First, use the fewest types and instances of technologies. This means achieving simplicity, scale economies, and depth of expertise by limiting the variety and number of open source frameworks, development languages, operating systems, servers, etc. This rule is not an absolute as complex, global organizations will present many people and financial dynamics, but leaders should continually encourage simplicity.

Second, for scalability, limit the number of “hops” in a business process. A hop can be defined as moving across process boundaries, a data transformation, or a network trip. For example, at a large investment bank, we set a goal of at least 1000 events per second with single digit millisecond latency end-to-end across key business process flows. A key part of achieving this goal involved measuring the number of hops, which often exceeded twenty initially, and reducing them to twelve or fewer.

In most industries, where applications should be fast but scalability does not directly affect business results, optimizing business logic will have lower return on effort invested, or at least rapidly declining marginal returns. Start by using profiling tools to identify bottlenecks in system performance. Check for efficient use of memory, including keeping data in CPU cache for performance-sensitive quantitative applications, and watch for chatty applications that make many calls to a data source.
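Profiling to find bottlenecks can be done with Python's built-in `cProfile`. The "chatty" lookup function below is an invented example of the anti-pattern mentioned above (many small calls to a data source); the profiler's report makes the hot spot visible.

```python
import cProfile
import io
import pstats

def chatty_lookup(keys, source):
    """Anti-pattern for illustration: one call to the data source per key."""
    return [source.get(k) for k in keys]

data = {i: i * i for i in range(1000)}

profiler = cProfile.Profile()
profiler.enable()
chatty_lookup(list(range(1000)), data)
profiler.disable()

# Rank functions by cumulative time to see where the application spends time.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
assert "chatty_lookup" in report  # the hot spot appears in the ranking
```

The same workflow, profile a realistic workload, sort by cumulative time, fix the top entries, applies whether the bottleneck is chatty data access, memory churn, or an inefficient algorithm.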

Static code quality scanning tools can be effective. Leaders should be sensitive to the people aspects of introducing these tools, as less strong developers can feel threatened while senior engineers are often competitive. Tight code is also more secure code, and many static analysis tools also check for cybersecurity vulnerabilities.

When scaling cloud components, watch costs carefully. Several times, we’ve found non-linear spikes in costs when scaling to large virtual machine instances. If possible, build and validate a cost prediction model in a spreadsheet.

Pay attention to multithreading. Performance gains often come with greater risk of introducing subtle bugs (race conditions), require more QA rigor, and limit agility by requiring more specialized expertise. Ask developers if performance can be achieved with tight, single-threaded code on a CPU with a high clock frequency. Consider using Intel Performance Primitives and Intel’s other tools; we have experienced great results with them.

For very high performance applications in complex business environments, we have produced strong results by separating functional business logic from “plumbing” like persistence (storing data) or orchestrating the steps in a business process. Larger numbers of more junior developers used a highly productive language like Java to create front ends and business functionality. We then provided wizards that generated code enabling access to high performance framework components created in C or x86 assembly. Conceptually, this approach has parallels to J2EE or other app servers but with the extreme performance and control enabled by creating proprietary high performance components.


Databases

Databases are the most common culprit in slow applications, and a complex sub-project of their own. Start with facts: measure memory usage (paging, page faults, cache hit ratios), CPU behavior, and data I/O speed from storage.

Query tuning, including index design, is the single most likely avenue to accelerate performance. Identify long running queries, as well as small but frequently executed queries, such as those for authentication and authorization.
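The effect of index design can be demonstrated with Python's built-in sqlite3 module. The orders table, column names, and index name below are invented for illustration: adding an index matched to a frequent query's predicate turns a full-table scan into an index search, which the query plan confirms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
conn.executemany("INSERT INTO orders (customer, total) VALUES (?, ?)",
                 [(f"cust{i % 100}", float(i)) for i in range(10_000)])

query = "SELECT * FROM orders WHERE customer = 'cust7'"

# Without an index, the frequent lookup scans the whole table.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
assert "SCAN" in plan[0][3]

# Add an index matched to the query's predicate, then re-check the plan.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
assert "USING INDEX idx_orders_customer" in plan[0][3]
```

The same discipline applies to any database: capture the slow and the frequent queries, inspect their plans, and design indexes around the predicates those plans reveal.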

The slowest part of any application is disk heads moving across platters. Keep logic in memory wherever possible. An effective technique is to pin tables (or partitions within tables) wholly within memory. Check for nuances: many databases will pin data once it is queried off disk, but will not proactively bring data into memory. At one big bank, we ran SELECT * queries as part of each morning checkout to pre-load data into memory. Think big, and consider using massive amounts of memory and solid state drives. Even with the current focus on big data, the reality is that many valuable business applications involve data sizes well under the 3TB of memory that can be put in a basic HP DL580 Gen 9 server.

Next, consider pre-calculating derived values that will accelerate later calculations. This often requires stepping back to consider the totality of queries and business processes. Consider using times of the day and week where loads are lighter to pre-calculate data. Even with recent technologies that reduce power for idle CPUs, the incremental power and operational risk to use available CPUs to pre-generate data is often a good investment.


Infrastructure

Infrastructure is less often a critical factor in application scalability. One exception is storage performance. Legacy applications can often benefit from moving key tables into flash storage. This can take some experimentation to find the optimal data to keep in solid state, and will require changes as usage patterns change over time.

Watch logical to physical design of databases, where thoughtful layouts should enable parallel loading and queries without conflicts. In particular, watch partition design where a single logical database table is spread across many physical and virtual parts. Often, the simplest approach is fastest, e.g. add a new partition for each day.

Buying performance through hardware is often a reasonable investment for small and mid-sized platforms (those with perhaps under ~100 servers). Engineers can sometimes resist the idea of using hardware to get performance from imperfect code, but it can be a practical approach that produces known results. Hardware can also show results sooner. For example, a team I led found a memory leak in a critical application. Of course, the best solution was to fix the bug producing the memory leak. However, we could not predict when that would be ready and could not take down a critical application. Instead, we immediately sent people to buy memory (DIMMs) at J&R across from New York City Hall. They drove to the data center in New Jersey and snapped them in within three hours. We fixed the memory leak two days later.

Sustaining Performance

Fast performance needs to be sustained by process. Usage patterns will change over time, which affects performance. Over the years, I’ve sensed an inherent entropy in complex applications, which administrators sometimes refer to as “code rot”. The huge array of variables that affect performance cannot be quantified but will inevitably lead to slower performance over time.

Database teams in particular should continually allocate capacity to identify and tune long-running queries as a core part of their role.

Customers and business sponsors will generally focus on functional requirements. IT leaders need to ensure that non-functional requirements are captured, budgeted, and confirmed. New releases should have explicitly defined volume and latency numbers. QA environments need to have realistic data, real world test cases, and infrastructure replicating production environments. In very high performance situations, consider “testing in production”. For example, at a hedge fund, we budgeted for and sent small live orders to US stock markets to confirm latency.

Nearly every legacy application can be significantly scaled with limited investment. The key points are to confirm facts, set a performance baseline, profile code, and focus on database query performance. Applications will naturally slow over time. Make scalability an ongoing process to achieve business results.

Can Agile Techniques Deliver to Budgets and Timelines?

Technology leaders starting big, new projects have asked us this question several times. In short: yes. The key is showing constant progress realizing business outcomes. This contrasts with waterfall approaches that attempt to anticipate the need for large sets of specific features, then set a date and budget, resulting in a single binary event that defines success.

IT leaders should emphasize budgeting, rather than estimation, in the initial stage of a large or new type of project. Keep scope tight and time horizons short. Big picture business strategy can be planned over long periods of time, but IT plans should be kept as near term as practical.

Big and new types of projects should start with an initial waterfall phase, particularly in globally distributed organizations. Business strategy, specific tangible goals, workflow processes and use cases should start as documents. Later, when processes and applications are in place, agile techniques with a “code talks” approach can shift to using visual mockups.

Agile diagram


Pilots and mockups should start during this planning phase. Pilots provide learning that enables estimating time and budgets.  Pilots should always intend to build production code, but leaders should accept that design phase pilots may also prove that technologies or design patterns will not work.

As design concludes, IT teams should start shifting pilots into a disciplined and consistent rhythm. I’ve found that monthly releases provide a strong, simple guideline. Individual teams may choose to have sprints within each month.

Adopting a dependable iteration cadence has an important effect: building confidence with the business. Change to release plans is inevitable. IT leaders cannot anticipate new customer RFPs, regulatory change, or M&A over long periods. At the same time, adding last minute features to a release demotivates teams and introduces risk. A regular cadence makes business leaders confident that features can be shifted between dates yet reliably delivered.

Every sprint should show some progress to the business, but not all features need to be completed within one sprint. A key tenet of agile is delivering in small, well-tested increments. This means that big features spanning releases should still be built in discrete, fully tested iterations even before they are released to production.

Success does not have to be defined as trying to anticipate what to deliver on distant dates in waterfall processes. A consistent cadence of delivering concrete progress toward business outcomes, while adjusting for inevitable change, will show business sponsors that the IT team delivers.

How to Start Agile Development

We recently met with the chief architect of a company starting a large project to reinvent its core application platform. The company previously used off-the-shelf software and so had less reason to use agile practices. He asked: how can we start agile development?

Agile fundamentally breaks big problems into small increments. The approach emphasizes code over documents, and continually involves users or customers. Agile is also a mind-set: teams practically deliver and refine in iterations rather than starting with an extended period of document-driven analysis to arrive at a complete solution.

Agile can be adopted in stages. At the CIO level, start with the basics: iterative delivery, continuous integration (developers all check their code in frequently to confirm fit), and unit testing (confirming that a small element of code performs as intended). Steer away from theoretical debate about the various flavors of agile practices, like Kanban or Scrum, and start with practices common to all forms.
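As a toy illustration of the unit-testing basic (the function, dates, and names here are hypothetical), a single small test confirms that one element of code performs as intended, and continuous integration runs files like this on every check-in:

```python
# Toy unit test (hypothetical function and dates): a small, fast check
# that one element of code performs as intended. Continuous integration
# runs tests like this on every check-in, so breakage is caught early.
import datetime
import unittest

def settlement_date(trade_date, business_days=2):
    """Add business days to a trade date, skipping weekends."""
    d = trade_date
    remaining = business_days
    while remaining > 0:
        d += datetime.timedelta(days=1)
        if d.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return d

class SettlementDateTest(unittest.TestCase):
    def test_skips_weekend(self):
        # Friday 2017-03-03 plus two business days is Tuesday 2017-03-07
        self.assertEqual(
            settlement_date(datetime.date(2017, 3, 3)),
            datetime.date(2017, 3, 7),
        )
```

Run with `python -m unittest` against the file; a failing assertion fails the build, which is exactly the feedback loop the first phase needs.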

The most important aspect of the first phase is maintaining a disciplined cadence of iterations. This allows business sponsors to build confidence in the development team, and shows that DevOps (developer operations: the group operating the infrastructure for agile practices) is effective. IT leaders should be firm that each sprint delivers well-tested code approaching readiness for production. This catches problems early and provides a foundation for making nimble changes as needed.

In the first phase, IT leaders are just getting people into a regular rhythm. A second phase can increase precision. Continuous integration can be extended to include static tests for code quality and security, and automate performance and functional tests. Developers should target a higher percentage of code coverage for unit tests.
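A toy sketch of the kind of static code-quality check a second phase might add to continuous integration (the docstring rule here is purely illustrative; real pipelines use dedicated linters and security scanners): parse source without executing it and fail the build on findings:

```python
# Toy static code-quality check (illustrative rule only): parse source
# without running it and flag functions that lack docstrings. The point
# is the principle -- failing a build on static findings -- not the rule.
import ast

def missing_docstrings(source):
    """Return names of functions defined without a docstring."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef) and ast.get_docstring(node) is None
    ]

sample = '''
def documented():
    "Explains itself."
    return 1

def undocumented():
    return 2
'''
findings = missing_docstrings(sample)  # names of offending functions
```

A CI job would run checks like this alongside the unit tests and reject the check-in when `findings` is non-empty.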

A third phase can move beyond automated integration and testing to include deployment into production. Advanced agile practices can also measure code quality and provide feedback to developers. For example, our equities technology team at Citigroup built a system called Gauge that provided a broad range of metrics on developers’ quality through interactive visualizations. It was eagerly adopted, and became part of talent review practices.

Starting agile techniques is about leading people more than engineering. CIOs should encourage bottom-up, organic adoption rather than setting top-down imperatives. IT leaders should seek out junior managers who are passionate about agile practices, and have them build coalitions.

At Thomson Reuters, we formed teams led by junior managers who recommended how to adopt agile practices. This Socratic approach also tends to result in more concrete and immediate changes compared to changes imposed by senior leaders who may be less aware of day to day processes.

In summary, adopt agile in iterations. Find junior people to act as sponsors and build grass-roots support. Start with the fundamentals of iterations, continuous integration, and unit tests. Get the basics in place and show business sponsors that IT dependably delivers on a disciplined cadence. Later, expand automated testing and deployment. In my experience, these practices will motivate people, improve quality, and accelerate execution.

A New Approach to Large IT Projects

Digital business models and transforming core applications can drive strong business value. However, most large IT initiatives fail to meet expectations. McKinsey & Company found that 66% of big IT projects exceed budget and 33% run longer than planned[1]. Businesses need a new approach.

Many companies concentrate on defining requirements, then turn to engineering solutions. IT leaders often approach their organizations as a machine: specifications in, code out. However, initiatives such as legacy platform modernization or post-merger integration involve changing how people work, require new team capabilities, and bring new risks. Experience leading several large transformations shows that CIOs should first motivate people, build required team capabilities, and structure processes to minimize risks.

Transform Attitudes, not Technology

Technology leaders sometimes describe certain people as ten times as productive as others. However, many more have that potential, and the concept applies to teams as well. Leaders unlock this productivity by focusing on people, rather than outputs.

Engineering managers are often selected for expertise as much as for empathy. A first step in technology transformation is equipping managers to lead change. A CIO should explicitly cultivate effective change leadership. The Kotter 8-Step Change Model[2] provides a straightforward approach. Ask for written change leadership plans to build confidence and encourage taking responsibility.

The American politician Tip O’Neill said “all politics is local”. The same applies to transforming enterprise applications. Leaders should ensure every person understands what change means for them. Share positive opportunities such as broader career options or learning new technologies. People should understand that they are valued for business and customer understanding rather than technical skills.

Communication should explain why as much as what will change. Consider getting strategists and engineers together to explain the competitive landscape and how projects produce value for customers. Seek opportunities for engineers to join customer meetings: even with strong product managers, many design decisions, such as how to organize data, are best made by directly understanding the customer environment.

Offshore locations need to feel respected to achieve top productivity. People deliver best when given full responsibility rather than serving in a support role. Consider assigning end-to-end responsibility for part of a project, with local managers accountable. At Citigroup, for example, an already-productive Sydney team surged further when given global responsibility for all retirement and superannuation analytics.

CIOs often get best results leading talented engineers to their own decisions. At Thomson Reuters, we formed working groups headed by junior managers that recommended how to implement change. This enables teams to self-attribute and “own” actions. Junior people will also often identify very concrete and immediate actions, which leads to credibility and early wins. 

Realistically Assess and Build Capabilities

A simple test can assess capabilities: has a person or team recently delivered a similarly scaled project?  Agile program management, quality assurance, and user experience design are most likely to need strengthening. Consider bringing in people from software vendors or companies that regularly deliver new versions. Avoid the temptation to save by having project leadership performed solely by managers.

The ultimate success of many large projects depends on quality assurance. Managers often assess the ratio of testers to developers, which is meaningful but only a small part of the picture. Quality (and cybersecurity, which depends in part on quality code) should be built into development from the start. Program managers should ensure test cases are defined, with developers creating small “unit tests” continually applied in an automated framework.

Early in a project, the CIO should collaborate with finance on investments for test environments. Consider using cloud services, which can shift capital expenditures to operating expenditures. Test environments and data should be as similar as possible to production.

Test case management and trend analysis are critical. Solid test case management checks the full spectrum of risks, identifies dependencies such as customer data, and surfaces trends and emerging risks. Importantly, strong test case management enables accurate forecasting of completion dates.

At the same time, do not rely wholly on structured testing. The intuition of senior engineers and client service people can be very valuable for identifying risks or behaviors not captured in test plans. Test plans should specifically include experts trying to break the application.

User experience design is increasingly important to product strategy. Professional design costs little relative to the value of elegant workflows, screen designs, and visualization. Designers are also critical to a “code talks” approach to agile development, where teams use frequent visual mockups to confirm evolving business needs.

Deliver Fast and Frequently

Waterfall methodologies are inherently mismatched to large projects. Annual capital budgeting cycles can prompt development teams to think in similar timeframes. However, customer needs, RFP details, or M&A are difficult to predict over long periods. Big projects should use many smaller iterations to show results early and confirm alignment regularly.

At the CIO level, insist on broad principles of agile delivery rather than a specific methodology: small phases of two to six weeks (monthly works well); mockups and pilots rather than documents; and developers synchronizing and running automated tests on code each day. Individual locations may choose the specific agile methodology, such as Kanban or Scrum, that works best for them.

Business sponsors naturally focus on functional requirements. Technology leaders should treat non-functional requirements, such as scalability or cybersecurity, as equal priorities. These are achieved more cost-effectively early in the process, and often involve investments and tradeoffs best steered by the CIO.

All big IT projects are business projects. Transformational projects should involve all parts of the business. At the start, CIOs should involve peers in communication campaigns. An easy option is sharing short video interviews with business leaders. Each agile iteration offers an opportunity to involve the whole business. Gaining alignment does not end with development: QA is often strongest when test cases are defined and performed by client services and others who will use the product.

Go Live Gracefully

Bringing new software releases into production is a critical point for motivating people and limiting risk. Consider a global credit card payment processor creating a new core platform. The waterfall plan calls for six months of development, followed by four months of testing, and then a launch event coordinated with sales and service. However, several customers share RFPs that prompt the team to add features. Testing is compressed. The CIO gets nervous and demands more reports. Managers set additional meetings to prepare these updates. With pressure mounting, a key engineer gets sick. As the launch approaches, a salesperson calls a development manager directly to persuade her to add a feature; there will not be time to test it.

The team begins installation on a Saturday night. However, the application unexpectedly fails to communicate with a disaster recovery location. This was never tested in the small QA environment. Saturday gives way to Sunday as engineers frantically try to diagnose the issue. Senior managers consider cancelling the launch. However, reverting to the prior version requires manually reconnecting to dozens of other systems and the main expert is out sick. The team must continue the launch.

On Sunday morning, the CIO visits the operations team. Her presence further raises anxiety. The CIO overhears a senior developer complain that this is the fourth time a troubled product launch forced her to scramble to find child care on short notice. As the Asian day begins, customers are unable to use the application. The CIO feels her phone vibrate. The CEO is calling.

Alternatively, consider the same company using processes centered on motivating people and containing risk. Sales leaders have seen IT consistently deliver new iterations of the product for many months. They trust IT will continue this pattern, and agree to defer some features until after launch. The CIO made the case to invest in test environments nearly matching production, and test case management has tracked non-functional requirements, including disaster recovery, for several months.

As the launch approaches, the team installs the new software in a parallel production environment while keeping existing systems running. Final checks are performed during regular hours in the week leading up to launch. IT plans a six-hour maintenance window, but the actual cutover and final automated checks take about an hour. Many customers do not notice. The CIO wakes at home on Sunday morning, glances at her phone to check production volume, and reminds herself to ask the CEO to recognize key people.


Technology transformation comes from people. Rather than the traditional focus on business requirements and engineering, IT leaders starting large new projects should concentrate first on leading change and communicating effectively. Projects like legacy platform modernization bring new demands on an organization. Leaders need to assess whether the team has the project leadership, quality, and user experience design capabilities needed. With testing, consider the readiness of test environments and data as much as people capabilities. In development, break big problems into small parts delivered in a disciplined cadence of iterations. Tactfully insist that each iteration approaches production readiness. Finally, use automation, rehearsals, and parallel environments for graceful deployment. Each successful release will build team momentum and execution velocity.

For more on leading execution in IT, please see Faster IT or Third Derivative.