Top Infrastructure Automation Architecture Considerations
Within the last three to five years, our customers and technology providers have increased their emphasis on adopting infrastructure automation concepts and leveraging application programming interfaces (APIs). The technology providers have equipped their products with programmable interfaces to allow rapid deployment and management of their hardware and software offerings.
However, many customers continue to struggle with transforming their organization to enable NetDevOps within their IT operations.
NetDevOps from an architectural approach
To understand why organizations have lagged in transforming their IT operations to a NetDevOps mindset, let's examine what an infrastructure automation architecture entails. The Open Group Architecture Framework (TOGAF) concept describes an IT system as a set of building blocks that integrate to accomplish the system's mission. These building blocks encompass standards, a shared vocabulary and a list of products (software and hardware) that implement the building blocks.
Several industry transitions influence this architectural approach. The role of IT technologists and the skills required of them are changing, and the following factors are accelerating the transformation:
- A decline in proprietary software usage with an accelerated use of open-source software.
- Adoption of (and IT spend on) cloud compute, storage and networking is increasing at a 6:1 ratio over on-premises IT.
- Organizations with higher degrees of digital maturity operate more cost-effectively and earn higher revenue than digital laggards.
The current GeoHealth crisis has forced companies to rapidly provision, deploy and scale network, security and compute resources. In most cases, infrastructure programmability (NetDevOps) has become a requirement for meeting the deadlines imposed by these events.
Principles of DevOps applied to IT operations
Which DevOps concepts apply to IT operations? Fundamentally, the functions of IT operations must be viewed as a flow of work through the system. This system is the organization's data center and its associated campus, branch, WAN and cloud connectivity for on-premises IT.
Data centers exist to provide value to the consumers of their services. This flow of work is a value stream. While value stream mapping is commonly associated with the operations of hospitals and assembly lines, IT operations must view their services as part of the IT operations value stream. The configuration of VLANs (Virtual LANs) and Fibre Channel zones has no value in itself; it has value only when it is required to deliver some service to the business.
Once IT operations managers view their daily activities as part of a holistic value stream, they can begin to align their organization with an infrastructure automation architectural approach and implement NetDevOps.
Acknowledging these trends and accepting the role of IT operations in the technology value stream necessitate a change in how IT managers train, hire and organize their IT technology teams.
Core competency
Let's examine some of the organizational behaviors necessary to develop an infrastructure automation architecture and adopt NetDevOps principles.
Open source contributions
Enabling a small group of engineers who primarily focus on developing and contributing to open source projects provides immediate recognition for the organization within its industry. View this as an investment in the training and enablement of your IT operations staff. Those contributing will learn and hone skills and best practices that can be transferred and encouraged internally.
Every automation effort within IT operations should be structured and managed as an open source project. In the open source world, the phrase "let the community decide" democratizes what good looks like. Many infrastructure automation efforts suffer from inadequate documentation and from inconsistent or inadequate programming techniques and program structure.
Successful open source projects must compete for user interest and contributors. The same applies to internal automation solutions. Processes that eliminate toil and enrich an employee's work-life balance will be embraced and expanded.
Automation library
For infrastructure automation to reach full adoption within an organization, individual or team efforts must be recognized and organized into an automation library. The library concept does not imply that all automation resources be hosted at a single site, or under one group or organization. Consider sites like Cisco Code Exchange, Ansible Galaxy or DockerHub. These sites provide a front-end 'skin' to various code repositories.
This automation library represents the intellectual property of an IT organization. It encourages adherence to best practices and allows for scalability and reusability of code across the organization. The library should include references to both in-house developed software solutions and external open source projects that make up the building blocks of the automation architecture.
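As a minimal sketch of the idea, the library can be modeled as a catalog of references to repositories rather than a single host for the code. Everything below is an assumption made for illustration: the `LibraryEntry` fields, the `find` helper, the entry names and the example.com URLs are hypothetical and do not describe any existing catalog.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LibraryEntry:
    """One catalog record; the code itself stays in its own repository."""
    name: str
    repo_url: str   # where the code actually lives
    origin: str     # "in-house" or "external"
    owner: str      # team or community responsible for the code
    tags: frozenset = field(default_factory=frozenset)


def find(catalog, tag=None, origin=None):
    """Return entries matching an optional tag and/or origin, sorted by name."""
    hits = [
        entry for entry in catalog
        if (tag is None or tag in entry.tags)
        and (origin is None or entry.origin == origin)
    ]
    return sorted(hits, key=lambda entry: entry.name)


# Hypothetical entries; names and URLs are placeholders.
CATALOG = [
    LibraryEntry("vlan-provisioning",
                 "https://git.example.com/netops/vlan-provisioning",
                 "in-house", "network-team", frozenset({"vlan"})),
    LibraryEntry("fc-zoning",
                 "https://git.example.com/storage/fc-zoning",
                 "in-house", "storage-team", frozenset({"fibre-channel"})),
    LibraryEntry("community-vlan-role",
                 "https://example.org/community/vlan-role",
                 "external", "open source community", frozenset({"vlan"})),
]

if __name__ == "__main__":
    # List everything tagged "vlan", regardless of where it is hosted.
    for entry in find(CATALOG, tag="vlan"):
        print(f"{entry.name} ({entry.origin}): {entry.repo_url}")
```

Keeping only references in the catalog lets each team retain ownership of its repository while the organization still gets one searchable front end.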
Center of excellence
The IT organization must create a center of excellence (CoE) for developing, educating, training and widely deploying automation services for its customers. This CoE is responsible for managing and maintaining the Automation Library and fostering the education and training of NetDevOps engineers. The CoE members also lead the external technical marketing and represent the organization at industry events, local meetups and conferences.
The CoE members evaluate and recommend the hardware and software tools that make up the building blocks of the automation architecture. CoE members develop and publish (internally or externally) the solutions required to integrate the building blocks.
Skills development
Skills development, up-skilling, training and team members' education are critical enablers to transforming into the NetDevOps mindset.
Many organizations fail to provide the training opportunities employees need to maintain competency in their current role, given the accelerated rate of change facing the IT technologist. Every team member should be a mentor, have a mentor, develop new skills, build relationships and learn all aspects of the IT infrastructure.
Learning effectiveness requires these three elements:
- Lecture, or presentation of the concept or technology;
- Hands-on use of the software, API, commands and code; and
- Quiz with instant feedback and application of the ideas.
Grouping individual topics within a learning track (overall syllabus) provides the framework to bring the employee to a level where they can pass an exam or deploy on projects. At WWT, developing the next generation of network engineers is an emphasis of our DevNet Study Groups.
To become a good writer, you must read good writing. That same concept applies to developing code and automation workflows. Peer reviews of project deliverables are a necessary part of the skills development process. To quote Colonel Glover S. Johns: "Never be satisfied. Ask of any project, how can it be done better?"
This concept of continuous improvement is instrumental to the Toyota Production System and the third principle of DevOps.
Organizational structure
The organizational concept of a "team of teams," which minimizes the emphasis on a hierarchical model aligned with function or business unit, is key to successfully implementing NetDevOps. Organizing teams that represent different groups and skills and that focus on a specific outcome has been advocated by several authors and researchers, including Bill Drayton and Gen. Stanley McChrystal.
Familiarize yourself with the Walt Disney visual organization structure drawn in 1957, the Walt Disney Mind Map. In this diagram, the core business centers on Disney's creative talent and the production of theatrical films. The ancillary businesses (16mm films, publications, licensing) are shown along with their relationships and interactions in support of the core competency. The diagram illustrates an organizational mesh rather than a top-down hierarchy.
Many organizations hinder the adoption of infrastructure automation and NetDevOps concepts due to rigid and siloed organizational structures. To be successful, organizations must see IT operations as a workflow, and each member of the team must view their role as part of the system.
Every team member must have a basic understanding of the overall goal and how their area of responsibility supports that workflow. This approach is called "systems engineering" or "systems thinking" and was adopted by NASA for the successful Apollo program.
Organizations that are successful in automating infrastructure have, at their core, team members who think like computer scientists. From Knuth we learn that programming is both art and science. Creative thinking does not align with hierarchical organizational structures. Top-down command and control squashes innovation.
Creative people under this corporate structure do one of two things. They either leave or channel their creativity into other aspects of their lives. Neither of these choices accelerates the mission of the organization.
A successful infrastructure automation architecture must recognize and accept that success depends on the organizational structure.
Operational methodology
Develop the muscle memory to build infrastructure as a service. Reflect on the current operational methodology and consider how current practices align (or where there are gaps) with DevOps and the Twelve-Factor App principles.
Consider some of these suggestions during the development of an infrastructure automation architecture:
- All institutional knowledge is documented to the point where a new employee can 'run' the system.
- All infrastructure automation intellectual property is maintained in a shared source code repository.
- Inclusion in the source code repository requires adherence to a minimum level of structure, format and quality, especially documentation. Pushing to the repository requires peer review, static analysis (linting) and testing of all deployments.
- Any projects or initiatives must be delivered by small teams, including members having diverse skill sets in the target domain(s) and automation tooling.
- Every team member is a mentor, has a mentor and engages with both on a regular basis.
- One source of truth for configuration data. Device configurations are not the source of truth.
- Provide an owner, expiration time and effective time for all deployments. Refer to RFC 3139: Requirements for Configuration Management of IP-based Networks.
- Credentials are never stored in clear text in code or repositories. Preferably, a defined credential management strategy is in place.
- Environment variables are used to deploy into a topology. These run-time options take precedence over defaults.
- Enable logging of all deployments. All team members must have access to the logs.
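Several of these suggestions can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the setting names (`NETOPS_TOPOLOGY`, `NETOPS_VLAN_ID`, `NETOPS_API_TOKEN`), the `Deployment` record and the `deploy` function are hypothetical. It shows run-time environment variables overriding defaults, a credential read from the environment rather than from code, deployments that carry an owner plus effective and expiration times, and a log entry for every attempt.

```python
import logging
import os
from dataclasses import dataclass
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("deployments")

# Defaults ship with the code; the setting names are hypothetical.
DEFAULTS = {"NETOPS_TOPOLOGY": "lab", "NETOPS_VLAN_ID": "100"}


def resolve_settings(environ=None):
    """Merge defaults with environment variables; the environment wins."""
    environ = os.environ if environ is None else environ
    return {key: environ.get(key, default) for key, default in DEFAULTS.items()}


def get_credential(environ=None, name="NETOPS_API_TOKEN"):
    """Read a secret from the environment instead of code or the repository."""
    environ = os.environ if environ is None else environ
    if name not in environ:
        raise RuntimeError(f"credential {name} is not set")
    return environ[name]


@dataclass(frozen=True)
class Deployment:
    """A deployment with an accountable owner and a validity window."""
    name: str
    owner: str
    effective: datetime
    expires: datetime

    def is_active(self, now=None):
        if now is None:
            now = datetime.now(timezone.utc)
        return self.effective <= now < self.expires


def deploy(deployment, settings, now=None):
    """Log every attempt; act only inside the effective/expiration window."""
    if not deployment.is_active(now):
        log.warning("skipped %s (owner=%s): outside its time window",
                    deployment.name, deployment.owner)
        return False
    # The credential is deliberately never written to the log.
    log.info("deploying %s (owner=%s) to topology=%s vlan=%s",
             deployment.name, deployment.owner,
             settings["NETOPS_TOPOLOGY"], settings["NETOPS_VLAN_ID"])
    return True
```

In practice the log handler would write to a location every team member can read, and the token would come from whatever credential management system the organization has chosen; the environment variable here only stands in for that system.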
Where is the organization positioned on a data maturity model? Is access to configuration, logging and performance data tribal knowledge, or is it available and scalable? Rate the organization on the Process Maturity Model. Are its processes undocumented and person-dependent, or refined and continuously improved?
Summary
Infrastructure operations managers struggle to transform their legacy processes into more efficient Infrastructure as Code methods while their industry is transitioning from private to public cloud. Additionally, their technology teams must be re-tooled, and their organizational and operational methodologies must adapt and transform. Perhaps the greatest hindrance is corporate structures that hamper collaboration and innovation.