The Federal Financial Institutions Examination Council (FFIEC) is a governmental body that provides interagency regulatory guidance for financial institutions. Its guidance gives the different regulatory bodies a consistent framework, and applies to the OCC, Federal Reserve, CFPB, and others. In June 2021, following large cyber attacks on the United States and the resulting Executive Order on cybersecurity, the FFIEC released its largest update in guidance in over a decade to help examiners assess financial institutions. The new booklet updates the 2004 Operations booklet and links the different processes of Architecture, Infrastructure, and Operations (AIO) into a cohesive framework for examiners to assess. In this booklet the FFIEC discusses the principles and practices for IT and operations, as well as processes for addressing risk in IT systems. It also spells out management oversight of IT systems, including governance, and provides principles for examiners to address.
Sweeping in scope, the AIO guide gives detailed guidance for different technologies. While it’s more prescriptive than prior booklets, this leads to simplicity through clarity on what is expected of an organization. This blog is intended to summarize a few aspects of the booklet while providing references for additional information. Its audience is technology executives and leaders who do not need to be regulatory experts, but who are affected by the regulations and should have a better awareness of them. The booklet carves out responsibilities for different roles within an organization, from board and senior management responsibilities to chief architect and operations management responsibilities. Some of these roles include:
- Board and Senior Management – responsibilities include strategic planning and enterprise risk management. This means using an enterprise risk management framework and providing the framework for architecture, asset management, ongoing monitoring, roles, responsibilities, and procedures for all AIO activities.
- CIO/CTO – responsibilities include overseeing and maintaining the functions of architecture, infrastructure, and operations in the IT environment, as well as delegation (and separation) of duties for these functions
- Chief Architect – role includes developing the enterprise model and establishing common blueprints and architectures, ensuring consistency across lines of business, and matching business results to the enterprise architecture
- Chief Data Officer – role includes defining data strategies that consistently meet the organization’s needs while also meeting compliance and security objectives
- IT Operations Management – ensures safety and soundness of the mission-critical infrastructure required to enable business functions
- IT Operations personnel – cover the breadth of day-to-day actions supporting the digital requirements of a business infrastructure, including network, device, server, and environmental management
Even as a summary, this blog is quite long. There is simply a lot of information in the AIO guide (100 pages) that is relevant to different groups in IT leadership. I will summarize key points, expand on the requirements, and provide links back into the guide where additional information is available. Some key points in the guidance:
- An organization must focus on all three aspects of governance: People, Process, and Technology
- A strategy for segmentation helps satisfy numerous criteria and is referenced throughout the guide
- An organization must have a strategy and processes for infrastructure management, hardware lifecycle, and software patching
- Analytics and detection controls are required to ensure the infrastructure is functioning as expected and in line with expected controls
An example of the guidance to examiners, for all of these items, can be useful to groups who want to ensure they are conformant with the specifics of an evaluation.
Common AIO Risk Management Topics
The FFIEC AIO guide defines certain functions as having risk management aspects in all components of the lifecycle: architecture, infrastructure, and operations. These common risk management topics are classified into the domains below.
- Data governance and data management
- IT asset management (ITAM)
- Business and IT environment representation
- Managing change in AIO and change management
- Oversight of third-party service providers
- Resilience
- Remote access
- Personally owned devices
- File exchange
There is a fair amount of coverage for each of these risk management topics within the AIO guide, but here is a summary of key points for leadership awareness.
Data governance and data management include a number of exam criteria that are common to a data-centric approach. Exam criteria include data identification and classification, controls for safeguarding data (both digital and physical), discovery and monitoring of new and existing data sources and changes, security of databases and management tools, and processes for patching and managing databases.
IT asset management (ITAM) includes processes to track, manage, and report on information and technology assets. Exam criteria include policies and standards, hardware and software inventories, processes to address and manage end-of-life and technically obsolete equipment, and processes to prevent and manage unapproved technology (shadow IT).
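As a rough illustration of the inventory, end-of-life, and shadow IT criteria, here is a minimal Python sketch. The `Asset` record, its field names, the sample inventory, and the 180-day warning window are all assumptions made for this example; none of them come from the AIO guide.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Asset:
    # Hypothetical inventory record; field names are illustrative only.
    name: str
    kind: str                           # "hardware" or "software"
    approved: bool                      # False models unapproved technology (shadow IT)
    end_of_life: Optional[date] = None  # None when no end-of-life date is known


def needs_attention(assets: List[Asset], today: date, warn_days: int = 180) -> List[str]:
    """List assets that are unapproved, past end of life, or close to it."""
    findings = []
    for asset in assets:
        if not asset.approved:
            findings.append(f"{asset.name}: unapproved")
        if asset.end_of_life is not None:
            days_left = (asset.end_of_life - today).days
            if days_left < 0:
                findings.append(f"{asset.name}: past end of life")
            elif days_left <= warn_days:
                findings.append(f"{asset.name}: end of life in {days_left} days")
    return findings


# Made-up sample inventory for demonstration.
inventory = [
    Asset("core-switch-01", "hardware", True, date(2023, 1, 31)),
    Asset("db-server-os", "software", True, date(2030, 6, 30)),
    Asset("team-file-share", "software", False),
]
itam_report = needs_attention(inventory, today=date(2024, 1, 1))
```

In practice the inventory would come from a discovery tool or asset database rather than a hand-written list; the point is only that each exam criterion maps to a check that can be run and reported on regularly.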
Business and IT environment representations covers validation of the documentation that represents the organization’s IT infrastructure. This includes network diagrams, data flow diagrams, and business process flows and narratives. It should cover all physical and virtual technology assets, including locations and versions, as well as all interconnectivity and process flows between lines of business and third parties, and the locations of sensitive data.
Managing change in AIO and change management ensures that management has a change management process in place, and that any change to an IT system or service is supported by an orderly and documented process. Some aspects of change management are strategic, and the process should include appropriate request and review steps. It should seek to ensure the integrity of the infrastructure and the consistency of documentation.
Oversight of third-party service providers covers the processes in place to ensure appropriate levels of governance apply to third parties, with regard to service level agreements and appropriate data management policies.
Resilience validates that processes are in place to ensure that hardware and software assets, as well as business processes, have appropriate controls to maintain confidentiality, integrity, and availability of the institution’s information systems. It deals with business continuity and mitigating disruptions, whether internally or externally driven.
Remote access covers the policies and procedures in place for communicating through the entity’s perimeter for internal and external third-party access. It includes tunneling (VPN), portals for client devices, direct application access, and remote desktop services.
Personally owned devices covers the practices for allowing personal devices, where they are permitted, in the infrastructure while maintaining appropriate security and segmentation.
File exchange covers the risk considerations in how an organization exchanges files, whether via email attachments, file sharing services, or other means. Examples of risk considerations include the protocols used for the file exchange, the storage and retention of files, the integrity of the data, and managing shadow IT and data leakage.
Architecture topics are specific to the architecture domain for assessment. They cover how an organization manages its as-is architecture and how it aligns that architecture toward a to-be state, including the processes an architecture group uses to assess and plan for future enterprise IT needs. Through this process the architecture group is evaluated to ensure that architecture plans and design objectives are properly documented within the organization. This includes planning for and managing technology obsolescence, end of life of technologies, and architectural shifts. The architecture must be flexible enough to account for these shifts in infrastructure and technology so that the implementation and operations groups can be successful. Additional topics include architecting for and managing shadow IT, and the design of the IT architecture as it expands beyond in-house systems to include virtualization and cloud strategies.
Some aspects the evaluators will look at include:
- The overall architecture, including current state, and whether it aligns with enterprise-wide business and strategic change
- A central repository for artifacts, with common schemas and terminology across the enterprise and a maintenance plan in place. This would include blueprints, network diagrams, and topologies.
- A place to work with stakeholders on architectural changes, including stakeholder analysis and planning for diffusion of the technology in the enterprise (in other words, a high-level plan that shows feasibility of implementation and operations)
- The following aspects should be included in the architectural design:
  - Performance and reliability
  - Availability and resilience
  - Security and privacy
  - Interoperability and integration
  - Ability to integrate and align with one or more third-party service providers
  - Testing internally and with third-party service providers, as appropriate
  - Plans for obsolescence, EOL, and decommissioning of systems
Infrastructure topics include all aspects of managing an infrastructure, including people, process, and technology. Key processes that are evaluated include infrastructure management processes, ensuring supportive contractual arrangements, security, and change control processes. The evaluation also looks to ensure appropriate controls are in place to address planned scalability and interoperability of software, and associated software controls. In recent years there has been a growing focus on managing and maintaining open source software, including understanding its function and validating the integrity and provenance of the open source software used. This means an organization needs to attest to the controls in place to validate the source of the software it uses. Commercially supported open source from a third party would push some of these requirements to the provider of the software, but the organization needs to ensure contractually that the provider is meeting them.
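One small, concrete piece of validating integrity is comparing a downloaded artifact against a digest published by the project. The sketch below uses Python's standard `hashlib` and `hmac` modules; the artifact bytes and the "published" digest are stand-ins invented for the example. A checksum match only shows the file was not altered relative to that digest. Establishing provenance generally also requires signature verification and a trusted source for the digest itself, which this sketch does not cover.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Check an artifact against a digest obtained from a trusted channel."""
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(sha256_hex(data), published_hex.strip().lower())


# Stand-in values: in practice the bytes come from the downloaded package
# and the digest from the project's release page or a signed manifest.
artifact = b"example package contents"
published = sha256_hex(artifact)

intact = matches_published_digest(artifact, published)
tampered = matches_published_digest(artifact + b"!", published)
```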
Some additional aspects that are important in the infrastructure domain include:
- Intellectual property and talent management, ensuring that there are sufficient resources with infrastructure knowledge, skills, and expertise
- Mainframe infrastructure management
- Environmental and physical controls
Operations topics are the most extensively covered topics in the AIO guide. They are broken out into multiple domains, including operational controls, operational processes, support, and monitoring. This is because all transactions and activity are carried out through the day-to-day execution of the plan. While the selection of a lock is important, ensuring that it is actually being locked is critical, and the FFIEC spends an appropriate amount of effort on ensuring that day-to-day activities are subject to a level of rigor commensurate with the financial systems they support.
Banking and financial institutions have developed strong operational controls because the core of their business is managed risk. These controls extend beyond technology assets to include facilities and access, as well as the personnel who support the systems. Many security incidents stem from insider risk, which typically isn’t malfeasance but human mistakes or social engineering. The FFIEC does not overlook these areas and seeks to ensure appropriate controls are in place to reduce these risks. Areas of operational controls in the AIO guide include:
- Effective controls over the entity’s operating centers, including physical and logical controls.
- Defined and appropriately administered authorization boundaries containing the entity’s systems, software, and information.
- Identity and access management (IAM) methods used to appropriately identify and authenticate users
- Personnel controls (e.g., hiring and retention practices, maintaining appropriate skillsets and knowledge, and activity monitoring processes) to maintain an effective workforce
- Controls allowing for the use of personally owned devices
A fair amount of consideration goes into operational processes for a financial institution. The regular care, maintenance, and hygiene of critical infrastructure is fundamental to deterministic availability. The AIO guide spends an appropriate amount of focus in this area, specifically around the following domains:
- Appropriate preventive maintenance or operational restoration processes for equipment within the facilities that support the entity’s business objectives. This is the regular hygiene of critical infrastructure and can run the gamut from changing air filters to extensive maintenance.
- Configuration management processes, such as how configurations are collected, maintained, and deployed
- Effective vulnerability and patch management processes. Having a process in place to address vulnerabilities and apply patches to critical systems, or apply compensating controls, is critically important. This goes hand in hand with managing end-of-life hardware and software that can no longer be patched.
- Backup and replication processes that facilitate recovery
- Scheduling processes to manage and effectively use IT resources (e.g., hardware and processing time). This can include prioritization of batch operations, such as ensuring quality of service is in place so that batched backups of systems stay within the tolerances defined by the regulatory bodies for recovery point and recovery time objectives (RPO/RTO).
- Capacity management processes that support the entity’s current and future strategic objectives.
- Log management processes that allow management to capture system, software, and physical access activities, ensuring that logs exist for critical systems and are available
- Processes for the appropriate disposal of data and media
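To make the backup and scheduling items above more concrete, here is a minimal Python sketch that flags systems whose most recent completed backup is older than their recovery point objective. The system names, timestamps, and RPO values are hypothetical; real tolerances come from the institution's own business impact analysis and policies.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def rpo_violations(
    last_backup: Dict[str, datetime],
    rpo: Dict[str, timedelta],
    now: datetime,
) -> List[str]:
    """Return systems with no recorded backup or a backup older than their RPO."""
    late = []
    for system, limit in rpo.items():
        finished = last_backup.get(system)
        if finished is None or now - finished > limit:
            late.append(system)
    return sorted(late)


# Hypothetical systems and objectives, for illustration only.
now = datetime(2024, 1, 1, 12, 0)
rpo = {
    "core-banking": timedelta(hours=1),
    "document-archive": timedelta(hours=24),
    "card-switch": timedelta(minutes=15),
}
last_backup = {
    "core-banking": datetime(2024, 1, 1, 11, 30),
    "document-archive": datetime(2023, 12, 30, 12, 0),
    # "card-switch" has no recorded backup at all.
}
violations = rpo_violations(last_backup, rpo, now)
```

A check like this is also a natural input to log management and reporting, since it produces a record of which systems were out of tolerance and when.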
Service and Support processes
Service and support processes are evaluated to ensure adequate processes and controls are in place to support the infrastructure. This includes effective planning around support of infrastructure as part of service management, including SLAs and contractual provisions. The evaluation goes on to cover the operational support processes, controls, and reporting mechanisms in place. It also includes provisions for event management, incident management, and problem management, as well as the associated documentation to track these.
Ongoing monitoring and evaluation processes
Management should develop processes to oversee operations functions, evaluate the effectiveness of controls, and identify opportunities for improvement. In ITIL this is referred to as continual service improvement. There should be a feedback loop such that an organization has an adaptive immune system that can learn and improve over time. There should be processes in place such that operational controls develop over time, and also provide feedback to architecture for inclusion into architectural considerations.
Examiners are directed to review the following:
- Implementation of processes to monitor and report on control effectiveness
- Stakeholder input into the types of reports and metrics produced
- Defined objectives for IT, operations, and key performance indicators (KPI)
- KPIs that align with the entity’s ERM processes
- Processes for reporting KPIs to the board
- Implementation of corrective action plans when KPIs do not meet established targets
- Processes to recommend changes in operations processes and controls
- Strategies for service and process improvement and methods to measure the results of those improvement efforts
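The KPI items above amount to a simple loop: define targets, measure, and trigger corrective action when a target is missed. A minimal Python sketch of that loop follows. The KPI names, targets, and measured values are invented for illustration and are not taken from the AIO guide.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Kpi:
    # Hypothetical KPI record; names and thresholds are illustrative only.
    name: str
    target: float
    actual: float
    higher_is_better: bool = True

    def met(self) -> bool:
        """True when the measured value satisfies the target."""
        if self.higher_is_better:
            return self.actual >= self.target
        return self.actual <= self.target


def corrective_actions_needed(kpis: List[Kpi]) -> List[str]:
    """Names of KPIs that missed their target and need a corrective action plan."""
    return [kpi.name for kpi in kpis if not kpi.met()]


kpis = [
    Kpi("system availability (%)", target=99.9, actual=99.95),
    Kpi("critical patch latency (days)", target=30, actual=45, higher_is_better=False),
    Kpi("backup success rate (%)", target=99.0, actual=98.2),
]
missed = corrective_actions_needed(kpis)
```

The list of missed KPIs is what would feed board reporting and the corrective action plans the examiners look for.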
Cloud Computing: Cloud computing is a lengthy topic given its impact on all aspects of architecture, infrastructure, and operations. Within IT organizations the concept is well understood and includes private, public, and hybrid cloud. There are multiple considerations to take into account, including shared responsibility, ensuring data controls are accounted for, and ensuring access controls are in place.
Zero Trust Architecture: Zero trust is a shift from traditional “castle-style” security architectures, where there was an internal trusted zone and a focus on a strong perimeter. The challenge with this model is similar to that of traditional castles: once the plague gets in, everything is exposed to lateral movement. By not inherently trusting devices within the perimeter and treating them as untrusted by default, stronger controls can be put in place to contain and constrain a security incident. These include areas like segmentation and contextual access control.
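A minimal Python sketch of the default-deny idea: a request is allowed only when identity, device posture, and an explicitly permitted segment-to-segment flow all check out, regardless of whether the source sits inside the perimeter. The segment names, the `AccessRequest` fields, and the allowed-flow table are assumptions made up for this example, not a description of any particular product or of the FFIEC's wording.

```python
from dataclasses import dataclass

# Hypothetical segments and permitted flows, for illustration only.
ALLOWED_FLOWS = {
    ("teller-workstations", "core-banking-api"),
    ("ops-jump-hosts", "core-banking-api"),
}


@dataclass(frozen=True)
class AccessRequest:
    user_authenticated: bool
    mfa_passed: bool
    device_compliant: bool
    source_segment: str
    target_segment: str


def allow(request: AccessRequest) -> bool:
    """Default deny: every contextual check must pass for access to be granted."""
    identity_ok = request.user_authenticated and request.mfa_passed
    flow_ok = (request.source_segment, request.target_segment) in ALLOWED_FLOWS
    return identity_ok and request.device_compliant and flow_ok


teller = AccessRequest(True, True, True, "teller-workstations", "core-banking-api")
# Inside the perimeter, but no permitted flow from this segment.
building_sensor = AccessRequest(True, True, True, "building-management", "core-banking-api")
# Permitted flow, but the device fails its posture check.
unmanaged_laptop = AccessRequest(True, True, False, "teller-workstations", "core-banking-api")

decisions = [allow(teller), allow(building_sensor), allow(unmanaged_laptop)]
```

The second request shows the contrast with the castle model: being on the internal network earns no access on its own.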
Microservices: Microservices and container-based architectures can offer IT benefits but create a number of management challenges. These include the sheer number of services, their interdependencies and interconnectedness, and the difficulty of defining the perimeter. Coupled with limitations in existing controls, this leads the FFIEC to state that all microservices should be treated as non-trustworthy at this time.
Artificial Intelligence and Machine Learning: In a data-driven world, computational heuristics and advanced systems must be used simply to manage the sheer amount of data generated. Here machine learning and artificial intelligence have shown unique strengths in helping financial systems detect fraudulent behavior and identify cybersecurity risk. However, there are concerns, mostly around handling and managing the vast amount of data stored today and access to it. There are also challenges with a lack of transparency to some users in the process, and with the risk of false positives (and negatives), that institutions must address.
Internet of Things: This section applies to IoT-based devices, including the growing number of building management systems. These will become more prominent, and operationally required, as governments adopt upcoming sustainability requirements. There are numerous risk challenges with these devices, including the software stack, a lack of intrinsic IT controls, and patching concerns. The proliferation in device type, access method, OS version, and cloud requirements results in a segmentation requirement to protect core IT infrastructure.
The FFIEC AIO guide is well documented, and it includes guidance for evaluators that an organization can draw on when evaluating its internal systems. At 100 pages, although well organized into sections, it is too exhaustive for every person in leadership to understand the nuances. The goals of this blog were to provide a high-level view of the guidance and a reference for leadership to get additional information if there are any questions. The next blog will discuss some technologies that can help scale and address some of these challenges.