
Arizona Public Service Company received $132,000 Penalty for Self-reported Violation of TOP-001-4 R13

Summary of NERC Penalties

Region: NERC
When: 4/17/19
Entity: Arizona Public Service Company
Reason: Lack of coordinated efforts in AZPS’s incident response
Violations: TOP-001-4, R13
Compliance Area: Transmission Operator (TOP) Standard
Penalty Amount: $132k

Each Transmission Operator shall have, and make available upon request, evidence to show it ensured that a Real-time Assessment was performed at least once every 30 minutes. This evidence could include, but is not limited to, dated computer logs showing the times the assessment was conducted, dated checklists, or other evidence (R13).

Arizona Public Service Company (AZPS) received a $132,000 penalty for a self-reported violation of TOP-001-4, “Transmission Operations”, R13. Due to configuration issues and uncoordinated troubleshooting efforts, AZPS’s Energy Management System (EMS) lost visibility to the remote terminal unit (RTU) and Inter-Control Center Communications Protocol (ICCP) data between AZPS and its Reliability Coordinator (RC) and neighboring entities. The data lapse impaired the Real-time Contingency Analysis (RTCA) tool, which stopped solving once the RTU data was lost. The RTCA is one component of AZPS’s Real-time Assessment (RTA), and AZPS System Operators (SOs) and the RC lost the ability to perform a manual RTA in accordance with its procedures.

TOP-001 Requirement R13, as it currently exists, was written in response to a Notice of Proposed Rulemaking (NOPR) concerning Real-time analysis responsibilities for Transmission Operators. The Transmission Operator’s Operating Plan must describe how to perform the Real-time Assessment. The Operating Plan should contain instructions for performing the Operational Planning Analysis and the Real-time Assessment, with detailed instructions and timing requirements for adapting to conditions in which processes, procedures, and automated software systems (if used) are not available. These could include, for example, an indication that no actions may be required if system conditions have not changed significantly, and that a previous Contingency analysis or Real-time Assessment may be used in such a situation.
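The 30-minute evidence requirement described above lends itself to a simple automated check over dated log entries. The Python sketch below is illustrative only: the function name `find_rta_gaps`, the log layout, and the sample timestamps are assumptions made for this example and are not taken from TOP-001-4 or from AZPS’s tooling. It flags every pair of consecutive assessments that are more than 30 minutes apart.

```python
from datetime import datetime, timedelta

# 30-minute limit taken from the requirement text above.
MAX_GAP = timedelta(minutes=30)


def find_rta_gaps(timestamps, max_gap=MAX_GAP):
    """Return (previous, next, gap) tuples for consecutive assessments
    that are more than max_gap apart."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = nxt - prev
        if gap > max_gap:
            gaps.append((prev, nxt, gap))
    return gaps


# Hypothetical assessment log (made-up times, not AZPS data).
sample_log = [
    datetime(2024, 1, 1, 12, 0),
    datetime(2024, 1, 1, 12, 30),
    datetime(2024, 1, 1, 13, 15),  # 45 minutes after the previous entry
    datetime(2024, 1, 1, 13, 40),
]

gaps = find_rta_gaps(sample_log)
for prev, nxt, gap in gaps:
    minutes = int(gap.total_seconds() // 60)
    print(f"Gap of {minutes} minutes between {prev:%H:%M} and {nxt:%H:%M}")
```

A check like this only shows whether the logged evidence is spaced correctly; it does not say whether each logged assessment was itself valid.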

Description

On April 16, 2019, AZPS replaced a pair of non-CIP Energy Management System (EMS) frame firewalls as part of a planned outage change at the primary Control Center (PCC). Due to configuration issues and uncoordinated troubleshooting efforts, on April 17, 2019, at 7:38 p.m., the EMS lost visibility to remote terminal unit (RTU) and Inter-Control Center Communications Protocol (ICCP) data between AZPS and its Reliability Coordinator (RC) and neighboring entities. The data lapse impaired the Real-time Contingency Analysis (RTCA) tool, which stopped solving once the RTU data was lost. The RTCA is one component of AZPS’s Real-time Assessment (RTA), and AZPS System Operators (SOs) and the RC lost the ability to perform a manual RTA in accordance with its procedures. Thus, AZPS was unable to perform an RTA, which requires an examination of existing and expected system conditions, conducted by collecting and reviewing immediately available data.

In addition, AZPS notified its neighboring entities of the condition and specifically confirmed the loss of the ICCP data with its RC and other neighboring entities. AZPS’s Balancing Authority (BA) function notified its generation units to change to local control and maintain their current output. Because the SOs did not have visibility of the system including RTCA, the PCC monitored the system in Real-time via field personnel and managed the BA responsibilities through manual checkout every 15 minutes along with adjacent BAs.

AZPS activated its CIP-009-6 Recovery Plan, identified lessons learned, and updated the Recovery Plan to include a peer check. AZPS completed the required steps for CIP-009-6 R2 and R3, thus CIP-009-6 R1-R3 were not at issue. AZPS used a CIP head end firewall in an attempt to troubleshoot during the instant violation, but the failover did not require Change Management protocols. Therefore, WECC determined that AZPS did not have any additional instances of potential noncompliance with CIP-009-6 R1-R3 or CIP-010-2 R1 associated with the instant violation.

At 8:21 p.m., AZPS restored connectivity between the router and the new EMS frame firewalls. At 9:44 p.m., the EMS regained visibility and the RTCA began solving with the return of the RTU data. At 10:30 p.m., the ICCP services were restored for EMS, improving the functionality of the RTCA solution.

The EMS outage began at 7:38 p.m. and ended at 9:44 p.m., for a total of two hours and six minutes. The ICCP outage began at 7:38 p.m. and ended at 10:29 p.m., for a total of two hours and 51 minutes. The RTCA did not solve during the EMS outage of two hours and six minutes, and the RTCA results were not verified during the ICCP outage of two hours and 51 minutes.

The root cause of the violation was attributed to a lack of coordinated efforts in AZPS’s incident response when it did not follow established decision trees and its incident management procedure, which resulted in an EMS outage, ICCP outage and RTCA not solving.

The violation began on April 17, 2019 at 8:09 p.m., when AZPS did not ensure that an RTA was performed through its RTCA, and ended on April 17, 2019 at 11:35 p.m., when AZPS was able to perform an RTA, for a total of three hours and 26 minutes (206 minutes over the 30-minute requirement).
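The durations quoted for the EMS outage, the ICCP outage, and the violation can be reproduced from the stated start and end times. The short Python sketch below is a worked check of that arithmetic; the helper `minutes_between` and the dictionary layout are illustrative choices, and the times are the ones given in the text, converted to 24-hour format.

```python
from datetime import datetime

DAY = "2019-04-17"  # all events occurred on the same evening
FMT = "%Y-%m-%d %H:%M"


def minutes_between(start, end, day=DAY):
    """Whole minutes between two same-day clock times given as HH:MM."""
    t0 = datetime.strptime(f"{day} {start}", FMT)
    t1 = datetime.strptime(f"{day} {end}", FMT)
    return int((t1 - t0).total_seconds() // 60)


# Start and end times as stated in the description.
intervals = {
    "EMS outage": ("19:38", "21:44"),
    "ICCP outage": ("19:38", "22:29"),
    "Violation": ("20:09", "23:35"),
}

durations = {name: minutes_between(s, e) for name, (s, e) in intervals.items()}

for name, minutes in durations.items():
    hours, rem = divmod(minutes, 60)
    print(f"{name}: {hours} h {rem:02d} min ({minutes} minutes)")
```

The results are 126 minutes (2 h 06 min) for the EMS outage, 171 minutes (2 h 51 min) for the ICCP outage, and 206 minutes (3 h 26 min) for the violation, matching the figures in the text.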

Mitigation

To mitigate this violation, AZPS has performed the following actions:

  • restored its data capabilities and ensured AZPS’s ability to perform a successful RTA;
  • held a Stand Down on the IT Computer Event Response Plan and the new IT End User communication plan to reinforce staff expectations, which included implementing a Standing Order that all infrastructure changes to “Mission / Business / Safety critical systems must have change command”;
  • conducted formal cross-training between all parties involved in supporting the EMS application;
  • created an asset change management process that includes:
    • verifying and documenting the current function and configuration of an in-service device;
    • mapping the existing configuration of the in-service device to the replacement device;
    • developing and/or updating the applicable System Test Plan and test cases to meet the new device requirements and exercise the intended functionality;
    • identifying the monitoring and alerting functionality for the new device; and
    • a quality assurance process for all key tasks associated with this process and its procedure(s);
  • moved outdated host files out of the primary directory; and
  • developed a procedure for when host files are to be pushed, which requires that:
    • a peer check be performed within the Real-Time System Support (RTSS) team;
    • a peer check be performed with the EMS Duty Engineer;
    • once the push is complete, the RTSS validate that the push was successful; and
    • the EMS Duty Engineer validate that the push was successful.
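The host file push procedure above is essentially a gate: two peer checks before the push and two validations after it. The following Python sketch shows one way such a gate could be modeled. It is a hypothetical illustration, not AZPS’s implementation; the class `HostFilePush` and all of its field and method names are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class HostFilePush:
    """Hypothetical record of one host file push and its sign-offs."""
    rtss_peer_check: bool = False           # peer check within the RTSS team
    duty_engineer_peer_check: bool = False  # peer check with the EMS Duty Engineer
    pushed: bool = False
    rtss_validated: bool = False            # RTSS confirms the push succeeded
    duty_engineer_validated: bool = False   # EMS Duty Engineer confirms the push succeeded

    def push(self):
        """Allow the push only after both peer checks are recorded."""
        if not (self.rtss_peer_check and self.duty_engineer_peer_check):
            raise PermissionError("both peer checks are required before a push")
        self.pushed = True

    def is_complete(self):
        """The procedure is complete once the push and both validations are done."""
        return self.pushed and self.rtss_validated and self.duty_engineer_validated


# Example run through the procedure.
change = HostFilePush(rtss_peer_check=True, duty_engineer_peer_check=True)
change.push()
change.rtss_validated = True
change.duty_engineer_validated = True
print("Procedure complete:", change.is_complete())
```

Modeling the sign-offs as explicit fields makes a skipped step visible in the record rather than relying on memory during an incident.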

About Certrec:
Certrec is a leading provider of regulatory compliance solutions for the energy industry with the mission of helping ensure a stable, reliable, bulk electric supply. Since 1988, Certrec’s SaaS applications and consulting expertise have helped hundreds of power-generating facilities manage their regulatory compliance and reduce their risks.

Certrec’s engineers and business teams bring a cumulative 1,500 years of working experience in regulatory areas of compliance, engineering, and operations, including nuclear, fossil, solar, wind facilities, and other Registered Entities generation and transmission.

Certrec has helped more than 200 generating facilities establish and maintain NERC Compliance Programs. We manage the entire NERC compliance program for 80+ registered entities in the US, Canada, and Mexico that trust us to decrease their regulatory and reputational risk. Certrec is ISO/IEC 27001:2013 certified and has successfully completed annual SOC 2 Type 2 examinations.

For press and media inquiries, please contact marketing@certrec.com.
