Handling incidents

Best practices for incident detection and response.

The information in this page is for guidance only. It is not a complete list of all security measures you should take, and should not be taken as definitive advice.

To prepare for and respond to security incidents, you should implement:

Effective monitoring and logging systems, to collect extensive logs of system events, network traffic, and user activities, and to detect security events in real time.
Workflows to alert and respond to security incidents, including thorough post-incident investigations and reviews.

Requirements

Before you begin, check if the information on this page applies to you.

Requirement	Description
Integration type	The information on this page is relevant for all Adyen integrations.

Monitoring and logging

A monitoring and logging system should detect security events in real time, enable a quick response to any security incidents, and support comprehensive forensic analysis of incidents. Security events should be detected across the entire IT environment.

For guidelines, see the monitoring and logging resources of NIST 800-94, ISO/IEC 27001, and PCI-DSS.

Comprehensive logging and data collection

Logging is the cornerstone of security monitoring. Best practices are to collect extensive logs of system events, network traffic, and user activities, making sure to log both successful and unsuccessful attempts to gain access. This is especially relevant when critical assets like databases, apps, and admin accounts are involved.

Some important aspects of an efficient logging strategy are:

Detailed audit logs: collect logs from applications, network devices, databases, and system events. In other words, collect logs from all layers. Make sure the logs include user authentication attempts, privilege escalations, configuration changes, and abnormal data access patterns.
Centralized log management: use a solution such as a Security Information and Event Management (SIEM) system to consolidate logs from various sources. A SIEM system simplifies the analysis and also ensures the logs are secure and remain available for later use.
Log management also means encrypting the logs both in transit and at rest. Make sure the SIEM system can work with encrypted logs.
Event correlation and analysis: to identify complex attack patterns or persistent threats, you need to correlate log data across multiple systems. A SIEM system can help with that. It integrates log data from intrusion detection systems, firewalls, and user behavior analytics, and thus provides real-time insights into potential threats.
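
As a rough illustration of event correlation, the following Python sketch flags source IP addresses that show up in more than one log source within a short time window. The event fields (time, source, ip, action), the five-minute window, and the function name correlate_by_ip are assumptions made for this example. A real SIEM system uses its own schema and far richer rules.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def correlate_by_ip(events, window=timedelta(minutes=5), min_sources=2):
    """Return IPs reported by at least `min_sources` log sources within `window`."""
    by_ip = defaultdict(list)
    for event in events:
        by_ip[event["ip"]].append(event)

    flagged = {}
    for ip, ip_events in by_ip.items():
        ip_events.sort(key=lambda e: e["time"])
        for i, first in enumerate(ip_events):
            # Collect the distinct sources that saw this IP within the window.
            sources = set()
            for later in ip_events[i:]:
                if later["time"] - first["time"] <= window:
                    sources.add(later["source"])
            if len(sources) >= min_sources:
                flagged[ip] = sorted(sources)
                break
    return flagged


# Hypothetical, already-normalized events from three log sources.
t0 = datetime(2024, 1, 1, 12, 0, 0)
events = [
    {"time": t0, "source": "firewall", "ip": "203.0.113.7", "action": "port_scan"},
    {"time": t0 + timedelta(minutes=2), "source": "auth", "ip": "203.0.113.7", "action": "login_failed"},
    {"time": t0 + timedelta(minutes=3), "source": "auth", "ip": "198.51.100.4", "action": "login_failed"},
    {"time": t0 + timedelta(hours=2), "source": "database", "ip": "198.51.100.4", "action": "bulk_read"},
]

flagged = correlate_by_ip(events)
print(flagged)
```

In this sample data, only the first IP address is flagged, because the firewall and the authentication log both report it within five minutes. The second address appears in two sources, but two hours apart.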

Real-time monitoring and intrusion detection

In addition to logging, real-time monitoring plays a crucial role in the early detection of security incidents. You can use an Intrusion Detection and Prevention System (IDPS) to monitor network traffic and host-based events, and flag suspicious behavior. In some cases, an IDPS can also automatically prevent attacks.

Some important aspects of an IDPS-based monitoring strategy are:

Diverse detection capabilities: for full coverage, you should combine a network-based IDPS with a host-based IDPS. A network-based IDPS monitors traffic and flags unusual patterns, such as port scanning or unauthorized access attempts. A host-based IDPS detects unusual events at the system level, such as unauthorized file changes or abnormal processes.
Anomaly- and signature-based detection: combine anomaly-based detection, which flags unusual activity, with signature-based detection, which identifies known threats using predefined "signatures".
Automation and alerting: set up automated alerts for critical incidents, prioritized by severity. Automated preventive measures in response to critical incidents should also be part of this.
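
To make the difference between the two detection styles concrete, here is a minimal Python sketch. The two regular-expression signatures, the z-score threshold of 3, and the function names are illustrative assumptions, not rules taken from any particular IDPS product.

```python
import re
from statistics import mean, pstdev

# Signature-based: hypothetical patterns for two well-known attack types.
SIGNATURES = {
    "sql_injection": re.compile(r"(?i)union\s+select"),
    "path_traversal": re.compile(r"\.\./"),
}


def match_signatures(request_line):
    """Return the names of all signatures found in the request line."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(request_line)]


def is_anomalous(value, history, threshold=3.0):
    """Anomaly-based: flag a value that lies far from the historical mean."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Requests per minute observed for one client during normal operation.
baseline = [100, 110, 95, 105, 90]

print(match_signatures("GET /search?q=1 UNION SELECT password FROM users"))
print(is_anomalous(400, baseline))
print(is_anomalous(108, baseline))
```

The signature check only catches attacks that someone has already described, while the baseline check can flag a previously unseen pattern at the cost of more false positives. That trade-off is the reason for combining the two.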

Log management and retention for compliance

To ensure compliance with regulations and to make incident response and post-incident investigation easier, good log management is vital.

Some important aspects of log management are:

Retention policies: logs must be retained in accordance with legal, regulatory, and operational requirements. PCI-DSS requires retaining audit log history for at least 12 months, with at least the most recent three months immediately available for analysis. Regular archiving ensures older logs remain accessible.
Secure storage: log data should be protected from data loss and tampering through encryption, access controls, and regular backups. The backup logs should be stored separately and securely.
Continuous monitoring of log integrity: by applying checksums or other cryptographic methods you can regularly verify that the logs have not been tampered with.
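
One possible way to verify log integrity is a keyed hash chain, in which each log entry gets an HMAC tag that also covers the tag of the previous entry. The sketch below uses Python's standard hmac and hashlib modules. The function names and the demo key are assumptions for illustration. In practice, the key must be stored separately from the logs, for example in a key management system.

```python
import hashlib
import hmac


def chain_logs(lines, key):
    """Return (line, tag) pairs; each tag covers the line and the previous tag."""
    previous = b""
    chained = []
    for line in lines:
        tag = hmac.new(key, previous + line.encode("utf-8"), hashlib.sha256).hexdigest()
        chained.append((line, tag))
        previous = tag.encode("ascii")
    return chained


def first_tampered_index(chained, key):
    """Return the index of the first entry whose tag fails, or None if all verify."""
    previous = b""
    for index, (line, tag) in enumerate(chained):
        expected = hmac.new(key, previous + line.encode("utf-8"), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, tag):
            return index
        previous = tag.encode("ascii")
    return None


key = b"demo-key-only"  # assumption: a real key comes from a secrets store
chained = chain_logs(["user=alice login ok", "user=bob login failed", "config changed"], key)

# Simulate an attacker rewriting the second entry without knowing the key.
tampered = list(chained)
tampered[1] = ("user=bob login ok", chained[1][1])

print(first_tampered_index(chained, key))
print(first_tampered_index(tampered, key))
```

Because each tag depends on the previous one, changing or removing an entry in the middle of the log invalidates the chain from that point on. This sketch does not detect entries cut from the end of the log. Covering that case requires storing the latest tag or an entry count somewhere the attacker cannot reach.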

Integration of SANS monitoring guidelines

The SysAdmin, Audit, Network, Security (SANS) Institute recommends a layered approach to continuous monitoring that involves not just known attack paths but also unusual behaviors that can indicate a security breach.

Some important SANS recommendations are:

Comprehensive asset coverage: logging and monitoring should cover all critical assets, including assets in the cloud and third-party services.
Behavioral baselines: these help to identify anomalies that do not match known attack methods or signatures.
Proactive incident response: integrating automated responses and workflows into the monitoring systems helps security teams respond to incidents in real time.

Alerting and incident response

To mitigate the impact of security incidents, it is vital to detect, prioritize, and respond to alerts. Building effective alerting mechanisms and response workflows requires a combination of automation, predefined processes, and expertise.

Setting up an effective alerting system

Alerting is the bridge between detection and response. To avoid "alert fatigue", your system should not only raise alerts but also prioritize them.

Some important aspects of an effective alerting system are:

Automated threat detection: SIEM systems and IDPSs, described in the monitoring and logging section, are instrumental in generating automated alerts based on predefined thresholds. These systems filter out false positives and focus on high-fidelity alerts that require immediate investigation.
Risk-based prioritization: not all threats warrant the same level of response. According to ISO/IEC 27001, you should define risk categories for assets, for example, based on attack vector, system vulnerability, and potential business impact. This allows you to rank threats and ensure that high-priority incidents are escalated first.
Real-time notifications: make sure alerts are delivered through multiple channels, like email, SMS, or integrated communication tools. For high-severity cases, you can set up alerts to trigger immediate automatic containment actions such as blocking IP addresses or isolating network segments.
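
Risk-based prioritization can be sketched as a simple scoring function over the three factors mentioned above. The 1-to-5 scales, the scoring formula, and the thresholds in this Python example are hypothetical. They are not prescribed by ISO/IEC 27001 and should be replaced by the outcome of your own risk assessment.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    name: str
    exposure: int       # attack vector: 1 (internal only) to 5 (internet-facing)
    vulnerability: int  # 1 (fully patched) to 5 (known unpatched weakness)
    impact: int         # business impact: 1 (negligible) to 5 (critical)


def risk_score(alert):
    # Hypothetical formula: business impact scales the two likelihood factors.
    return alert.impact * (alert.exposure + alert.vulnerability)


def priority(score):
    # Hypothetical thresholds on a scale from 2 to 50.
    if score >= 30:
        return "critical"
    if score >= 15:
        return "high"
    return "low"


def triage(alerts):
    """Return (name, score, priority) tuples, highest risk first."""
    ranked = sorted(alerts, key=risk_score, reverse=True)
    return [(a.name, risk_score(a), priority(risk_score(a))) for a in ranked]


queue = triage([
    Alert("Port scan on test server", exposure=5, vulnerability=2, impact=1),
    Alert("Admin login from new country", exposure=4, vulnerability=3, impact=5),
    Alert("Failed logins on payment database", exposure=3, vulnerability=3, impact=4),
])

for name, score, level in queue:
    print(f"{level:8} {score:3} {name}")
```

With these sample values, the admin login alert is ranked first and the port scan on a low-impact test server last, even though the port scan has the highest exposure.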

Developing incident response workflows

An incident response workflow is a series of coordinated actions to contain, eradicate, and recover from security incidents. You should set up, document, and test such a workflow.

Some important aspects of an incident response workflow are:

Preparation: this involves developing and maintaining an incident response plan (IRP), training staff, and establishing communication channels. The plan should outline the roles and responsibilities of the key stakeholders including the incident response team (IRT). The IRT can consist of representatives from IT, legal, compliance, and business units, and must be equipped with forensic analysis software, containment mechanisms, and other necessary tools.
To minimize confusion during high-stress incidents, make sure your plan includes playbooks, which are step-by-step guides to handle specific types of incidents. You should test these playbooks regularly.
Detection and analysis: after an alert is triggered, you need to determine if there is an actual security incident. Complex threats like advanced persistent threats can require deep analysis of event logs, traffic patterns, and system behavior.
If an incident is confirmed, it must be categorized in terms of severity and possible impact on business operations. Such a categorization helps in activating the correct response protocols and escalating when necessary.
Containment, eradication, and recovery: you must contain a confirmed incident as fast as possible, to prevent further damage.
First, you take temporary containment measures, such as blocking malicious IP addresses or compromised user accounts. Then, you identify and fix the root cause. This can involve patching vulnerabilities, removing malware, or deactivating accounts. And finally, you restore the systems to normal operation. You should do this in phases, while monitoring for any lingering threats.
Post-incident review and lessons learned: after an incident is resolved, you need to thoroughly review it for ways to improve your incident response workflow and training programs.
All stakeholders should be involved in a formal debriefing of what worked, what did not work, and how the incident could have been handled better. This enables you to refine the playbooks and the detection systems.
If there are any follow-up actions, these should be formally assigned to teams, to ensure accountability for addressing all aspects of the incident.
You should also make sure that timelines, actions taken, and outcomes related to the incident are documented, so that they can be shared with key stakeholders and used for compliance and future reference.
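
The workflow phases above can be modeled as a small state machine that also records a timeline for the post-incident review. The following Python sketch is one possible shape. The phase names, the Incident class, and the rule that phases cannot be skipped are assumptions for this example, not part of any standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Phase(Enum):
    DETECTED = 1
    CONFIRMED = 2
    CONTAINED = 3
    ERADICATED = 4
    RECOVERED = 5
    REVIEWED = 6


@dataclass
class Incident:
    title: str
    severity: str
    phase: Phase = Phase.DETECTED
    timeline: list = field(default_factory=list)

    def advance(self, next_phase, note):
        """Move to the next phase and record when and why."""
        if next_phase.value != self.phase.value + 1:
            raise ValueError(f"cannot move from {self.phase.name} to {next_phase.name}")
        self.phase = next_phase
        self.timeline.append((datetime.now(timezone.utc), next_phase.name, note))


incident = Incident("Suspicious admin login", severity="high")
incident.advance(Phase.CONFIRMED, "Login came from a known malicious IP address")
incident.advance(Phase.CONTAINED, "Account disabled and IP address blocked")

for timestamp, phase_name, note in incident.timeline:
    print(timestamp.isoformat(), phase_name, note)
```

Rejecting skipped phases is a deliberate design choice in this sketch: it makes it harder to restore systems before the root cause has been fixed, and the recorded timeline can feed directly into the documentation of the incident.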

Automating incident response

While human oversight remains essential, automating certain aspects of incident response enhances your ability to react to threats in real time.

Opportunities for automation include:

Automated playbook execution: for routine incidents, automated playbooks can initiate predefined containment actions and notify relevant personnel. This reduces both the pressure on the security teams and the likelihood of human error in the early stages of incident response.
Incident escalation and collaboration tools: you can use an automated system to prioritize incidents and route them to the correct response team. Ideally, the whole incident response flow is integrated with collaboration tools so that all stakeholders are informed and aligned. For this, you can use an incident management platform. Such platforms track the progress of the response and provide real-time updates.
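
As a minimal sketch of automated playbook execution, the Python example below maps incident types to ordered lists of steps and escalates anything it does not recognize to a human analyst. The incident types, step functions, and field names are hypothetical, and the steps only record what they would do. A real implementation would call your firewall, identity provider, and paging tools.

```python
def block_ip(incident, actions):
    actions.append(f"block IP {incident['ip']}")


def disable_account(incident, actions):
    actions.append(f"disable account {incident['user']}")


def notify_on_call(incident, actions):
    actions.append("notify on-call team")


# Hypothetical playbooks: incident type -> ordered containment steps.
PLAYBOOKS = {
    "brute_force": [block_ip, notify_on_call],
    "compromised_account": [disable_account, notify_on_call],
}


def run_playbook(incident):
    """Run the playbook for the incident type and return the actions taken."""
    actions = []
    steps = PLAYBOOKS.get(incident["type"])
    if steps is None:
        # No playbook for this type: keep a human in the loop.
        actions.append("escalate to human analyst")
        return actions
    for step in steps:
        step(incident, actions)
    return actions


print(run_playbook({"type": "brute_force", "ip": "203.0.113.7"}))
print(run_playbook({"type": "compromised_account", "user": "alice"}))
print(run_playbook({"type": "unknown_malware"}))
```

Returning the list of actions keeps the automation auditable, because the same list can be attached to the incident record for the post-incident review.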