Web Application Development
Date Range Validation in Web Applications
Problem
Applications requiring date range inputs need validation to ensure dates are valid, properly formatted, and logically consistent. Users may enter invalid dates, dates in the wrong format, or dates that create logical conflicts (such as an end date before a start date). Without validation, invalid data enters the database, causing reporting errors and operational problems.
Solution
Implement both client-side and server-side validation. Client-side validation provides immediate feedback using dynamic actions that check date format and logical relationships as users type. Server-side validation using PL/SQL expressions ensures data integrity even if client-side checks are bypassed. Use APEX's built-in date picker components with format masks, minimum/maximum date constraints, and validation error messages that clearly explain requirements.
Implementation
Server-Side Validation:
-- PL/SQL Expression Validation
:P_END_DATE >= :P_START_DATE
Client-Side Constraint: set the end date picker's Minimum Date to the start date item (for example &P_START_DATE.) so the picker prevents out-of-range selections; optionally add a Change dynamic action on the start date with a Set Value action to keep the end date in sync.
Admin Override for Restricted Operations
Problem
Systems typically restrict certain operations when capacity is reached or deadlines pass. However, administrators occasionally need to perform these restricted operations for legitimate exceptions—late entries, special accommodations, or administrative corrections. Without override capability, administrators must manually manipulate database records, creating security risks and data integrity issues.
Solution
Implement role-based authorization that allows administrators to bypass operational restrictions while maintaining audit trails. Create conditional logic that checks both operation status and user role before blocking actions. Use APEX authorization schemes to show admin-only buttons or form fields. Add database triggers or application logic to log all override actions with administrator ID and timestamp for accountability.
Implementation
Authorization Scheme:
-- Create authorization scheme checking user role
SELECT COUNT(*)
FROM user_roles
WHERE username = :APP_USER
AND role_name = 'ADMIN'
Conditional Operation Logic:
-- Allow operation if status open OR user is admin
IF (record_status = 'OPEN' AND current_count < max_capacity)
OR is_admin(:APP_USER) THEN
-- Process operation
INSERT INTO operations...
-- Log override if admin bypassed restrictions
IF record_status = 'CLOSED' THEN
INSERT INTO admin_override_log...
END IF;
END IF;
UI Element Authorization: Apply authorization scheme to admin-only buttons and form regions using Server-side Condition type "Authorization Scheme".
Duplicate Entry Prevention
Problem
Without duplicate detection, users can submit the same record multiple times, either accidentally through multiple form submissions or intentionally. This creates inaccurate counts, wastes limited capacity, complicates tracking, and generates data inconsistencies in reporting. Manual cleanup of duplicate entries is time-consuming and error-prone.
Solution
Implement database-level unique constraints combined with application-level validation checks. Create a composite unique constraint on the table using relevant ID columns. Add pre-insert validation in APEX that queries existing records before allowing submission. Provide clear error messaging when duplicates are detected. Use database triggers as an additional enforcement layer to prevent duplicates even if application logic is bypassed.
Implementation
Database Unique Constraint:
ALTER TABLE user_operations
ADD CONSTRAINT uk_user_operation
UNIQUE (user_id, operation_id);
APEX Validation:
-- PL/SQL Function Body returning Boolean
DECLARE
v_count NUMBER;
BEGIN
SELECT COUNT(*)
INTO v_count
FROM user_operations
WHERE user_id = :P_USER_ID
AND operation_id = :P_OPERATION_ID;
RETURN v_count = 0;
END;
Error Message: "This record already exists in the system."
Status Tracking with Checkbox Column
Problem
Administrators need a simple method to record status changes for tracked records. Manual status tracking through separate forms or spreadsheets is time-consuming and creates data synchronization issues. Without integrated tracking, generating status reports requires manual data reconciliation, and historical records may be incomplete or inaccurate.
Solution
Add a status flag column (stored as 'Y'/'N') to the tracking table. Implement an interactive grid or report in APEX that displays all records with an editable checkbox column for status. Use APEX's built-in row-level DML operations to update status directly from the grid. Restrict checkbox editing to administrators using authorization schemes. Include timestamp and user tracking columns to maintain an audit trail of status modifications.
Implementation
Add Status Column:
ALTER TABLE operations
ADD (
completed VARCHAR2(1) DEFAULT 'N',
completed_by VARCHAR2(100),
completed_date TIMESTAMP
);
-- Add check constraint
ALTER TABLE operations
ADD CONSTRAINT chk_completed CHECK (completed IN ('Y', 'N'));
Interactive Grid Configuration:
- Source: SELECT user_name, operation_name, completed FROM operations (include the table's primary key column so the grid can issue DML)
- Completed column: Display as Checkbox, editable
- Enable Save button for DML operations
- Authorization: Admin role only
Audit Trail Trigger:
CREATE OR REPLACE TRIGGER trg_status_audit
BEFORE UPDATE OF completed ON operations
FOR EACH ROW
BEGIN
IF :NEW.completed = 'Y' AND :OLD.completed = 'N' THEN
:NEW.completed_by := V('APP_USER');
:NEW.completed_date := SYSTIMESTAMP;
END IF;
END;
Modal Dialog Implementation with Date Validation
Problem
Data entry workflows often require multiple input fields that clutter the main page interface. Presenting all creation fields inline reduces usability and makes the interface overwhelming. Additionally, users need immediate feedback when entering date ranges to prevent invalid data submission. Without modal dialogs, form validation errors require full page refreshes, disrupting the user workflow.
Solution
Implement APEX modal dialog pages for data entry that open in an overlay window without navigating away from the main page. Configure the modal to include date pickers with built-in validation. Use dynamic actions to enforce date logic client-side while the modal is open. Handle modal closure events to refresh the parent page's data list. Use APEX's dialog closed dynamic action to detect successful submissions and update the underlying data without full page reload.
Implementation
Create Modal Dialog Page:
- Page Mode: Modal Dialog
- Add items: P_RECORD_NAME, P_START_DATE, P_END_DATE
- Add Process: Insert record
- Add Branch: Close dialog on success
Open Modal Button (Parent Page):
-- Button action: Redirect to Page
-- Target: Page in this application
-- Page: [modal page number]
-- Clear Cache: [modal page number]
Dynamic Action on Start Date Change:
-- Declarative option: on P_END_DATE (Date Picker), set the Minimum Date attribute to &P_START_DATE.
-- so the picker will not offer dates earlier than the selected start date.
-- Optional dynamic action to keep the values in sync:
-- Event: Change
-- Selection Type: Item
-- Item: P_START_DATE
-- Action: Set Value
-- Set Type: JavaScript Expression
-- JavaScript: $v('P_START_DATE')
-- Affected Element: P_END_DATE
-- (resets the end date to the new start date whenever the start date changes)
Dialog Closed Event (Parent Page):
-- Event: Dialog Closed
-- Action: Refresh region containing data list
Dynamic Actions to Prevent End Date Before Start Date
Problem
Date range inputs require real-time validation to prevent logical errors. Users may select an end date that precedes the start date either through manual entry or date picker selection. Without immediate client-side feedback, users only discover the error after form submission, forcing them to correct and resubmit. This degrades user experience and increases support requests. Additionally, client-side validation alone is insufficient—malicious users can bypass browser controls.
Solution
Implement layered validation using APEX dynamic actions for immediate client-side feedback combined with server-side PL/SQL validation for security. Constrain the end date picker's minimum allowable value to the start date item (via the picker's Minimum Date attribute), and add a Change event dynamic action on the end date item that displays an inline error message when an invalid date is selected. Complement these with page-level PL/SQL validation that executes during form submission, preventing invalid data from reaching the database regardless of client-side manipulation.
Implementation
Dynamic Action 1: Constrain End Date Minimum
Simplest approach: set the P_END_DATE Date Picker's Minimum Date attribute to &P_START_DATE. so the constraint tracks the start date item.
Alternative Change dynamic action on P_START_DATE:
Event: Change [P_START_DATE]
True Action: Set Value
Set Type: JavaScript Expression
JavaScript: $v('P_START_DATE')
Affected Element: P_END_DATE
(resets the end date to the new start date whenever the start date changes)
Dynamic Action 2: Validate End Date
Event: Change [P_END_DATE]
Client-Side Condition: JavaScript expression
JavaScript: new Date($v('P_END_DATE')) < new Date($v('P_START_DATE'))
(assumes an ISO-style format mask such as YYYY-MM-DD so new Date() parses the values reliably)
True Action: Execute JavaScript Code
apex.message.showErrors([{type: 'error', location: 'inline', pageItem: 'P_END_DATE', message: 'End date cannot be before start date', unsafe: false}]);
False Action: Execute JavaScript Code
apex.message.clearErrors();
Server-Side Validation (Required):
-- Validation Type: PL/SQL Expression
-- Expression: :P_END_DATE >= :P_START_DATE
-- Error Message: "End date must be on or after start date"
-- Validation Location: On Submit - After Computations and Validations
Server-Side Validation for Date Logic
Problem
Client-side validation can be bypassed by disabling JavaScript, manipulating browser developer tools, or submitting data through API calls. Relying solely on client-side validation creates security vulnerabilities where invalid or malicious data reaches the database. Date logic violations—such as end dates before start dates, past dates for future events, or dates outside acceptable ranges—must be caught server-side to maintain data integrity regardless of how the data is submitted.
Solution
Implement comprehensive server-side validation using PL/SQL validation rules in APEX. Create multiple validation points that check date ranges, date ordering, and business logic constraints. Use PL/SQL expressions that return boolean values or validation functions that raise exceptions with specific error messages. Position validations to execute after computations but before DML operations. Ensure validation error messages clearly indicate the problem and required correction. Log validation failures for security monitoring and audit purposes.
Implementation
Basic Date Range Validation:
-- Validation Type: PL/SQL Expression
-- PL/SQL Expression:
:P_END_DATE >= :P_START_DATE
-- Error Message: "End date must be on or after start date"
-- Validation Execution Scope: On Submit - After Computations and Validations
Future Date Validation:
-- Ensure start date is not in the past
:P_START_DATE >= TRUNC(SYSDATE)
-- Error Message: "Start date cannot be in the past"
Complex Validation Function:
-- PL/SQL Function Body returning Boolean
-- Session state items are strings; convert explicitly (assumes a YYYY-MM-DD format mask, adjust to match your items)
DECLARE
v_start DATE := TO_DATE(:P_START_DATE, 'YYYY-MM-DD');
v_end DATE := TO_DATE(:P_END_DATE, 'YYYY-MM-DD');
v_valid BOOLEAN := TRUE;
BEGIN
-- Check date order
IF v_end < v_start THEN
v_valid := FALSE;
END IF;
-- Check minimum duration (at least 1 day)
IF (v_end - v_start) < 1 THEN
v_valid := FALSE;
END IF;
-- Check maximum advance scheduling (not more than 1 year out)
IF v_start > ADD_MONTHS(SYSDATE, 12) THEN
v_valid := FALSE;
END IF;
RETURN v_valid;
END;
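For unit-testing the same business rules outside the database, a minimal Python sketch can mirror them; this is a hypothetical helper, not part of APEX, and it assumes ISO-format date strings:
from datetime import date, timedelta

def validate_date_range(start_str, end_str, today=None):
    """Mirror the PL/SQL rules: end on/after start, at least one day long,
    start no more than roughly one year in the future."""
    today = today or date.today()
    start = date.fromisoformat(start_str)
    end = date.fromisoformat(end_str)
    if end < start:
        return False
    if (end - start).days < 1:
        return False
    # Approximate ADD_MONTHS(SYSDATE, 12) with 365 days for this sketch
    if start > today + timedelta(days=365):
        return False
    return True

# Example: a one-day range starting today passes all three rules
start = date.today()
print(validate_date_range(start.isoformat(), (start + timedelta(days=1)).isoformat()))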
Validation with Audit Logging:
CREATE OR REPLACE PROCEDURE log_validation_failure(
p_user VARCHAR2,
p_error VARCHAR2
) AS
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
INSERT INTO validation_audit_log
VALUES (SYSDATE, p_user, p_error);
COMMIT;
END;
Lookup Table Pattern Instead of Hardcoded CASE Statements
Problem
Hardcoding mappings in application logic using CASE statements creates maintenance overhead and requires code deployments for simple data changes. When new values are added or mapping conventions change, developers must modify PL/SQL code, test the changes, and deploy updates to production. This tight coupling between data and code slows operational agility, increases deployment risk, and prevents administrators from managing mappings independently through the application interface.
Solution
Create a dedicated lookup table that stores mappings as data rather than code. Replace CASE statement logic with simple SELECT queries against this table. Build an administrative interface in APEX that allows authorized users to add, modify, or remove mappings without code changes. Implement proper constraints and validation to ensure data integrity. This data-driven approach separates configuration from logic, enables self-service administration, and eliminates code deployments for routine mapping updates.
Implementation
Create Lookup Table:
CREATE TABLE value_mappings (
source_value VARCHAR2(100) PRIMARY KEY,
mapped_value VARCHAR2(100) NOT NULL,
active VARCHAR2(1) DEFAULT 'Y' CHECK (active IN ('Y','N')),
created_date TIMESTAMP DEFAULT SYSTIMESTAMP,
created_by VARCHAR2(100),
modified_date TIMESTAMP,
modified_by VARCHAR2(100)
);
Replace CASE Statement Logic:
-- Old approach with CASE
v_result := CASE :P_INPUT_VALUE
WHEN 'Value A' THEN 'RESULT_A'
WHEN 'Value B' THEN 'RESULT_B'
ELSE NULL
END;
-- New approach with lookup table
BEGIN
SELECT mapped_value INTO v_result
FROM value_mappings
WHERE source_value = :P_INPUT_VALUE
AND active = 'Y';
EXCEPTION
WHEN NO_DATA_FOUND THEN
RAISE_APPLICATION_ERROR(-20001, 'No mapping found for input value');
END;
Admin Maintenance Page:
- Create Interactive Grid on value_mappings table
- Enable inline editing for mapped_value column
- Add authorization scheme: Admin role only
- Include audit columns: created_by, created_date, modified_by, modified_date
Audit Trigger:
CREATE OR REPLACE TRIGGER trg_mapping_audit
BEFORE INSERT OR UPDATE ON value_mappings
FOR EACH ROW
BEGIN
IF INSERTING THEN
:NEW.created_by := V('APP_USER');
:NEW.created_date := SYSTIMESTAMP;
ELSIF UPDATING THEN
:NEW.modified_by := V('APP_USER');
:NEW.modified_date := SYSTIMESTAMP;
END IF;
END;
Security & Authentication
Oracle Certificate Import for OIDC Authentication
Problem
Oracle APEX applications using OpenID Connect (OIDC) for authentication must validate SSL/TLS certificates when making HTTPS requests to identity providers. Without proper certificate chain validation, APEX cannot establish secure connections to authentication endpoints. The UTL_HTTP package used by APEX for OIDC communication requires trusted certificates in an Oracle Wallet. Missing or incorrectly configured certificates cause authentication failures with "ORA-29024: Certificate validation failure" or similar errors, blocking all user logins.
Solution
Import the complete certificate chain for the OIDC provider into an Oracle Wallet and configure APEX to use it for HTTPS connections. Download the root and intermediate certificates from the identity provider's server. Use Oracle Wallet Manager or orapki command-line utility to create a wallet and import certificates. Configure sqlnet.ora to point to the wallet location and enable SSL_SERVER_DN_MATCH for hostname verification. Update APEX instance settings to use the wallet for outbound HTTPS connections.
Implementation
Download Certificate Chain:
-- Using OpenSSL to get certificate chain
openssl s_client -connect auth.provider.com:443 -showcerts
-- Export each certificate from the output to separate .pem files:
-- root_ca.pem, intermediate_ca.pem
Create Oracle Wallet:
-- Create wallet directory
mkdir -p /path/to/wallet
-- Create wallet (auto_login for automatic opening)
orapki wallet create -wallet /path/to/wallet -auto_login -pwd WalletPassword123
-- Add certificates to wallet
orapki wallet add -wallet /path/to/wallet -trusted_cert -cert root_ca.pem -pwd WalletPassword123
orapki wallet add -wallet /path/to/wallet -trusted_cert -cert intermediate_ca.pem -pwd WalletPassword123
-- Verify certificates in wallet
orapki wallet display -wallet /path/to/wallet
Configure sqlnet.ora:
WALLET_LOCATION =
(SOURCE =
(METHOD = FILE)
(METHOD_DATA =
(DIRECTORY = /path/to/wallet)
)
)
SSL_CLIENT_AUTHENTICATION = FALSE
SSL_SERVER_DN_MATCH = TRUE
Configure APEX for Wallet:
-- Set wallet path in APEX instance
BEGIN
APEX_INSTANCE_ADMIN.SET_PARAMETER(
p_parameter => 'WALLET_PATH',
p_value => 'file:/path/to/wallet'
);
COMMIT;
END;
-- Test HTTPS connection through the wallet
-- (MAKE_REST_REQUEST is a function returning CLOB, so capture its return value)
DECLARE
l_response CLOB;
BEGIN
l_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
p_url => 'https://auth.provider.com/.well-known/openid-configuration',
p_http_method => 'GET'
);
DBMS_OUTPUT.PUT_LINE(SUBSTR(l_response, 1, 200));
END;
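A quick sanity check from the application server can also confirm the discovery endpoint and certificate chain are reachable and generally trusted. This minimal Python sketch uses only the standard library and validates against the operating system trust store (not the Oracle Wallet), with the same placeholder provider URL as above:
import json
import urllib.request

# Placeholder provider URL from the steps above
url = 'https://auth.provider.com/.well-known/openid-configuration'
try:
    # urllib validates the server certificate against the OS trust store by default
    with urllib.request.urlopen(url, timeout=10) as resp:
        config = json.load(resp)
    print('Issuer:', config.get('issuer'))
    print('Token endpoint:', config.get('token_endpoint'))
except Exception as exc:
    # A certificate error here usually indicates an incomplete or untrusted chain
    print('TLS/HTTP check failed:', exc)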
Trust Store Configuration (sqlnet.ora) for Certificate Validation
Problem
Oracle database connections using SSL/TLS require proper trust store configuration to validate server certificates. When sqlnet.ora is misconfigured or missing trust store parameters, database clients cannot verify certificate chains, leading to either connection failures or insecure connections that skip validation. Applications relying on UTL_HTTP, UTL_TCP, or database links for external HTTPS calls fail with certificate validation errors. Without proper trust store configuration, organizations cannot enforce certificate-based security policies or prevent man-in-the-middle attacks.
Solution
Configure sqlnet.ora with proper WALLET_LOCATION, SSL_CLIENT_AUTHENTICATION, and SSL_SERVER_DN_MATCH parameters. Define the trust store directory containing Oracle Wallet files with trusted root and intermediate certificates. Enable server certificate validation and hostname verification to ensure connections only succeed when certificates are valid and match the target server. Set appropriate cipher suites and TLS versions to meet security requirements. Test configuration with both database connections and UTL_HTTP calls to verify proper certificate validation.
Implementation
Basic sqlnet.ora Configuration:
# Wallet location for trusted certificates
WALLET_LOCATION =
(SOURCE =
(METHOD = FILE)
(METHOD_DATA =
(DIRECTORY = /u01/app/oracle/admin/wallet)
)
)
# Client authentication not required for trust store
SSL_CLIENT_AUTHENTICATION = FALSE
# Enable hostname verification (recommended)
SSL_SERVER_DN_MATCH = TRUE
# Specify TLS version (1.2 minimum recommended)
SSL_VERSION = 1.2
# Cipher suites (optional - restricts to secure ciphers)
SSL_CIPHER_SUITES =
(TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256)
Test Certificate Validation:
-- Test UTL_HTTP with HTTPS endpoint
DECLARE
l_response UTL_HTTP.RESP;
l_text VARCHAR2(32767);
BEGIN
UTL_HTTP.SET_WALLET('file:/u01/app/oracle/admin/wallet');
l_response := UTL_HTTP.BEGIN_REQUEST(
url => 'https://www.example.com',
method => 'GET'
);
UTL_HTTP.READ_TEXT(l_response, l_text);
UTL_HTTP.END_RESPONSE(l_response);
DBMS_OUTPUT.PUT_LINE('Success: ' || SUBSTR(l_text, 1, 100));
EXCEPTION
WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('Error: ' || SQLERRM);
UTL_HTTP.END_RESPONSE(l_response);
END;
Verify Wallet Contents:
-- Display trusted certificates in wallet
orapki wallet display -wallet /u01/app/oracle/admin/wallet
-- Expected output shows:
-- Trusted Certificates:
-- Subject: CN=DigiCert Global Root CA
-- Subject: CN=Let's Encrypt Authority X3
Common Configuration Issues:
- Incorrect DIRECTORY path in WALLET_LOCATION
- Missing cwallet.sso file in wallet directory
- SSL_SERVER_DN_MATCH=TRUE with certificate CN mismatch
- Expired or incomplete certificate chain in wallet
- File permissions preventing Oracle process from reading wallet
Troubleshooting and Hardening Settings:
# Enable detailed SSL debugging (remove in production)
TRACE_LEVEL_CLIENT = 16
TRACE_DIRECTORY_CLIENT = /u01/app/oracle/admin/trace
# Network encryption (optional but recommended)
SQLNET.ENCRYPTION_CLIENT = REQUIRED
SQLNET.CRYPTO_CHECKSUM_CLIENT = REQUIRED
Custom Authentication Schemes in APEX
Problem
Default APEX authentication schemes (Application Express Accounts, Database Accounts) don't integrate with enterprise identity management systems or support custom authentication workflows. Organizations using LDAP, Active Directory, SAML, or custom authentication APIs cannot authenticate users without building custom integration logic. Without proper authentication scheme implementation, applications either operate with insecure workarounds or require manual user account management that doesn't sync with enterprise identity sources. Session management, user attribute propagation, and single sign-on capabilities remain unavailable.
Solution
Create custom authentication schemes in APEX that integrate with external identity providers through PL/SQL authentication functions. Implement the authentication function interface that APEX expects, including credential validation, session establishment, and logout handling. Use UTL_HTTP or web service calls to validate credentials against external systems. Store minimal session data in APEX session state while retrieving user attributes from the identity provider. Configure post-authentication procedures to populate application-specific user context. Implement proper error handling for authentication failures and timeout scenarios.
Implementation
Create Authentication Function:
CREATE OR REPLACE FUNCTION custom_auth_function (
p_username IN VARCHAR2,
p_password IN VARCHAR2
) RETURN BOOLEAN
IS
l_valid BOOLEAN := FALSE;
l_response CLOB;
BEGIN
-- Call external authentication API
l_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
p_url => 'https://auth.company.com/validate',
p_http_method => 'POST',
p_body => JSON_OBJECT(
'username' VALUE p_username,
'password' VALUE p_password
)
);
-- Parse response
IF JSON_VALUE(l_response, '$.authenticated') = 'true' THEN
l_valid := TRUE;
-- Store user attributes in session
APEX_UTIL.SET_SESSION_STATE(
p_name => 'USER_EMAIL',
p_value => JSON_VALUE(l_response, '$.email')
);
APEX_UTIL.SET_SESSION_STATE(
p_name => 'USER_ROLE',
p_value => JSON_VALUE(l_response, '$.role')
);
END IF;
RETURN l_valid;
EXCEPTION
WHEN OTHERS THEN
APEX_DEBUG.ERROR('Authentication error: ' || SQLERRM);
RETURN FALSE;
END custom_auth_function;
Configure Authentication Scheme in APEX:
- Navigate to: Shared Components → Authentication Schemes
- Create → Based on a pre-configured scheme → Custom
- Authentication Function: custom_auth_function
- Session Not Valid: Redirect to login page
- Session Valid: Use current session
Post-Authentication Procedure:
CREATE OR REPLACE PROCEDURE post_auth_procedure
IS
l_user_id NUMBER;
BEGIN
-- Log authentication event
-- (standalone objects cannot use :APP_USER binds, so read session context via V())
INSERT INTO auth_audit_log (
username,
auth_time,
ip_address
) VALUES (
V('APP_USER'),
SYSTIMESTAMP,
OWA_UTIL.GET_CGI_ENV('REMOTE_ADDR')
);
-- Load user preferences
SELECT user_id INTO l_user_id
FROM app_users
WHERE username = V('APP_USER');
APEX_UTIL.SET_SESSION_STATE('USER_ID', l_user_id);
COMMIT;
END post_auth_procedure;
Logout URL Configuration:
-- Logout URL in authentication scheme settings
apex_authentication.logout?p_app_id=&APP_ID.&p_session_id=&SESSION.
-- Custom logout procedure to clear external session
CREATE OR REPLACE PROCEDURE custom_logout
IS
l_response CLOB;
BEGIN
-- Call external logout endpoint (MAKE_REST_REQUEST is a function, so capture its return value)
l_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
p_url => 'https://auth.company.com/logout',
p_http_method => 'POST',
p_body => JSON_OBJECT('session_id' VALUE V('APP_SESSION'))
);
-- Clear APEX session
APEX_AUTHENTICATION.LOGOUT(
p_session_id => V('APP_SESSION'),
p_app_id => V('APP_ID')
);
END custom_logout;
Session Verification (sentry function):
CREATE OR REPLACE FUNCTION session_verify
RETURN BOOLEAN
IS
l_valid BOOLEAN := FALSE;
l_last_activity DATE;
BEGIN
-- Check session timeout (30 minutes)
-- (standalone objects cannot use :APP_USER binds, so read session context via V())
SELECT MAX(created_on) INTO l_last_activity
FROM apex_workspace_activity_log
WHERE apex_user = V('APP_USER')
AND apex_session_id = V('APP_SESSION');
-- MAX() returns NULL rather than raising NO_DATA_FOUND when no rows match
IF l_last_activity IS NOT NULL
AND l_last_activity > SYSDATE - (30/1440) THEN
l_valid := TRUE;
END IF;
RETURN l_valid;
END session_verify;
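Before wiring the scheme into APEX, the external validation endpoint can be exercised directly to confirm it returns the JSON fields custom_auth_function expects (authenticated, email, role). A sketch using Python's requests library; the URL and credentials are placeholders:
import requests

# Placeholder endpoint and test credentials matching what custom_auth_function sends
resp = requests.post(
    'https://auth.company.com/validate',
    json={'username': 'test.user', 'password': 'example-password'},
    timeout=10,
)
resp.raise_for_status()
body = resp.json()
# The PL/SQL function reads these keys from the response
print(body.get('authenticated'), body.get('email'), body.get('role'))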
Web Application Security Headers Configuration
Problem
Static HTML websites lack built-in security headers that protect against common web vulnerabilities. Without proper HTTP security headers, sites remain vulnerable to clickjacking attacks, cross-site scripting (XSS), MIME type confusion, and other browser-based exploits. Default web server configurations rarely include comprehensive security headers. Sites without HTTPS enforcement allow man-in-the-middle attacks. Missing Content Security Policy (CSP) enables injection attacks. Absent referrer policies leak sensitive URL information to third parties. Security scanning tools flag missing headers as critical vulnerabilities.
Solution
Configure comprehensive HTTP security headers using Apache .htaccess file to protect static HTML websites. Implement X-Frame-Options to prevent clickjacking, X-Content-Type-Options to block MIME sniffing, X-XSS-Protection for legacy browser XSS filters, Referrer-Policy to control information leakage, Permissions-Policy to disable unnecessary browser features, Content-Security-Policy to restrict resource loading, and Strict-Transport-Security (HSTS) to enforce HTTPS. Force HTTPS redirection for all traffic. These headers provide defense-in-depth against common web attacks without requiring application code changes.
Implementation
Create .htaccess File:
# Security Headers
# HTTP Strict Transport Security (HSTS)
Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
# Prevent clickjacking
Header always set X-Frame-Options "SAMEORIGIN"
# Prevent MIME type sniffing
Header always set X-Content-Type-Options "nosniff"
# Enable XSS filter
Header always set X-XSS-Protection "1; mode=block"
# Control referrer information
Header always set Referrer-Policy "strict-origin-when-cross-origin"
# Disable unnecessary features
Header always set Permissions-Policy "geolocation=(), microphone=(), camera=()"
# Content Security Policy
Header always set Content-Security-Policy "default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; font-src 'self'; connect-src 'self'; frame-ancestors 'self';"
# Force HTTPS
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
# DirectoryIndex
DirectoryIndex index.html index.htm
Security Header Explanations:
- Strict-Transport-Security (HSTS): Forces browsers to only connect via HTTPS for specified duration (1 year). Prevents downgrade attacks and insecure connections.
- X-Frame-Options: Prevents site from being embedded in iframes on other domains. Blocks clickjacking attacks where attackers overlay transparent frames to capture clicks.
- X-Content-Type-Options: Prevents browsers from MIME-sniffing responses away from declared content-type. Stops malicious content execution through type confusion.
- X-XSS-Protection: Enables browser's built-in XSS filter (legacy browsers). Detects and blocks reflected cross-site scripting attempts.
- Referrer-Policy: Controls what referrer information is sent with requests. Prevents leaking sensitive URL parameters to external sites.
- Permissions-Policy: Disables browser features not needed by the site (geolocation, camera, microphone). Reduces attack surface.
- Content-Security-Policy: Restricts sources from which content can be loaded. Prevents injection attacks by controlling script, style, and resource origins.
Verification Steps:
# 1. Upload .htaccess to web server root directory (same location as index.html)
# 2. Test HTTPS enforcement
# Visit: http://yourdomain.com
# Should automatically redirect to: https://yourdomain.com
# 3. Verify headers using curl
curl -I https://yourdomain.com
# Expected output includes:
# strict-transport-security: max-age=31536000; includeSubDomains; preload
# x-frame-options: SAMEORIGIN
# x-content-type-options: nosniff
# x-xss-protection: 1; mode=block
# referrer-policy: strict-origin-when-cross-origin
# permissions-policy: geolocation=(), microphone=(), camera=()
# content-security-policy: default-src 'self'; script-src 'self'; ...
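The same header check can be scripted for several sites or run from CI. A minimal Python sketch using the requests library; the domain is a placeholder:
import requests

EXPECTED_HEADERS = [
    'strict-transport-security',
    'x-frame-options',
    'x-content-type-options',
    'referrer-policy',
    'permissions-policy',
    'content-security-policy',
]

def check_security_headers(url):
    """Report which of the expected security headers the site returns."""
    response = requests.get(url, timeout=10)
    present = {name.lower() for name in response.headers}
    for header in EXPECTED_HEADERS:
        status = 'OK     ' if header in present else 'MISSING'
        print(f'{status} {header}')

check_security_headers('https://yourdomain.com')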
Online Security Scanning:
# Run SSL/TLS configuration test
https://www.ssllabs.com/ssltest/
# Target Grade: A or A+
# Run HTTP security headers test
https://securityheaders.com
# Target Grade: A or A+
# Check for common vulnerabilities
https://observatory.mozilla.org
# Target Score: 90+
Content Security Policy Customization:
# Allow inline styles (required for many sites)
style-src 'self' 'unsafe-inline';
# Allow external fonts (Google Fonts, etc.)
font-src 'self' https://fonts.gstatic.com;
# Allow external scripts (CDN resources)
script-src 'self' https://cdn.example.com;
# Allow images from data URIs and external domains
img-src 'self' data: https:;
# Report CSP violations to monitoring endpoint
report-uri https://yourdomain.com/csp-report;
Common Issues and Fixes:
- CSP blocking inline styles: Add 'unsafe-inline' to style-src or move styles to external CSS files
- Mixed content warnings: Ensure all resources (images, scripts, CSS) load via HTTPS
- HSTS preventing HTTP access: Clear browser HSTS cache or wait for max-age expiration
- .htaccess not applying: Verify mod_headers and mod_rewrite are enabled in Apache configuration
- Headers not showing in curl: Check that .htaccess is in correct directory and has proper file permissions
Network Forensics & Analysis
IRC Protocol Analysis and Traffic Reconstruction
Problem
Internet Relay Chat (IRC) remains a communication protocol used in various contexts including legitimate technical communities and potential security incidents. When investigating network traffic, analysts need to extract and reconstruct IRC conversations from packet captures to understand communication patterns, identify participants, and analyze message content. Raw packet captures contain fragmented TCP streams that must be reassembled and parsed according to IRC protocol specifications. Without proper analysis techniques, critical communication details remain hidden in binary network data.
Solution
Use Wireshark's protocol analysis capabilities to filter, follow, and extract IRC traffic from packet captures. Apply display filters to isolate IRC traffic (typically port 6667 for plaintext, 6697 for SSL), use TCP stream following to reconstruct complete conversations, and parse IRC protocol commands (JOIN, PRIVMSG, NICK, MODE) to identify participants and message content. Extract server information from connection handshakes and MOTD sequences. Use Wireshark's statistics features to identify communication patterns and timeline analysis.
Implementation
Filter IRC Traffic:
# Standard IRC ports
tcp.port == 6667
# SSL/TLS IRC
tcp.port == 6697
# Follow TCP stream for complete conversation
Right-click packet → Follow → TCP Stream
Extract User Information:
# IRC user format: :nickname!username@hostname
# Extract from JOIN, PRIVMSG, NICK commands
# Example IRC protocol messages:
:user!~user@host.example.com JOIN #channel
:user!~user@host.example.com PRIVMSG #channel :message content
:user!~user@host.example.com NICK newnickname
Python Script for IRC Log Parsing:
import re

def extract_irc_users(log_content):
    """Extract unique usernames from IRC log"""
    users = set()
    # Match IRC user format: :nickname!username@hostname
    pattern = r'^:([^!]+)!'
    for line in log_content.split('\n'):
        match = re.match(pattern, line)
        if match:
            users.add(match.group(1))
    return sorted(users)

def parse_irc_messages(log_content):
    """Parse PRIVMSG commands from IRC log"""
    messages = []
    for line in log_content.split('\n'):
        if 'PRIVMSG' in line:
            # Extract sender, channel/target, and message
            match = re.match(r':([^!]+)!.*PRIVMSG ([^ ]+) :(.*)', line)
            if match:
                messages.append({
                    'sender': match.group(1),
                    'target': match.group(2),
                    'message': match.group(3)
                })
    return messages
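Example usage of the two helpers above, reading a saved "Follow TCP Stream" export (the filename is a placeholder):
# Summarize a saved TCP stream export
with open('irc_stream.txt', 'r', errors='ignore') as f:
    log = f.read()
print('Participants:', extract_irc_users(log))
for msg in parse_irc_messages(log):
    print(f"{msg['sender']} -> {msg['target']}: {msg['message']}")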
Identify IRC Server:
# Look for server identification in numeric replies:
# 001 RPL_WELCOME - Welcome message with server name
# 004 RPL_MYINFO - Server name, version, user/channel modes
# 375-376 - MOTD (Message of the Day)
# Wireshark filter for connection handshake:
irc and (frame contains "001" or frame contains "004")
Encrypted Traffic Analysis and Decryption
Problem
Network traffic often contains encrypted messages using PGP/GPG, SSL/TLS, or other cryptographic protocols. During security investigations or forensic analysis, encrypted content must be identified, extracted, and potentially decrypted to understand communication content. PGP-encrypted messages appear in network streams as Base64-encoded blocks that require proper key material for decryption. Without identifying encryption methods and locating corresponding keys, encrypted communications remain opaque to analysis.
Solution
Identify encrypted content by recognizing PGP message headers, certificate exchanges, and encryption protocol patterns in packet captures. Extract encrypted message blocks from network streams or file transfers. Use GPG/OpenPGP tools to decrypt messages when private keys are available. For SSL/TLS traffic, use session keys or private certificates to decrypt HTTPS communications in Wireshark. Analyze certificate chains to understand trust relationships and identify certificate authorities.
Implementation
Identify PGP Encrypted Content:
# PGP message markers in traffic:
-----BEGIN PGP MESSAGE-----
-----BEGIN PGP PRIVATE KEY BLOCK-----
-----BEGIN PGP PUBLIC KEY BLOCK-----
# Wireshark search for PGP content:
frame contains "BEGIN PGP"
# Extract from HTTP, FTP, or other protocols
http.file_data contains "PGP MESSAGE"
Extract and Decrypt PGP Messages:
# Save encrypted message to file
echo "-----BEGIN PGP MESSAGE-----
...encrypted content...
-----END PGP MESSAGE-----" > encrypted.asc
# Import private key
gpg --import private_key.asc
# Decrypt message
gpg --decrypt encrypted.asc > decrypted.txt
# Or using Python
import gnupg
gpg = gnupg.GPG()
gpg.import_keys(key_data)
decrypted = gpg.decrypt(encrypted_data)
print(decrypted.data.decode('utf-8'))
SSL/TLS Decryption in Wireshark:
# Using pre-master secret (modern approach):
# 1. Set SSLKEYLOGFILE environment variable before running browser
# 2. In Wireshark: Edit → Preferences → Protocols → TLS
# 3. Set "(Pre)-Master-Secret log filename" to the key log file
# Using RSA private key (legacy):
# Edit → Preferences → Protocols → TLS → RSA keys list
# Add: IP, Port, Protocol, Key file path
Certificate Chain Analysis:
# View TLS handshake details:
tls.handshake.type == 11 # Certificate message
# Extract certificate from packet:
# Expand: Transport Layer Security → Handshake Protocol → Certificate
# Right-click → Export Packet Bytes
# Analyze certificate with OpenSSL:
openssl x509 -in cert.der -inform DER -text -noout
# Shows: Issuer, Subject, Validity dates, Key usage, etc.
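When the server is still reachable, the certificate it currently presents can also be pulled with Python's standard ssl module and compared against what appears in the capture; the hostname is a placeholder:
import socket
import ssl

host, port = 'www.example.com', 443
context = ssl.create_default_context()
with socket.create_connection((host, port), timeout=10) as sock:
    with context.wrap_socket(sock, server_hostname=host) as tls:
        cert = tls.getpeercert()  # parsed dict; chain validation already performed
        print('Subject:', cert.get('subject'))
        print('Issuer:', cert.get('issuer'))
        print('Valid until:', cert.get('notAfter'))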
DNS Analysis and Domain Investigation
Problem
DNS queries and responses contain critical infrastructure information including domain-to-IP mappings, nameserver configurations, and service discovery details. During network investigations, analysts must identify what domains were queried, what IP addresses were returned, and what nameservers are authoritative for specific domains. DNS traffic analysis reveals communication patterns, identifies malicious domains, and maps network infrastructure. Without proper DNS analysis, understanding of network topology and external connections remains incomplete.
Solution
Use Wireshark to filter and analyze DNS traffic, examining both queries and responses. Extract A records (IPv4 addresses), AAAA records (IPv6 addresses), NS records (nameservers), SOA records (zone authority), and other DNS record types. Correlate DNS responses with subsequent network connections to verify IP address usage. Use external tools (dig, nslookup) to verify current DNS configurations and compare with captured traffic. Identify DNS anomalies including fast-flux domains, DGA patterns, or DNS tunneling attempts.
Implementation
Filter DNS Traffic:
# All DNS traffic
dns
# DNS queries only
dns.flags.response == 0
# DNS responses only
dns.flags.response == 1
# Query for specific domain
dns.qry.name contains "example.com"
# Specific record types
dns.qry.type == 1 # A record (IPv4)
dns.qry.type == 2 # NS record (nameserver)
dns.qry.type == 6 # SOA record (Start of Authority)
dns.qry.type == 28 # AAAA record (IPv6)
Extract Domain-to-IP Mappings:
# Wireshark Statistics → Resolved Addresses
# Shows all DNS name resolutions in capture
# Export DNS queries and responses:
tshark -r capture.pcap -Y "dns" -T fields \
-e dns.qry.name -e dns.a -e dns.aaaa \
| sort -u
# Output format: domain, IPv4, IPv6
Identify Authoritative Nameservers:
# Filter for SOA records in responses:
dns.resp.type == 6
# SOA record contains:
# - Primary nameserver (MNAME)
# - Responsible party email
# - Serial number, refresh, retry, expire timers
# Extract primary nameserver from packet details:
# Domain Name System → Answers → Start of Authority
# Primary name server: ns1.example.com
DNS Query Analysis Script:
#!/usr/bin/env python3
import pyshark

def analyze_dns_traffic(pcap_file):
    """Extract DNS queries and responses from pcap"""
    capture = pyshark.FileCapture(
        pcap_file,
        display_filter='dns'
    )
    dns_records = []
    for packet in capture:
        if hasattr(packet, 'dns'):
            dns = packet.dns
            # Extract query information
            if hasattr(dns, 'qry_name'):
                query = {
                    'query': dns.qry_name,
                    'type': dns.qry_type if hasattr(dns, 'qry_type') else None,
                    'response_ip': dns.a if hasattr(dns, 'a') else None,
                    'timestamp': packet.sniff_time
                }
                dns_records.append(query)
    capture.close()
    return dns_records

# Usage
records = analyze_dns_traffic('capture.pcap')
for record in records:
    print(f"{record['query']} → {record['response_ip']}")
Verify DNS Configuration with External Tools:
# Query A record
dig example.com A
# Query specific nameserver
dig @8.8.8.8 example.com
# Query SOA record for authoritative server
dig example.com SOA
# Trace DNS resolution path
dig +trace example.com
# Check all record types
dig example.com ANY
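To cross-check a captured answer against what the analysis host resolves today, a minimal sketch using Python's standard socket module (the captured IP below is a placeholder noted from the pcap):
import socket

def resolve(domain):
    """Return the unique addresses the local resolver currently returns for a domain."""
    infos = socket.getaddrinfo(domain, None)
    return sorted({info[4][0] for info in infos})

captured_ip = '192.0.2.10'  # placeholder value noted from the capture
current_ips = resolve('example.com')
print('Current answers:', current_ips)
print('Captured IP still resolves:', captured_ip in current_ips)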
Network Connection Pattern Analysis
Problem
Understanding which hosts communicated with each other, traffic volumes, and connection patterns is essential for network forensics. Raw packet captures contain thousands of individual packets that must be aggregated into meaningful conversations and statistics. Identifying high-volume connections, unusual traffic patterns, or connections to suspicious IP addresses requires analysis tools that summarize packet-level data into connection-level insights. Manual packet inspection is impractical for captures containing millions of packets.
Solution
Use Wireshark's Statistics features to aggregate packet data into conversation summaries, protocol hierarchies, and endpoint statistics. Analyze conversations to identify source-destination pairs, traffic volumes, and connection durations. Use protocol hierarchy statistics to understand traffic composition. Export conversation data to CSV for additional analysis in spreadsheet tools or scripts. Apply GeoIP databases to identify geographic locations of IP addresses. Correlate connection patterns with timestamps to build activity timelines.
Implementation
View Network Conversations:
# Wireshark: Statistics → Conversations
# Shows all source-destination pairs with:
# - Packet count
# - Byte count
# - Duration
# - Relative start time
# Tabs available:
# - Ethernet (MAC addresses)
# - IPv4 (IP addresses)
# - IPv6
# - TCP (includes ports)
# - UDP (includes ports)
# Export to CSV for analysis:
# Copy button → Paste into spreadsheet or text file
Identify High-Traffic Connections:
# Filter by IP address after identifying in conversations:
ip.addr == 192.168.1.100
# Filter specific conversation:
ip.addr == 192.168.1.100 && ip.addr == 10.0.0.1
# Sort conversations by:
# - Packets (click Packets column header)
# - Bytes (click Bytes column header)
# - Duration (click Duration column header)
Protocol Distribution Analysis:
# Wireshark: Statistics → Protocol Hierarchy
# Shows percentage breakdown of protocols:
# - Ethernet
# - IPv4
# - TCP (80% of traffic)
# - HTTP
# - HTTPS
# - UDP (20% of traffic)
# - DNS
# - DHCP
# Identify unusual protocols or unexpected traffic
Extract IP Addresses for Bulk Analysis:
# Export unique destination IPs from specific source:
tshark -r capture.pcap \
-Y "ip.src == 10.0.0.7" \
-T fields -e ip.dst \
| sort -u > destination_ips.txt
# Bulk GeoIP lookup (using online service or local database)
# Identify country, city, organization for each IP
# Example Python script for bulk IP analysis:
import requests

def geolocate_ips(ip_list):
    """Get geographic location for list of IPs"""
    results = []
    for ip in ip_list:
        # Using ipapi.co as example (rate limited)
        response = requests.get(f'https://ipapi.co/{ip}/json/')
        if response.status_code == 200:
            data = response.json()
            results.append({
                'ip': ip,
                'country': data.get('country_name'),
                'city': data.get('city'),
                'org': data.get('org')
            })
    return results
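Example usage with the destination_ips.txt file produced by the tshark command above; mind the lookup service's rate limits:
# Load the exported IP list and geolocate each address
with open('destination_ips.txt') as f:
    ips = [line.strip() for line in f if line.strip()]
for entry in geolocate_ips(ips):
    print(f"{entry['ip']}: {entry['country']}, {entry['city']} ({entry['org']})")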
Timeline Analysis:
# Statistics → I/O Graph
# Visualize traffic over time
# X-axis: Time
# Y-axis: Packets, bytes, or advanced metrics
# Create filters for different traffic types:
# Graph 1: tcp (TCP traffic - blue)
# Graph 2: udp (UDP traffic - red)
# Graph 3: dns (DNS traffic - green)
# Identify traffic spikes, patterns, or anomalies
Endpoint Statistics:
# Statistics → Endpoints → IPv4
# Lists all IP addresses in capture with:
# - Total packets sent/received
# - Total bytes sent/received
# - Geographic location (if GeoIP database configured)
# Identify most active hosts
# Sort by Packets or Bytes columns
File Extraction from Network Traffic
Problem
Network traffic often includes file transfers via HTTP, FTP, SMB, or other protocols. During forensic investigations, analysts need to extract these files from packet captures to analyze their content, identify malware, or reconstruct incident timelines. Files are fragmented across multiple packets and protocols, requiring reassembly of TCP streams and protocol-specific parsing. Binary file data mixed with protocol headers must be separated and reconstructed into usable files.
Solution
Use Wireshark's file export capabilities to extract transferred files from HTTP, SMB, and other protocols. Follow TCP streams to manually extract file content from FTP data connections. Use Wireshark's "Export Objects" feature for automatic file extraction from HTTP traffic. For encrypted transfers, decrypt the traffic first before attempting file extraction. Verify extracted files using hash values and file headers. Use protocol-specific tools (like NetworkMiner) for automated file extraction across multiple protocols.
Implementation
Extract HTTP Objects:
# File → Export Objects → HTTP
# Shows all HTTP file transfers:
# - Hostname
# - Content Type
# - Size
# - Filename
# Select files to save
# Click "Save" or "Save All"
# Filter exported objects:
# - Text filter box to search by filename
# - Content Type column to filter by MIME type
Extract Files from FTP:
# FTP uses separate control (port 21) and data connections
# Control connection shows commands:
ftp
# Find RETR (retrieve) or STOR (store) commands
# Note the filename being transferred
# Filter for FTP data:
ftp-data
# Follow TCP Stream of data connection
# Save raw data (Show and save data as: Raw)
# Save to file with appropriate extension
Extract Files from SMB/CIFS:
# File → Export Objects → SMB
# Shows files transferred via Windows file sharing
# Protocol hierarchy shows SMB traffic:
# Statistics → Protocol Hierarchy → SMB2/SMB
Verify Extracted Files:
# Calculate hash of extracted file
sha256sum extracted_file.bin
# Check file type (Linux/Mac)
file extracted_file.bin
# Check for malware (if suspicious)
# Upload to VirusTotal or scan with antivirus
# Examine file headers
hexdump -C extracted_file.bin | head -20
# Common file signatures:
# PDF: %PDF
# PNG: 89 50 4E 47
# ZIP: 50 4B 03 04
# JPEG: FF D8 FF
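The hash and signature checks can be automated for a folder of extracted objects. A small Python sketch whose signature table mirrors the list above (the folder name is a placeholder):
import hashlib
from pathlib import Path

SIGNATURES = {
    b'%PDF': 'PDF',
    b'\x89PNG': 'PNG',
    b'PK\x03\x04': 'ZIP',
    b'\xff\xd8\xff': 'JPEG',
}

def identify(path):
    """Return (sha256, guessed type) based on the file's magic bytes."""
    data = Path(path).read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return digest, name
    return digest, 'unknown'

for item in Path('exported_objects').iterdir():  # placeholder folder of extracted files
    if item.is_file():
        sha, kind = identify(item)
        print(f'{item.name}: {kind} sha256={sha[:16]}...')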
Database & Development
SQL Metadata Queries for Dynamic Schema Discovery
Problem
Applications often need to discover database schema information programmatically for dynamic form generation, data validation, or automated documentation. Hard-coding table and column names throughout application code creates maintenance overhead when schema changes occur. Without metadata queries, developers manually track column data types, constraints, and relationships, leading to errors when schema evolves. Applications cannot adapt to schema changes without code modifications. Dynamic report builders, data import tools, and admin interfaces require runtime access to table structures but lack mechanisms to query database metadata efficiently.
Solution
Use Oracle data dictionary views (USER_TAB_COLUMNS, ALL_TAB_COLUMNS, USER_CONSTRAINTS, USER_INDEXES) to query schema metadata programmatically. Build reusable PL/SQL functions that return table structures, column definitions, and constraint information. Create dynamic SQL generators that adapt to schema changes automatically. Implement column data type validation using metadata rather than hard-coded type checks. Use metadata queries to generate interactive reports, form fields, and API documentation. Cache frequently accessed metadata to minimize dictionary query overhead while refreshing when DDL changes occur.
Implementation
Query Table Columns:
-- Get all columns for a table with data types
SELECT
column_name,
data_type,
data_length,
data_precision,
data_scale,
nullable,
data_default
FROM user_tab_columns
WHERE table_name = 'MY_TABLE'
ORDER BY column_id;
-- Get columns across schemas
SELECT
owner,
table_name,
column_name,
data_type
FROM all_tab_columns
WHERE table_name = 'EMPLOYEES'
AND owner IN ('HR_SCHEMA', 'PAYROLL_SCHEMA');
Dynamic Column List Function:
CREATE OR REPLACE FUNCTION get_table_columns (
p_table_name IN VARCHAR2,
p_delimiter IN VARCHAR2 DEFAULT ','
) RETURN VARCHAR2
IS
l_columns VARCHAR2(4000);
BEGIN
SELECT LISTAGG(column_name, p_delimiter)
WITHIN GROUP (ORDER BY column_id)
INTO l_columns
FROM user_tab_columns
WHERE table_name = UPPER(p_table_name);
RETURN l_columns;
END get_table_columns;
-- Usage: Generate and open a SELECT statement dynamically
DECLARE
l_sql VARCHAR2(4000);
l_cursor SYS_REFCURSOR;
BEGIN
l_sql := 'SELECT ' || get_table_columns('USERS') ||
' FROM users WHERE status = ''ACTIVE''';
OPEN l_cursor FOR l_sql;
-- Fetch and process rows as needed, then close the cursor
CLOSE l_cursor;
END;
Column Data Type Validation:
CREATE OR REPLACE FUNCTION validate_column_type (
p_table_name IN VARCHAR2,
p_column_name IN VARCHAR2,
p_expected_type IN VARCHAR2
) RETURN BOOLEAN
IS
l_actual_type VARCHAR2(106);
BEGIN
SELECT data_type INTO l_actual_type
FROM user_tab_columns
WHERE table_name = UPPER(p_table_name)
AND column_name = UPPER(p_column_name);
RETURN UPPER(l_actual_type) = UPPER(p_expected_type);
EXCEPTION
WHEN NO_DATA_FOUND THEN
RETURN FALSE;
END validate_column_type;
-- Usage in dynamic form validation
IF NOT validate_column_type('RECORDS', 'START_DATE', 'DATE') THEN
RAISE_APPLICATION_ERROR(-20001, 'Invalid column type for START_DATE');
END IF;
Query Primary Key Constraints:
-- Find primary key columns for a table
SELECT
acc.column_name,
acc.position
FROM all_constraints ac
JOIN all_cons_columns acc ON ac.constraint_name = acc.constraint_name
WHERE ac.table_name = 'USER_OPERATIONS'
AND ac.constraint_type = 'P'
ORDER BY acc.position;
Query Foreign Key Relationships:
-- Find all foreign keys referencing a table
SELECT
ac.constraint_name,
ac.table_name as child_table,
acc.column_name as child_column,
ac_pk.table_name as parent_table,
acc_pk.column_name as parent_column
FROM all_constraints ac
JOIN all_cons_columns acc ON ac.constraint_name = acc.constraint_name
JOIN all_constraints ac_pk ON ac.r_constraint_name = ac_pk.constraint_name
JOIN all_cons_columns acc_pk ON ac_pk.constraint_name = acc_pk.constraint_name
WHERE ac_pk.table_name = 'MAIN_RECORDS'
AND ac.constraint_type = 'R';
Generate CREATE TABLE Statement:
CREATE OR REPLACE FUNCTION generate_create_table (
p_table_name IN VARCHAR2
) RETURN CLOB
IS
l_sql CLOB;
BEGIN
l_sql := 'CREATE TABLE ' || p_table_name || ' (' || CHR(10);
FOR col IN (
SELECT
column_name,
data_type,
data_length,
data_precision,
data_scale,
nullable,
column_id
FROM user_tab_columns
WHERE table_name = UPPER(p_table_name)
ORDER BY column_id
) LOOP
IF col.column_id > 1 THEN
l_sql := l_sql || ',' || CHR(10);
END IF;
l_sql := l_sql || ' ' || col.column_name || ' ';
-- Add data type
IF col.data_type IN ('VARCHAR2', 'CHAR') THEN
l_sql := l_sql || col.data_type || '(' || col.data_length || ')';
ELSIF col.data_type = 'NUMBER' AND col.data_precision IS NOT NULL THEN
l_sql := l_sql || col.data_type || '(' || col.data_precision;
IF col.data_scale IS NOT NULL AND col.data_scale > 0 THEN
l_sql := l_sql || ',' || col.data_scale;
END IF;
l_sql := l_sql || ')';
ELSE
l_sql := l_sql || col.data_type;
END IF;
-- Add nullable constraint
IF col.nullable = 'N' THEN
l_sql := l_sql || ' NOT NULL';
END IF;
END LOOP;
l_sql := l_sql || CHR(10) || ');';
RETURN l_sql;
END generate_create_table;
Dynamic APEX LOV Query:
-- Create list of values dynamically from any table
CREATE OR REPLACE FUNCTION get_lov_query (
p_table_name IN VARCHAR2,
p_display_column IN VARCHAR2,
p_return_column IN VARCHAR2
) RETURN VARCHAR2
IS
BEGIN
RETURN 'SELECT ' || p_display_column || ' d, ' ||
p_return_column || ' r FROM ' || p_table_name ||
' ORDER BY ' || p_display_column;
END get_lov_query;
-- Use in APEX LOV definition
-- Type: PL/SQL Function Body returning SQL Query
-- Function Body: RETURN get_lov_query('DEPARTMENTS', 'DEPT_NAME', 'DEPT_ID');
Metadata Cache Table:
CREATE TABLE metadata_cache (
table_name VARCHAR2(128),
column_name VARCHAR2(128),
data_type VARCHAR2(106),
is_nullable VARCHAR2(1),
cached_date TIMESTAMP DEFAULT SYSTIMESTAMP,
PRIMARY KEY (table_name, column_name)
);
-- Refresh procedure
CREATE OR REPLACE PROCEDURE refresh_metadata_cache
IS
BEGIN
DELETE FROM metadata_cache;
INSERT INTO metadata_cache (
table_name, column_name, data_type, is_nullable
)
SELECT
table_name,
column_name,
data_type,
nullable
FROM user_tab_columns;
COMMIT;
END refresh_metadata_cache;
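The same dictionary views can be read from application code outside the database, for example to feed an external documentation or code-generation tool. A sketch using the python-oracledb driver; the connection details and table name are placeholders:
import oracledb

# Thin-mode connection; user, password, and DSN are placeholders
conn = oracledb.connect(user='app_user', password='app_password', dsn='dbhost/ORCLPDB1')
with conn.cursor() as cur:
    cur.execute(
        """SELECT column_name, data_type, nullable
           FROM user_tab_columns
           WHERE table_name = :t
           ORDER BY column_id""",
        {'t': 'MY_TABLE'},
    )
    for column_name, data_type, nullable in cur:
        print(f'{column_name}: {data_type} (nullable={nullable})')
conn.close()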
Migration & DevOps Automation
Move Files Exceeding Path Length Limit (Version Control Migration)
Problem
Windows imposes a 260-character MAX_PATH limit on full file paths, and Git tooling on Windows inherits it. During version control system migrations, files with deeply nested directory structures or long filenames exceed this limit, causing checkout and push failures on Windows clients and blocking repository migration. Manual identification and reorganization of these files is time-consuming and error-prone across repositories containing thousands of files. Without automated path length detection, migration projects stall on edge cases discovered late in the process.
Solution
Create a PowerShell script that recursively scans the repository, identifies files exceeding the 260-character path limit, and moves them to a designated folder while preserving their original directory structure. This isolates problematic files for manual review and restructuring without blocking the main migration workflow. The script provides detailed logging of moved files including their original path lengths. After migration, relocated files can be restructured with shorter paths and reintegrated into the repository.
Implementation
# Move files with long paths while preserving directory structure
# Windows MAX_PATH limit: 260 characters for the full path
param(
[string]$SourcePath = ".",
[string]$DestPath = ".\LongPaths",
[int]$MaxPathLength = 260
)
# Get absolute source path for relative path calculation
$sourceFullPath = (Resolve-Path $SourcePath).Path
# Find and move files exceeding path limit
Get-ChildItem -Path $SourcePath -Recurse -File | Where-Object {
$_.FullName.Length -gt $MaxPathLength
} | ForEach-Object {
$file = $_
$relativePath = $file.FullName.Substring($sourceFullPath.Length + 1)
$destFile = Join-Path $DestPath $relativePath
$destDir = Split-Path $destFile -Parent
# Create destination directory structure
if (-not (Test-Path $destDir)) {
New-Item -ItemType Directory -Path $destDir -Force | Out-Null
}
Write-Host "Moving: $relativePath (Length: $($file.FullName.Length))"
Move-Item -Path $file.FullName -Destination $destFile -Force
}
Write-Host "Migration complete. Long-path files moved to: $DestPath"
Usage:
# Run from repository root
.\Move-LongPaths.ps1
# Custom parameters
.\Move-LongPaths.ps1 -SourcePath "C:\Projects\MyApp" -DestPath "C:\LongPaths" -MaxPathLength 250
Remove Archive Extensions from Filenames
Problem
Legacy version control systems sometimes append archive suffixes to file revisions. When migrating to Git, these archive suffixes persist in filenames, creating confusion about which files are current versus historical. The archive suffix serves no purpose in Git's version control model and clutters the repository namespace. Manually renaming hundreds or thousands of files across multiple repositories is impractical and introduces human error risk.
Solution
Implement a PowerShell script that recursively identifies all files with archive suffixes in their names and removes these suffixes while preserving the rest of the filename and extension. The script operates on the file system before Git commits, ensuring clean filenames enter the repository from the initial import. Logging output shows each renamed file for audit purposes. The script handles edge cases including files with multiple hyphens and validates that removals don't create filename collisions.
Implementation
# Remove archive suffix from filenames
# Handles files like "document-arc.txt" -> "document.txt"
param(
[string]$Path = ".",
[string]$Suffix = "-arc",
[switch]$WhatIf
)
Get-ChildItem -Path $Path -Recurse -File | Where-Object {
$_.BaseName.EndsWith($Suffix)
} | ForEach-Object {
$file = $_
# Strip the suffix from the end of the base name only, keeping the extension intact
$newName = $file.BaseName.Substring(0, $file.BaseName.Length - $Suffix.Length) + $file.Extension
$newPath = Join-Path $file.DirectoryName $newName
# Check for naming collision
if (Test-Path $newPath) {
Write-Warning "Collision detected: $newPath already exists. Skipping $($file.Name)"
} else {
Write-Host "Renaming: $($file.Name) -> $newName"
if (-not $WhatIf) {
Rename-Item -Path $file.FullName -NewName $newName
}
}
}
if ($WhatIf) {
Write-Host "`nDry run complete. Use without -WhatIf to apply changes."
}
Usage:
# Preview changes without modifying files
.\Remove-ArchiveSuffix.ps1 -WhatIf
# Execute renaming
.\Remove-ArchiveSuffix.ps1
# Target specific directory with custom suffix
.\Remove-ArchiveSuffix.ps1 -Path "C:\Projects\Legacy" -Suffix "-old"
Move Large Files for Git LFS Processing
Problem
GitHub repositories have a 100MB file size limit. Files exceeding this limit cause push failures and block repository migration. Large binary files (videos, datasets, compiled artifacts, installers) are common in legacy repositories but incompatible with standard Git workflows. Git Large File Storage (LFS) is required for files over 100MB, but identifying which files need LFS treatment across thousands of repository files is time-intensive. Manual inspection misses edge cases and delays migration timelines.
Solution
Create a PowerShell script that scans the repository recursively, identifies files exceeding 100MB, and moves them to a designated folder for Git LFS configuration or alternate storage strategies. The script preserves directory structure and provides detailed size reporting. After relocation, administrators can evaluate whether files should use Git LFS, remain in external storage, or be excluded from version control. This separates the large file problem from the main migration workflow, preventing 100MB+ files from blocking entire repository imports.
Implementation
# Move files larger than 100MB (GitHub limit)
# Preserves directory structure for Git LFS setup
param(
[string]$SourcePath = ".",
[string]$DestPath = ".\LargeFiles",
[int]$MaxSizeMB = 100
)
$maxSizeBytes = $MaxSizeMB * 1MB
$sourceFullPath = (Resolve-Path $SourcePath).Path
Get-ChildItem -Path $SourcePath -Recurse -File | Where-Object {
$_.Length -gt $maxSizeBytes
} | ForEach-Object {
$file = $_
$relativePath = $file.FullName.Substring($sourceFullPath.Length + 1)
$destFile = Join-Path $DestPath $relativePath
$destDir = Split-Path $destFile -Parent
# Create destination directory structure
if (-not (Test-Path $destDir)) {
New-Item -ItemType Directory -Path $destDir -Force | Out-Null
}
$sizeMB = [math]::Round($file.Length / 1MB, 2)
Write-Host "Moving: $relativePath (Size: $sizeMB MB)"
Move-Item -Path $file.FullName -Destination $destFile -Force
}
Write-Host "`nLarge files moved to: $DestPath"
Write-Host "Configure Git LFS or alternate storage for these files."
Usage:
# Run from repository root
.\Move-LargeFiles.ps1
# Custom size threshold (e.g., 50MB)
.\Move-LargeFiles.ps1 -MaxSizeMB 50
# Custom paths
.\Move-LargeFiles.ps1 -SourcePath "C:\VCS-Export" -DestPath "C:\LFS-Staging"
Post-Migration Git LFS Setup:
# Install Git LFS
git lfs install
# Track large file types
git lfs track "*.iso"
git lfs track "*.mp4"
git lfs track "*.zip"
# Add .gitattributes and large files back to repository
git add .gitattributes
git add path/to/large/files/
git commit -m "Add large files via Git LFS"
GitHub Enterprise Pull Request Template Configuration
Problem
Development teams working with mixed technology stacks require different pull request templates for different types of changes. Standard code changes need basic testing checklists and change descriptions, while specialized application changes require validation covering page modifications, database objects, SQL security, session state management, and deployment procedures. Using a single generic template forces developers to delete irrelevant sections or ignore inapplicable checklist items, reducing template effectiveness and creating inconsistent PR documentation. GitHub Enterprise supports multiple PR templates through a dropdown selector, but this feature requires specific directory structure and may not function correctly in all GHE versions or configurations.
Solution
Implement a combined pull request template that includes conditional sections for both standard code changes and specialized application testing. Structure the template with clear section headers and instructions directing developers to delete the section they're not using. This approach works universally across GitHub Enterprise versions without requiring special directory structures or dropdown functionality. The combined template provides comprehensive checklists for both workflow types while maintaining simplicity—developers simply delete the irrelevant section before submitting their PR.
Implementation
Combined Template Structure:
---
# STANDARD PULL REQUEST
## Description
## Type of Change
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that causes existing functionality to not work as expected)
- [ ] Documentation update
- [ ] Performance improvement
- [ ] Code refactoring
- [ ] Dependency update
## Related Issues
## Changes Made
-
-
## Testing
### Test Configuration
- OS:
- Browser (if applicable):
- Database (if applicable):
### Test Cases
- [ ] Unit tests pass
- [ ] Integration tests pass
- [ ] Manual testing completed
## Checklist
- [ ] Code follows project style guidelines
- [ ] Self-review completed
- [ ] Comments added to complex code sections
- [ ] Documentation updated
- [ ] No new warnings generated
- [ ] Tests added/updated for changes
- [ ] All tests pass locally
---
# APPLICATION TESTING
## Application Details
**Application ID:**
**Application Name:**
**Target Environment:**
**Version:**
## Description
## Type of Change
- [ ] New page/region
- [ ] Page item modification
- [ ] Dynamic action change
- [ ] SQL/PLSQL code update
- [ ] Database object change (table, view, package, procedure)
- [ ] Authorization/authentication modification
- [ ] Bug fix
- [ ] Performance optimization
- [ ] Report/Grid modification
- [ ] Process/Computation change
- [ ] Validation/Error handling
## Pages Modified
| Page ID | Page Name | Changes |
|---------|-----------|---------|
| | | |
## Database Objects Modified
- [ ] Tables
- [ ] Views
- [ ] Packages
- [ ] Procedures/Functions
- [ ] Triggers
- [ ] Sequences
### DDL/DML Scripts Required
- [ ] Yes (include migration script path/details)
- [ ] No
**Script Location:**
## Security Impact
- [ ] Authorization scheme changes
- [ ] Session state modifications
- [ ] Authentication changes
- [ ] Page access control
- [ ] Item-level security
## Testing Performed
### Functional Testing
- [ ] All page items render correctly
- [ ] Dynamic actions fire as expected
- [ ] Validations trigger appropriately
- [ ] Processes execute successfully
- [ ] Branches navigate correctly
- [ ] Modal dialogs function properly
- [ ] Interactive Grid/Report operations work
- [ ] File uploads/downloads (if applicable)
### Data Testing
- [ ] Insert operations
- [ ] Update operations
- [ ] Delete operations
- [ ] Query performance acceptable
- [ ] Data validation rules enforced
- [ ] Foreign key constraints respected
### Cross-Browser Testing
- [ ] Chrome
- [ ] Firefox
- [ ] Edge
- [ ] Safari (if applicable)
### Error Handling
- [ ] Invalid input handled gracefully
- [ ] Error messages display correctly
- [ ] Exception handling tested
- [ ] Constraint violations caught
## SQL/PLSQL Code Review
- [ ] Bind variables used (no SQL injection risk)
- [ ] Proper exception handling
- [ ] No hard-coded values
- [ ] Efficient queries (explain plan reviewed if complex)
- [ ] Proper commit/rollback handling
- [ ] No SELECT * usage
## Performance Considerations
## Dependencies
- [ ] New plugins required
- [ ] External JavaScript libraries
- [ ] New database objects
- [ ] Supporting Objects installation needed
- [ ] Configuration changes required
## Rollback Plan
## Deployment Notes
- [ ] Run pre-deployment scripts
- [ ] Import application (replace/upgrade)
- [ ] Run post-deployment scripts
- [ ] Clear cache
- [ ] Test in target environment
## Checklist
- [ ] Code reviewed for security vulnerabilities
- [ ] Session state properly managed
- [ ] No exposed SQL injection vectors
- [ ] Authorization checks on all protected pages
- [ ] Error messages don't expose sensitive information
- [ ] All substitution strings properly escaped
- [ ] Region/item conditions verified
- [ ] Application settings documented
---
## Screenshots
## Additional Notes
## Reviewer Notes
File Location:
.github/
└── pull_request_template.md
Why Combined Template Works:
- No special directory structure required (PULL_REQUEST_TEMPLATE/ folder unnecessary)
- Compatible with all GitHub Enterprise versions
- Works with GitHub Desktop workflow where developers may not have Git CLI installed
- Simple developer experience: open PR, delete irrelevant section, fill out applicable checklist
- Maintains comprehensive coverage for both standard and application-specific validation requirements
Alternative: Multiple Template Directory:
# Directory structure for dropdown selector
.github/
└── PULL_REQUEST_TEMPLATE/
├── standard.md
└── application_testing.md
# This approach shows a template dropdown in the PR creation UI
# However, a single root-level template (pull_request_template.md) takes precedence if both exist
# Dropdown feature requires GHE 2.19+ and may not render correctly in all cases
# If dropdown doesn't appear despite correct configuration, use combined template approach instead
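If the dropdown does not appear, individual templates in the PULL_REQUEST_TEMPLATE/ directory can usually still be opened explicitly with the template query parameter on the compare URL (the hostname, org, repo, and branch names below are placeholders):
# Pre-fill a PR with a specific template via URL query parameters
https://github.example.com/org/repo/compare/main...feature-branch?quick_pull=1&template=application_testing.md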
Troubleshooting Template Selection:
- Verify no pull_request_template.md file exists in .github/ root when using PULL_REQUEST_TEMPLATE/ directory
- Check repository default branch—templates only load from default branch
- Clear browser cache (Ctrl+F5) when testing template changes
- Confirm GitHub Enterprise version supports multiple templates (2.19+)
- Check organization-level .github repository isn't overriding repo-level templates
Password Recovery & Cryptography
Dictionary-Based Password Cracking with John the Ripper
Problem
Security assessments and forensic investigations often require recovering passwords from hash files to access encrypted systems, validate password policies, or understand security postures. Manual password guessing is impractical for strong passwords. Brute force attacks take exponential time as password complexity increases. Without efficient password recovery tools, analysts cannot assess whether passwords meet security requirements or recover access to critical systems during incident response. Password hashes from various systems (Unix shadow files, Windows SAM, application databases) require format-specific cracking approaches.
Solution
Use John the Ripper with dictionary-based attacks leveraging wordlists like rockyou.txt to efficiently test millions of common passwords against hash files. Dictionary attacks prioritize likely passwords based on real-world password usage patterns, dramatically reducing time to crack weak passwords compared to brute force. John the Ripper supports multiple hash formats and can auto-detect hash types. The tool can process single hash files or entire directories of hashes simultaneously. Results are cached, allowing recovery of previously cracked passwords without re-running attacks. This approach balances effectiveness against weak passwords with computational efficiency.
Implementation
Basic Dictionary Attack:
# Single hash file
john --wordlist=/usr/share/wordlists/rockyou.txt hash_file.txt
# Multiple hash files in directory
john --wordlist=/usr/share/wordlists/rockyou.txt /path/to/hash_folder/*
# Specific hash format
john --wordlist=/usr/share/wordlists/rockyou.txt --format=Raw-MD5 hash_file.txt
# Common hash formats:
# Raw-MD5, Raw-SHA1, Raw-SHA256, Raw-SHA512
# NT (Windows NTLM), LM (Windows LAN Manager)
# bcrypt, scrypt, PBKDF2
# descrypt (traditional Unix crypt)
View Cracked Passwords:
# Show all cracked passwords from previous runs
john --show hash_file.txt
# Show with specific format
john --show --format=Raw-MD5 hash_file.txt
# Output format: username:password:uid:gid:gecos:home:shell
Wordlist Locations:
# Kali Linux / Debian-based
/usr/share/wordlists/rockyou.txt
/usr/share/wordlists/rockyou.txt.gz # Compressed version
# Extract compressed wordlist
gunzip /usr/share/wordlists/rockyou.txt.gz
# Other common wordlists
/usr/share/wordlists/metasploit/
/usr/share/seclists/Passwords/
Processing Multiple Files:
# Crack all hash files in directory
john --wordlist=/usr/share/wordlists/rockyou.txt hash1.txt hash2.txt hash3.txt
# Using wildcard
john --wordlist=/usr/share/wordlists/rockyou.txt /evidence/*.hash
# Recursive directory processing
find /path/to/hashes -type f -name "*.txt" -exec john --wordlist=/usr/share/wordlists/rockyou.txt {} \;
Performance Optimization:
# Use multiple cores
john --wordlist=/usr/share/wordlists/rockyou.txt --fork=4 hash_file.txt
# Session management for long-running cracks
john --wordlist=/usr/share/wordlists/rockyou.txt --session=my_session hash_file.txt
# Restore interrupted session
john --restore=my_session
# Show status of running session
john --status=my_session
Hash Format Detection:
# Let John detect hash format automatically
john --wordlist=/usr/share/wordlists/rockyou.txt hash_file.txt
# List supported formats
john --list=formats
# List subformats for specific type
john --list=subformats
# Review details for formats matching a hash family (here, MD5 variants)
john --list=format-details | grep -i md5
Advanced Wordlist Techniques:
# Use rules to generate variations
john --wordlist=/usr/share/wordlists/rockyou.txt --rules hash_file.txt
# Combine multiple wordlists
cat wordlist1.txt wordlist2.txt > combined.txt
john --wordlist=combined.txt hash_file.txt
# Incremental mode (brute force with character sets)
john --incremental hash_file.txt
# Incremental with specific charset
john --incremental=Digits hash_file.txt # Numbers only
john --incremental=Alpha hash_file.txt # Letters only
Output and Reporting:
# Save cracked passwords to file
john --show hash_file.txt > cracked_passwords.txt
# Show only usernames and passwords
john --show --format=Raw-MD5 hash_file.txt | cut -d: -f1,2
# Count cracked vs total (--show appends a summary line, so grep for ':' instead of counting every line)
john --show hash_file.txt | grep -c ':' # Cracked count
grep -c ':' hash_file.txt # Total hash count (assumes username:hash lines)
Common Hash File Formats:
# Unix /etc/shadow format
username:$6$salt$hash:18000:0:99999:7:::
# Simple hash list (one per line)
5f4dcc3b5aa765d61d8327deb882cf99
098f6bcd4621d373cade4e832627b4f6
# Username:hash pairs
admin:5f4dcc3b5aa765d61d8327deb882cf99
user:098f6bcd4621d373cade4e832627b4f6
# NTLM hashes (Windows)
Administrator:500:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::
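When starting from raw /etc/passwd and /etc/shadow files rather than a prepared hash list, the unshadow utility bundled with John merges them into the username:hash layout shown above:
# Combine passwd and shadow into a John-compatible file (needs read access to both)
unshadow /etc/passwd /etc/shadow > unshadowed.txt
john --wordlist=/usr/share/wordlists/rockyou.txt unshadowed.txt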
Best Practices:
- Always specify hash format when known to avoid misdetection and improve performance
- Use session names for long-running cracks to enable pause/resume
- Check the john.pot file (~/.john/john.pot) for previously cracked passwords before re-running (see the sketch after this list)
- Combine dictionary attacks with rules for better coverage of password variations
- Monitor system resources—large wordlists consume significant RAM and CPU
- Keep wordlists updated with recently leaked password databases for improved success rates
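As referenced in the john.pot point above, the pot file is plain text, so earlier results can be inspected directly; a small sketch, assuming the default per-user pot file location:
# Inspect previously cracked entries directly (lines are hash:password)
cat "$HOME/.john/john.pot"
# Re-display cracked results for a hash file against an explicit pot file
john --show --pot="$HOME/.john/john.pot" hash_file.txt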
Query Languages & Data Analysis
Network Flow Analysis with Kusto Query Language (KQL)
Problem
Network flow logs contain vast amounts of connection data including source/destination IPs, ports, byte counts, and packet information. Security analysts need to identify top talkers, suspicious connections, unusual traffic patterns, and encoded data hidden in flow metadata. Manual analysis of thousands or millions of flow records is impractical. SQL-based tools lack the aggregation and time-series analysis capabilities needed for effective network forensics. Traditional packet analysis tools focus on individual packets rather than connection-level patterns. Without proper query capabilities, critical security insights remain buried in raw flow log data.
Solution
Use Kusto Query Language (KQL) to analyze network flow logs with powerful aggregation, filtering, and pattern matching capabilities. KQL excels at summarizing large datasets, identifying top values, and extracting encoded data from text fields. Import flow logs into Azure Data Explorer or use local Kusto tools for analysis. Write queries that aggregate traffic by source IP to find top talkers, filter by destination port to identify protocol usage, and decode Base64-encoded packet data to reveal hidden messages. KQL's pipe-based syntax enables building complex analysis workflows by chaining simple operations. Results can be visualized directly or exported for further analysis.
Implementation
Setup and Data Import:
// Azure Data Explorer (online)
// 1. Navigate to https://dataexplorer.azure.com/
// 2. Create free cluster or use existing
// 3. Create database
// 4. Import data: Home → Ingest data
// 5. Select file format (CSV, JSON, etc.)
// Kusto.Explorer (desktop tool)
// 1. Download from https://aka.ms/ke
// 2. Connect to cluster
// 3. Right-click database → Import data
// 4. Select source file
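For a quick local test without the ingestion wizard, a table can also be created and sample rows loaded with Kusto management commands. This sketch covers only a subset of the columns used in the queries below, and the schema is an assumption about the flow log layout:
// Create a minimal table with a subset of the flow log columns
.create table NetworkFlowTable (FlowStartTime: datetime, SrcPublicIps: string, DestIp: string, DestPort: int, BytesSrcToDest: long, BytesDestToSrc: long, packet_data: string)
// Ingest two sample rows inline (values are CSV in schema order)
.ingest inline into table NetworkFlowTable <|
2025-01-05T10:00:00Z,203.0.113.7,10.0.0.5,443,1500000,20000,
2025-01-05T10:05:00Z,198.51.100.9,10.0.0.8,80,2048,512,SGVsbG8=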
Query Top Talker by Source IP:
// Find source IP with highest traffic volume
NetworkFlowTable
| summarize TotalBytes = sum(BytesSrcToDest + BytesDestToSrc) by SrcPublicIps
| top 1 by TotalBytes desc
| project SrcPublicIps
// Alternative: Count flows instead of bytes
NetworkFlowTable
| summarize FlowCount = count() by SrcPublicIps
| top 1 by FlowCount desc
| project SrcPublicIps
// Top 10 talkers with traffic statistics
NetworkFlowTable
| summarize
TotalBytes = sum(BytesSrcToDest + BytesDestToSrc),
FlowCount = count(),
UniqueDestinations = dcount(DestIp)
by SrcPublicIps
| top 10 by TotalBytes desc
Identify IPs Using Specific Ports:
// Find all source IPs using ports 80 or 443
NetworkFlowTable
| where DestPort in (80, 443)
| distinct SrcPublicIps
// Find IP using both ports
NetworkFlowTable
| where DestPort in (80, 443)
| summarize Ports = make_set(DestPort) by SrcPublicIps
| where array_length(Ports) == 2
| project SrcPublicIps
// Traffic breakdown by port
NetworkFlowTable
| where DestPort in (80, 443)
| summarize FlowCount = count(), TotalBytes = sum(BytesSrcToDest) by SrcPublicIps, DestPort
| order by TotalBytes desc
Decode Base64 Packet Data:
// Decode packet_data field for specific source IP
NetworkFlowTable
| where SrcPublicIps == "top_talker_ip_from_query"
| where isnotempty(packet_data)
| project DecodedData = base64_decode_tostring(packet_data)
| distinct DecodedData
// Search for encoded data containing specific pattern
NetworkFlowTable
| where isnotempty(packet_data)
| extend DecodedData = base64_decode_tostring(packet_data)
| where DecodedData contains "flag" or DecodedData contains "password"
| project SrcPublicIps, DestIp, DecodedData
// Decode and analyze packet sizes
NetworkFlowTable
| where SrcPublicIps == "suspicious_ip"
| where isnotempty(packet_data)
| extend DecodedData = base64_decode_tostring(packet_data)
| project SrcPublicIps, PacketSize = strlen(DecodedData), DecodedData
Advanced Analysis Patterns:
// Identify unusual port usage
NetworkFlowTable
| summarize ConnectionCount = count() by DestPort
| where ConnectionCount < 10 // Rarely used ports
| order by ConnectionCount desc
// Find connections to specific subnets
NetworkFlowTable
| where DestIp startswith "10.0." or DestIp startswith "192.168."
| summarize InternalConnections = count() by SrcPublicIps
| top 20 by InternalConnections desc
// Time-based analysis
NetworkFlowTable
| where FlowStartTime between (datetime(2025-01-01) .. datetime(2025-01-31))
| summarize HourlyFlows = count() by bin(FlowStartTime, 1h)
| render timechart
// Protocol distribution
NetworkFlowTable
| summarize FlowCount = count() by L4Protocol
| extend Percentage = FlowCount * 100.0 / toscalar(NetworkFlowTable | count)
| order by FlowCount desc
Data Filtering and Aggregation:
// Combine multiple filters
NetworkFlowTable
| where BytesSrcToDest > 1000000 // Large uploads
| where DestPort == 443 // HTTPS
| where FlowDirection == "O" // Outbound
| summarize
TotalUploaded = sum(BytesSrcToDest),
FlowCount = count()
by SrcPublicIps
| order by TotalUploaded desc
// Group by multiple dimensions
NetworkFlowTable
| summarize
TotalBytes = sum(BytesSrcToDest + BytesDestToSrc),
AvgPackets = avg(PacketsSrcToDest + PacketsDestToSrc)
by SrcPublicIps, DestRegion, L4Protocol
| where TotalBytes > 10000000
Common KQL Operators:
// Filtering
| where DestPort == 80
| where SrcPublicIps in ("1.2.3.4", "5.6.7.8")
| where BytesSrcToDest > 1000
| where FlowStartTime > ago(7d)
// Aggregation
| summarize count() by SrcPublicIps
| summarize sum(TotalBytes), avg(PacketCount) by Protocol
| summarize make_list(DestIp) by SrcPublicIps
// Projection
| project SrcPublicIps, DestIp, TotalBytes
| project-away InternalField1, InternalField2
// Sorting
| sort by TotalBytes desc
| top 10 by FlowCount
| order by SrcPublicIps asc
// Distinct values
| distinct SrcPublicIps
| distinct DestIp, DestPort
// String operations
| where SrcPublicIps contains "192.168"
| where DestIp startswith "10."
| where packet_data matches regex "[A-Za-z0-9+/=]+"
// Type conversions
| extend BytesMB = BytesSrcToDest / 1048576.0
| extend FlowDuration = FlowEndTime - FlowStartTime
| extend DecodedPacket = base64_decode_tostring(packet_data)
Exporting Results:
// Export to CSV
NetworkFlowTable
| where DestPort == 443
| summarize TotalBytes = sum(BytesSrcToDest) by SrcPublicIps
| order by TotalBytes desc
// Click "Export to CSV" in Azure Data Explorer UI
// Save results to variable for reuse
let TopTalkers = NetworkFlowTable
| summarize TotalBytes = sum(BytesSrcToDest + BytesDestToSrc) by SrcPublicIps
| top 100 by TotalBytes desc;
TopTalkers
| join kind=inner (
NetworkFlowTable
| where DestPort in (80, 443)
) on SrcPublicIps
| summarize WebTraffic = sum(BytesSrcToDest) by SrcPublicIps
Best Practices:
- Always filter data early in the query pipeline to reduce processing overhead
- Use specific time ranges when working with large datasets to improve performance
- Test queries on small data samples before running on full datasets
- Use let statements to define reusable query components and intermediate results
- Leverage built-in functions (make_set, make_list, dcount) for efficient aggregation
- Comment complex queries to document logic and assumptions
- Export results to CSV for further analysis in spreadsheet tools or Python
Common Use Cases:
// Data exfiltration detection
NetworkFlowTable
| where BytesSrcToDest > 10000000 // Large outbound transfers
| where DestPort not in (80, 443) // Non-standard ports
| summarize TotalExfiltrated = sum(BytesSrcToDest) by SrcPublicIps, DestIp, DestPort
| order by TotalExfiltrated desc
// Beacon detection (regular interval connections)
NetworkFlowTable
| where SrcPublicIps == "suspicious_ip"
| summarize FlowTimes = make_list(FlowStartTime) by DestIp
| extend SortedTimes = array_sort_asc(FlowTimes)
// Analyze gaps between consecutive timestamps for regularity (see the interval sketch below)
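Interval sketch for the beacon query above: sort flows per destination, compute the gap between consecutive flows with prev(), and flag destinations whose gaps are nearly constant (the thresholds are arbitrary illustrations):
NetworkFlowTable
| where SrcPublicIps == "suspicious_ip"
| sort by DestIp asc, FlowStartTime asc
| extend PrevDest = prev(DestIp, 1), PrevTime = prev(FlowStartTime, 1)
| where PrevDest == DestIp
| extend IntervalSeconds = datetime_diff('second', FlowStartTime, PrevTime)
| summarize AvgInterval = avg(IntervalSeconds), IntervalStdev = stdev(IntervalSeconds), SampleCount = count() by DestIp
| where SampleCount > 10 and IntervalStdev < 5 // Near-constant gaps suggest beaconing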
// Port scanning identification
NetworkFlowTable
| summarize UniqueDestPorts = dcount(DestPort) by SrcPublicIps
| where UniqueDestPorts > 100 // Contacted many different ports
| order by UniqueDestPorts desc