Web Application Development
Date Range Validation in Web Applications
Problem
Applications requiring date range inputs need validation to ensure dates are valid, properly formatted, and logically consistent. Users may enter invalid dates, dates in the wrong format, or dates that create logical conflicts (such as an end date before a start date). Without validation, invalid data enters the database, causing reporting errors and operational problems.
Solution
Implement both client-side and server-side validation. Client-side validation provides immediate feedback using dynamic actions that check date format and logical relationships as users type. Server-side validation using PL/SQL expressions ensures data integrity even if client-side checks are bypassed. Use APEX's built-in date picker components with format masks, minimum/maximum date constraints, and validation error messages that clearly explain requirements.
Implementation
Server-Side Validation:
-- PL/SQL Expression Validation
:P_END_DATE >= :P_START_DATE
Client-Side Constraint: set the end date picker's Minimum Date to the start date item (for example &P_START_DATE.) so the picker prevents out-of-range selections; optionally add a Change dynamic action on the start date with a Set Value action to keep the end date in sync.
Admin Override for Restricted Operations
Problem
Systems typically restrict certain operations when capacity is reached or deadlines pass. However, administrators occasionally need to perform these restricted operations for legitimate exceptions—late entries, special accommodations, or administrative corrections. Without override capability, administrators must manually manipulate database records, creating security risks and data integrity issues.
Solution
Implement role-based authorization that allows administrators to bypass operational restrictions while maintaining audit trails. Create conditional logic that checks both operation status and user role before blocking actions. Use APEX authorization schemes to show admin-only buttons or form fields. Add database triggers or application logic to log all override actions with administrator ID and timestamp for accountability.
Implementation
Authorization Scheme:
-- Create authorization scheme checking user role
SELECT COUNT(*)
FROM user_roles
WHERE username = :APP_USER
AND role_name = 'ADMIN'
Conditional Operation Logic:
-- Allow operation if status open OR user is admin
IF (record_status = 'OPEN' AND current_count < max_capacity)
OR is_admin(:APP_USER) THEN
-- Process operation
INSERT INTO operations...
-- Log override if admin bypassed restrictions
IF record_status = 'CLOSED' THEN
INSERT INTO admin_override_log...
END IF;
END IF;
UI Element Authorization: Apply authorization scheme to admin-only buttons and form regions using Server-side Condition type "Authorization Scheme".
Duplicate Entry Prevention
Problem
Without duplicate detection, users can submit the same record multiple times, either accidentally through multiple form submissions or intentionally. This creates inaccurate counts, wastes limited capacity, complicates tracking, and generates data inconsistencies in reporting. Manual cleanup of duplicate entries is time-consuming and error-prone.
Solution
Implement database-level unique constraints combined with application-level validation checks. Create a composite unique constraint on the table using relevant ID columns. Add pre-insert validation in APEX that queries existing records before allowing submission. Provide clear error messaging when duplicates are detected. Use database triggers as an additional enforcement layer to prevent duplicates even if application logic is bypassed.
Implementation
Database Unique Constraint:
ALTER TABLE user_operations
ADD CONSTRAINT uk_user_operation
UNIQUE (user_id, operation_id);
APEX Validation:
-- PL/SQL Function Body returning Boolean
DECLARE
v_count NUMBER;
BEGIN
SELECT COUNT(*)
INTO v_count
FROM user_operations
WHERE user_id = :P_USER_ID
AND operation_id = :P_OPERATION_ID;
RETURN v_count = 0;
END;
Error Message: "This record already exists in the system."
Status Tracking with Checkbox Column
Problem
Administrators need a simple method to record status changes for tracked records. Manual status tracking through separate forms or spreadsheets is time-consuming and creates data synchronization issues. Without integrated tracking, generating status reports requires manual data reconciliation, and historical records may be incomplete or inaccurate.
Solution
Add a status flag column (stored as 'Y'/'N') to the tracking table. Implement an interactive grid or report in APEX that displays all records with an editable checkbox column for status. Use APEX's built-in row-level DML operations to update status directly from the grid. Restrict checkbox editing to administrators using authorization schemes. Include timestamp and user tracking columns to maintain an audit trail of status modifications.
Implementation
Add Status Column:
ALTER TABLE operations
ADD (
completed VARCHAR2(1) DEFAULT 'N',
completed_by VARCHAR2(100),
completed_date TIMESTAMP
);
-- Add check constraint
ALTER TABLE operations
ADD CONSTRAINT chk_completed CHECK (completed IN ('Y', 'N'));
Interactive Grid Configuration:
- Source: SELECT user_name, operation_name, completed FROM operations (include the table's primary key column so the grid can issue DML)
- Completed column: Display as Checkbox, editable
- Enable Save button for DML operations
- Authorization: Admin role only
Audit Trail Trigger:
CREATE OR REPLACE TRIGGER trg_status_audit
BEFORE UPDATE OF completed ON operations
FOR EACH ROW
BEGIN
IF :NEW.completed = 'Y' AND :OLD.completed = 'N' THEN
:NEW.completed_by := V('APP_USER');
:NEW.completed_date := SYSTIMESTAMP;
END IF;
END;
Modal Dialog Implementation with Date Validation
Problem
Data entry workflows often require multiple input fields that clutter the main page interface. Presenting all creation fields inline reduces usability and makes the interface overwhelming. Additionally, users need immediate feedback when entering date ranges to prevent invalid data submission. Without modal dialogs, form validation errors require full page refreshes, disrupting the user workflow.
Solution
Implement APEX modal dialog pages for data entry that open in an overlay window without navigating away from the main page. Configure the modal to include date pickers with built-in validation. Use dynamic actions to enforce date logic client-side while the modal is open. Handle modal closure events to refresh the parent page's data list. Use APEX's dialog closed dynamic action to detect successful submissions and update the underlying data without full page reload.
Implementation
Create Modal Dialog Page:
- Page Mode: Modal Dialog
- Add items: P_RECORD_NAME, P_START_DATE, P_END_DATE
- Add Process: Insert record
- Add Branch: Close dialog on success
Open Modal Button (Parent Page):
-- Button action: Redirect to Page
-- Target: Page in this application
-- Page: [modal page number]
-- Clear Cache: [modal page number]
Dynamic Action on Start Date Change:
-- Declarative option: on P_END_DATE (Date Picker), set the Minimum Date attribute to &P_START_DATE.
-- so the picker will not offer dates earlier than the selected start date.
-- Optional dynamic action to keep the values in sync:
-- Event: Change
-- Selection Type: Item
-- Item: P_START_DATE
-- Action: Set Value
-- Set Type: JavaScript Expression
-- JavaScript: $v('P_START_DATE')
-- Affected Element: P_END_DATE
-- (resets the end date to the new start date whenever the start date changes)
Dialog Closed Event (Parent Page):
-- Event: Dialog Closed
-- Action: Refresh region containing data list
Dynamic Actions to Prevent End Date Before Start Date
Problem
Date range inputs require real-time validation to prevent logical errors. Users may select an end date that precedes the start date either through manual entry or date picker selection. Without immediate client-side feedback, users only discover the error after form submission, forcing them to correct and resubmit. This degrades user experience and increases support requests. Additionally, client-side validation alone is insufficient—malicious users can bypass browser controls.
Solution
Implement layered validation using APEX dynamic actions for immediate client-side feedback combined with server-side PL/SQL validation for security. Constrain the end date picker's minimum allowable value to the start date item (via the picker's Minimum Date attribute), and add a Change event dynamic action on the end date item that displays an inline error message when an invalid date is selected. Complement these with page-level PL/SQL validation that executes during form submission, preventing invalid data from reaching the database regardless of client-side manipulation.
Implementation
Dynamic Action 1: Constrain End Date Minimum
Simplest approach: set the P_END_DATE Date Picker's Minimum Date attribute to &P_START_DATE. so the constraint tracks the start date item.
Alternative Change dynamic action on P_START_DATE:
Event: Change [P_START_DATE]
True Action: Set Value
Set Type: JavaScript Expression
JavaScript: $v('P_START_DATE')
Affected Element: P_END_DATE
(resets the end date to the new start date whenever the start date changes)
Dynamic Action 2: Validate End Date
Event: Change [P_END_DATE]
Client-Side Condition: JavaScript expression
JavaScript: new Date($v('P_END_DATE')) < new Date($v('P_START_DATE'))
(assumes an ISO-style format mask such as YYYY-MM-DD so new Date() parses the values reliably)
True Action: Execute JavaScript Code
apex.message.showErrors([{type: 'error', location: 'inline', pageItem: 'P_END_DATE', message: 'End date cannot be before start date', unsafe: false}]);
False Action: Execute JavaScript Code
apex.message.clearErrors();
Server-Side Validation (Required):
-- Validation Type: PL/SQL Expression
-- Expression: :P_END_DATE >= :P_START_DATE
-- Error Message: "End date must be on or after start date"
-- Validation Location: On Submit - After Computations and Validations
Server-Side Validation for Date Logic
Problem
Client-side validation can be bypassed by disabling JavaScript, manipulating browser developer tools, or submitting data through API calls. Relying solely on client-side validation creates security vulnerabilities where invalid or malicious data reaches the database. Date logic violations—such as end dates before start dates, past dates for future events, or dates outside acceptable ranges—must be caught server-side to maintain data integrity regardless of how the data is submitted.
Solution
Implement comprehensive server-side validation using PL/SQL validation rules in APEX. Create multiple validation points that check date ranges, date ordering, and business logic constraints. Use PL/SQL expressions that return boolean values or validation functions that raise exceptions with specific error messages. Position validations to execute after computations but before DML operations. Ensure validation error messages clearly indicate the problem and required correction. Log validation failures for security monitoring and audit purposes.
Implementation
Basic Date Range Validation:
-- Validation Type: PL/SQL Expression
-- PL/SQL Expression:
:P_END_DATE >= :P_START_DATE
-- Error Message: "End date must be on or after start date"
-- Validation Execution Scope: On Submit - After Computations and Validations
Future Date Validation:
-- Ensure start date is not in the past
:P_START_DATE >= TRUNC(SYSDATE)
-- Error Message: "Start date cannot be in the past"
Complex Validation Function:
-- PL/SQL Function Body returning Boolean
-- Session state items are strings; convert explicitly (assumes a YYYY-MM-DD format mask, adjust to match your items)
DECLARE
v_start DATE := TO_DATE(:P_START_DATE, 'YYYY-MM-DD');
v_end DATE := TO_DATE(:P_END_DATE, 'YYYY-MM-DD');
v_valid BOOLEAN := TRUE;
BEGIN
-- Check date order
IF v_end < v_start THEN
v_valid := FALSE;
END IF;
-- Check minimum duration (at least 1 day)
IF (v_end - v_start) < 1 THEN
v_valid := FALSE;
END IF;
-- Check maximum advance scheduling (not more than 1 year out)
IF v_start > ADD_MONTHS(SYSDATE, 12) THEN
v_valid := FALSE;
END IF;
RETURN v_valid;
END;
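For unit-testing the same business rules outside the database, a minimal Python sketch can mirror them; this is a hypothetical helper, not part of APEX, and it assumes ISO-format date strings:
from datetime import date, timedelta

def validate_date_range(start_str, end_str, today=None):
    """Mirror the PL/SQL rules: end on/after start, at least one day long,
    start no more than roughly one year in the future."""
    today = today or date.today()
    start = date.fromisoformat(start_str)
    end = date.fromisoformat(end_str)
    if end < start:
        return False
    if (end - start).days < 1:
        return False
    # Approximate ADD_MONTHS(SYSDATE, 12) with 365 days for this sketch
    if start > today + timedelta(days=365):
        return False
    return True

# Example: a one-day range starting today passes all three rules
start = date.today()
print(validate_date_range(start.isoformat(), (start + timedelta(days=1)).isoformat()))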
Validation with Audit Logging:
CREATE OR REPLACE PROCEDURE log_validation_failure(
p_user VARCHAR2,
p_error VARCHAR2
) AS
PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
INSERT INTO validation_audit_log
VALUES (SYSDATE, p_user, p_error);
COMMIT;
END;
Lookup Table Pattern Instead of Hardcoded CASE Statements
Problem
Hardcoding mappings in application logic using CASE statements creates maintenance overhead and requires code deployments for simple data changes. When new values are added or mapping conventions change, developers must modify PL/SQL code, test the changes, and deploy updates to production. This tight coupling between data and code slows operational agility, increases deployment risk, and prevents administrators from managing mappings independently through the application interface.
Solution
Create a dedicated lookup table that stores mappings as data rather than code. Replace CASE statement logic with simple SELECT queries against this table. Build an administrative interface in APEX that allows authorized users to add, modify, or remove mappings without code changes. Implement proper constraints and validation to ensure data integrity. This data-driven approach separates configuration from logic, enables self-service administration, and eliminates code deployments for routine mapping updates.
Implementation
Create Lookup Table:
CREATE TABLE value_mappings (
source_value VARCHAR2(100) PRIMARY KEY,
mapped_value VARCHAR2(100) NOT NULL,
active VARCHAR2(1) DEFAULT 'Y' CHECK (active IN ('Y','N')),
created_date TIMESTAMP DEFAULT SYSTIMESTAMP,
created_by VARCHAR2(100),
modified_date TIMESTAMP,
modified_by VARCHAR2(100)
);
Replace CASE Statement Logic:
-- Old approach with CASE
v_result := CASE :P_INPUT_VALUE
WHEN 'Value A' THEN 'RESULT_A'
WHEN 'Value B' THEN 'RESULT_B'
ELSE NULL
END;
-- New approach with lookup table
BEGIN
SELECT mapped_value INTO v_result
FROM value_mappings
WHERE source_value = :P_INPUT_VALUE
AND active = 'Y';
EXCEPTION
WHEN NO_DATA_FOUND THEN
RAISE_APPLICATION_ERROR(-20001, 'No mapping found for input value');
END;
Admin Maintenance Page:
- Create Interactive Grid on value_mappings table
- Enable inline editing for mapped_value column
- Add authorization scheme: Admin role only
- Include audit columns: created_by, created_date, modified_by, modified_date
Audit Trigger:
CREATE OR REPLACE TRIGGER trg_mapping_audit
BEFORE INSERT OR UPDATE ON value_mappings
FOR EACH ROW
BEGIN
IF INSERTING THEN
:NEW.created_by := V('APP_USER');
:NEW.created_date := SYSTIMESTAMP;
ELSIF UPDATING THEN
:NEW.modified_by := V('APP_USER');
:NEW.modified_date := SYSTIMESTAMP;
END IF;
END;
Security & Authentication
Oracle Certificate Import for OIDC Authentication
Problem
Oracle APEX applications using OpenID Connect (OIDC) for authentication must validate SSL/TLS certificates when making HTTPS requests to identity providers. Without proper certificate chain validation, APEX cannot establish secure connections to authentication endpoints. The UTL_HTTP package used by APEX for OIDC communication requires trusted certificates in an Oracle Wallet. Missing or incorrectly configured certificates cause authentication failures with "ORA-29024: Certificate validation failure" or similar errors, blocking all user logins.
Solution
Import the complete certificate chain for the OIDC provider into an Oracle Wallet and configure APEX to use it for HTTPS connections. Download the root and intermediate certificates from the identity provider's server. Use Oracle Wallet Manager or orapki command-line utility to create a wallet and import certificates. Configure sqlnet.ora to point to the wallet location and enable SSL_SERVER_DN_MATCH for hostname verification. Update APEX instance settings to use the wallet for outbound HTTPS connections.
Implementation
Download Certificate Chain:
-- Using OpenSSL to get certificate chain
openssl s_client -connect auth.provider.com:443 -showcerts
-- Export each certificate from the output to separate .pem files:
-- root_ca.pem, intermediate_ca.pem
Create Oracle Wallet:
-- Create wallet directory
mkdir -p /path/to/wallet
-- Create wallet (auto_login for automatic opening)
orapki wallet create -wallet /path/to/wallet -auto_login -pwd WalletPassword123
-- Add certificates to wallet
orapki wallet add -wallet /path/to/wallet -trusted_cert -cert root_ca.pem -pwd WalletPassword123
orapki wallet add -wallet /path/to/wallet -trusted_cert -cert intermediate_ca.pem -pwd WalletPassword123
-- Verify certificates in wallet
orapki wallet display -wallet /path/to/wallet
Configure sqlnet.ora:
WALLET_LOCATION =
(SOURCE =
(METHOD = FILE)
(METHOD_DATA =
(DIRECTORY = /path/to/wallet)
)
)
SSL_CLIENT_AUTHENTICATION = FALSE
SSL_SERVER_DN_MATCH = TRUE
Configure APEX for Wallet:
-- Set wallet path in APEX instance
BEGIN
APEX_INSTANCE_ADMIN.SET_PARAMETER(
p_parameter => 'WALLET_PATH',
p_value => 'file:/path/to/wallet'
);
COMMIT;
END;
-- Test HTTPS connection through the wallet
-- (MAKE_REST_REQUEST is a function returning CLOB, so capture its return value)
DECLARE
l_response CLOB;
BEGIN
l_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
p_url => 'https://auth.provider.com/.well-known/openid-configuration',
p_http_method => 'GET'
);
DBMS_OUTPUT.PUT_LINE(SUBSTR(l_response, 1, 200));
END;
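A quick sanity check from the application server can also confirm the discovery endpoint and certificate chain are reachable and generally trusted. This minimal Python sketch uses only the standard library and validates against the operating system trust store (not the Oracle Wallet), with the same placeholder provider URL as above:
import json
import urllib.request

# Placeholder provider URL from the steps above
url = 'https://auth.provider.com/.well-known/openid-configuration'
try:
    # urllib validates the server certificate against the OS trust store by default
    with urllib.request.urlopen(url, timeout=10) as resp:
        config = json.load(resp)
    print('Issuer:', config.get('issuer'))
    print('Token endpoint:', config.get('token_endpoint'))
except Exception as exc:
    # A certificate error here usually indicates an incomplete or untrusted chain
    print('TLS/HTTP check failed:', exc)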
Trust Store Configuration (sqlnet.ora) for Certificate Validation
Problem
Oracle database connections using SSL/TLS require proper trust store configuration to validate server certificates. When sqlnet.ora is misconfigured or missing trust store parameters, database clients cannot verify certificate chains, leading to either connection failures or insecure connections that skip validation. Applications relying on UTL_HTTP, UTL_TCP, or database links for external HTTPS calls fail with certificate validation errors. Without proper trust store configuration, organizations cannot enforce certificate-based security policies or prevent man-in-the-middle attacks.
Solution
Configure sqlnet.ora with proper WALLET_LOCATION, SSL_CLIENT_AUTHENTICATION, and SSL_SERVER_DN_MATCH parameters. Define the trust store directory containing Oracle Wallet files with trusted root and intermediate certificates. Enable server certificate validation and hostname verification to ensure connections only succeed when certificates are valid and match the target server. Set appropriate cipher suites and TLS versions to meet security requirements. Test configuration with both database connections and UTL_HTTP calls to verify proper certificate validation.
Implementation
Basic sqlnet.ora Configuration:
# Wallet location for trusted certificates
WALLET_LOCATION =
(SOURCE =
(METHOD = FILE)
(METHOD_DATA =
(DIRECTORY = /u01/app/oracle/admin/wallet)
)
)
# Client authentication not required for trust store
SSL_CLIENT_AUTHENTICATION = FALSE
# Enable hostname verification (recommended)
SSL_SERVER_DN_MATCH = TRUE
# Specify TLS version (1.2 minimum recommended)
SSL_VERSION = 1.2
# Cipher suites (optional - restricts to secure ciphers)
SSL_CIPHER_SUITES =
(TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256)
Test Certificate Validation:
-- Test UTL_HTTP with HTTPS endpoint
DECLARE
l_response UTL_HTTP.RESP;
l_text VARCHAR2(32767);
BEGIN
UTL_HTTP.SET_WALLET('file:/u01/app/oracle/admin/wallet');
l_response := UTL_HTTP.BEGIN_REQUEST(
url => 'https://www.example.com',
method => 'GET'
);
UTL_HTTP.READ_TEXT(l_response, l_text);
UTL_HTTP.END_RESPONSE(l_response);
DBMS_OUTPUT.PUT_LINE('Success: ' || SUBSTR(l_text, 1, 100));
EXCEPTION
WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('Error: ' || SQLERRM);
UTL_HTTP.END_RESPONSE(l_response);
END;
Verify Wallet Contents:
-- Display trusted certificates in wallet
orapki wallet display -wallet /u01/app/oracle/admin/wallet
-- Expected output shows:
-- Trusted Certificates:
-- Subject: CN=DigiCert Global Root CA
-- Subject: CN=Let's Encrypt Authority X3
Common Configuration Issues:
- Incorrect DIRECTORY path in WALLET_LOCATION
- Missing cwallet.sso file in wallet directory
- SSL_SERVER_DN_MATCH=TRUE with certificate CN mismatch
- Expired or incomplete certificate chain in wallet
- File permissions preventing Oracle process from reading wallet
Troubleshooting and Hardening Settings:
# Enable detailed SSL debugging (remove in production)
TRACE_LEVEL_CLIENT = 16
TRACE_DIRECTORY_CLIENT = /u01/app/oracle/admin/trace
# Network encryption (optional but recommended)
SQLNET.ENCRYPTION_CLIENT = REQUIRED
SQLNET.CRYPTO_CHECKSUM_CLIENT = REQUIRED
Custom Authentication Schemes in APEX
Problem
Default APEX authentication schemes (Application Express Accounts, Database Accounts) don't integrate with enterprise identity management systems or support custom authentication workflows. Organizations using LDAP, Active Directory, SAML, or custom authentication APIs cannot authenticate users without building custom integration logic. Without proper authentication scheme implementation, applications either operate with insecure workarounds or require manual user account management that doesn't sync with enterprise identity sources. Session management, user attribute propagation, and single sign-on capabilities remain unavailable.
Solution
Create custom authentication schemes in APEX that integrate with external identity providers through PL/SQL authentication functions. Implement the authentication function interface that APEX expects, including credential validation, session establishment, and logout handling. Use UTL_HTTP or web service calls to validate credentials against external systems. Store minimal session data in APEX session state while retrieving user attributes from the identity provider. Configure post-authentication procedures to populate application-specific user context. Implement proper error handling for authentication failures and timeout scenarios.
Implementation
Create Authentication Function:
CREATE OR REPLACE FUNCTION custom_auth_function (
p_username IN VARCHAR2,
p_password IN VARCHAR2
) RETURN BOOLEAN
IS
l_valid BOOLEAN := FALSE;
l_response CLOB;
BEGIN
-- Call external authentication API
l_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
p_url => 'https://auth.company.com/validate',
p_http_method => 'POST',
p_body => JSON_OBJECT(
'username' VALUE p_username,
'password' VALUE p_password
)
);
-- Parse response
IF JSON_VALUE(l_response, '$.authenticated') = 'true' THEN
l_valid := TRUE;
-- Store user attributes in session
APEX_UTIL.SET_SESSION_STATE(
p_name => 'USER_EMAIL',
p_value => JSON_VALUE(l_response, '$.email')
);
APEX_UTIL.SET_SESSION_STATE(
p_name => 'USER_ROLE',
p_value => JSON_VALUE(l_response, '$.role')
);
END IF;
RETURN l_valid;
EXCEPTION
WHEN OTHERS THEN
APEX_DEBUG.ERROR('Authentication error: ' || SQLERRM);
RETURN FALSE;
END custom_auth_function;
Configure Authentication Scheme in APEX:
- Navigate to: Shared Components → Authentication Schemes
- Create → Based on a pre-configured scheme → Custom
- Authentication Function: custom_auth_function
- Session Not Valid: Redirect to login page
- Session Valid: Use current session
Post-Authentication Procedure:
CREATE OR REPLACE PROCEDURE post_auth_procedure
IS
l_user_id NUMBER;
BEGIN
-- Log authentication event
-- (standalone objects cannot use :APP_USER binds, so read session context via V())
INSERT INTO auth_audit_log (
username,
auth_time,
ip_address
) VALUES (
V('APP_USER'),
SYSTIMESTAMP,
OWA_UTIL.GET_CGI_ENV('REMOTE_ADDR')
);
-- Load user preferences
SELECT user_id INTO l_user_id
FROM app_users
WHERE username = V('APP_USER');
APEX_UTIL.SET_SESSION_STATE('USER_ID', l_user_id);
COMMIT;
END post_auth_procedure;
Logout URL Configuration:
-- Logout URL in authentication scheme settings
apex_authentication.logout?p_app_id=&APP_ID.&p_session_id=&SESSION.
-- Custom logout procedure to clear external session
CREATE OR REPLACE PROCEDURE custom_logout
IS
l_response CLOB;
BEGIN
-- Call external logout endpoint (MAKE_REST_REQUEST is a function, so capture its return value)
l_response := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
p_url => 'https://auth.company.com/logout',
p_http_method => 'POST',
p_body => JSON_OBJECT('session_id' VALUE V('APP_SESSION'))
);
-- Clear APEX session
APEX_AUTHENTICATION.LOGOUT(
p_session_id => V('APP_SESSION'),
p_app_id => V('APP_ID')
);
END custom_logout;
Session Verification (sentry function):
CREATE OR REPLACE FUNCTION session_verify
RETURN BOOLEAN
IS
l_valid BOOLEAN := FALSE;
l_last_activity DATE;
BEGIN
-- Check session timeout (30 minutes)
-- (standalone objects cannot use :APP_USER binds, so read session context via V())
SELECT MAX(created_on) INTO l_last_activity
FROM apex_workspace_activity_log
WHERE apex_user = V('APP_USER')
AND apex_session_id = V('APP_SESSION');
-- MAX() returns NULL rather than raising NO_DATA_FOUND when no rows match
IF l_last_activity IS NOT NULL
AND l_last_activity > SYSDATE - (30/1440) THEN
l_valid := TRUE;
END IF;
RETURN l_valid;
END session_verify;
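Before wiring the scheme into APEX, the external validation endpoint can be exercised directly to confirm it returns the JSON fields custom_auth_function expects (authenticated, email, role). A sketch using Python's requests library; the URL and credentials are placeholders:
import requests

# Placeholder endpoint and test credentials matching what custom_auth_function sends
resp = requests.post(
    'https://auth.company.com/validate',
    json={'username': 'test.user', 'password': 'example-password'},
    timeout=10,
)
resp.raise_for_status()
body = resp.json()
# The PL/SQL function reads these keys from the response
print(body.get('authenticated'), body.get('email'), body.get('role'))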
Web Application Security Headers Configuration
Problem
Static HTML websites lack built-in security headers that protect against common web vulnerabilities. Without proper HTTP security headers, sites remain vulnerable to clickjacking attacks, cross-site scripting (XSS), MIME type confusion, and other browser-based exploits. Default web server configurations rarely include comprehensive security headers. Sites without HTTPS enforcement allow man-in-the-middle attacks. Missing Content Security Policy (CSP) enables injection attacks. Absent referrer policies leak sensitive URL information to third parties. Security scanning tools flag missing headers as critical vulnerabilities.
Solution
Configure comprehensive HTTP security headers using Apache .htaccess file to protect static HTML websites. Implement X-Frame-Options to prevent clickjacking, X-Content-Type-Options to block MIME sniffing, X-XSS-Protection for legacy browser XSS filters, Referrer-Policy to control information leakage, Permissions-Policy to disable unnecessary browser features, Content-Security-Policy to restrict resource loading, and Strict-Transport-Security (HSTS) to enforce HTTPS. Force HTTPS redirection for all traffic. These headers provide defense-in-depth against common web attacks without requiring application code changes.
Implementation
Create .htaccess File:
# Security Headers
# HTTP Strict Transport Security (HSTS)
Header always set Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
# Prevent clickjacking
Header always set X-Frame-Options "SAMEORIGIN"
# Prevent MIME type sniffing
Header always set X-Content-Type-Options "nosniff"
# Enable XSS filter
Header always set X-XSS-Protection "1; mode=block"
# Control referrer information
Header always set Referrer-Policy "strict-origin-when-cross-origin"
# Disable unnecessary features
Header always set Permissions-Policy "geolocation=(), microphone=(), camera=()"
# Content Security Policy
Header always set Content-Security-Policy "default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data:; font-src 'self'; connect-src 'self'; frame-ancestors 'self';"
# Force HTTPS
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
# DirectoryIndex
DirectoryIndex index.html index.htm
Security Header Explanations:
- Strict-Transport-Security (HSTS): Forces browsers to only connect via HTTPS for specified duration (1 year). Prevents downgrade attacks and insecure connections.
- X-Frame-Options: Prevents site from being embedded in iframes on other domains. Blocks clickjacking attacks where attackers overlay transparent frames to capture clicks.
- X-Content-Type-Options: Prevents browsers from MIME-sniffing responses away from declared content-type. Stops malicious content execution through type confusion.
- X-XSS-Protection: Enables browser's built-in XSS filter (legacy browsers). Detects and blocks reflected cross-site scripting attempts.
- Referrer-Policy: Controls what referrer information is sent with requests. Prevents leaking sensitive URL parameters to external sites.
- Permissions-Policy: Disables browser features not needed by the site (geolocation, camera, microphone). Reduces attack surface.
- Content-Security-Policy: Restricts sources from which content can be loaded. Prevents injection attacks by controlling script, style, and resource origins.
Verification Steps:
# 1. Upload .htaccess to web server root directory (same location as index.html)
# 2. Test HTTPS enforcement
# Visit: http://yourdomain.com
# Should automatically redirect to: https://yourdomain.com
# 3. Verify headers using curl
curl -I https://yourdomain.com
# Expected output includes:
# strict-transport-security: max-age=31536000; includeSubDomains; preload
# x-frame-options: SAMEORIGIN
# x-content-type-options: nosniff
# x-xss-protection: 1; mode=block
# referrer-policy: strict-origin-when-cross-origin
# permissions-policy: geolocation=(), microphone=(), camera=()
# content-security-policy: default-src 'self'; script-src 'self'; ...
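The same header check can be scripted for several sites or run from CI. A minimal Python sketch using the requests library; the domain is a placeholder:
import requests

EXPECTED_HEADERS = [
    'strict-transport-security',
    'x-frame-options',
    'x-content-type-options',
    'referrer-policy',
    'permissions-policy',
    'content-security-policy',
]

def check_security_headers(url):
    """Report which of the expected security headers the site returns."""
    response = requests.get(url, timeout=10)
    present = {name.lower() for name in response.headers}
    for header in EXPECTED_HEADERS:
        status = 'OK     ' if header in present else 'MISSING'
        print(f'{status} {header}')

check_security_headers('https://yourdomain.com')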
Online Security Scanning:
# Run SSL/TLS configuration test
https://www.ssllabs.com/ssltest/
# Target Grade: A or A+
# Run HTTP security headers test
https://securityheaders.com
# Target Grade: A or A+
# Check for common vulnerabilities
https://observatory.mozilla.org
# Target Score: 90+
Content Security Policy Customization:
# Allow inline styles (required for many sites)
style-src 'self' 'unsafe-inline';
# Allow external fonts (Google Fonts, etc.)
font-src 'self' https://fonts.gstatic.com;
# Allow external scripts (CDN resources)
script-src 'self' https://cdn.example.com;
# Allow images from data URIs and external domains
img-src 'self' data: https:;
# Report CSP violations to monitoring endpoint
report-uri https://yourdomain.com/csp-report;
Common Issues and Fixes:
- CSP blocking inline styles: Add 'unsafe-inline' to style-src or move styles to external CSS files
- Mixed content warnings: Ensure all resources (images, scripts, CSS) load via HTTPS
- HSTS preventing HTTP access: Clear browser HSTS cache or wait for max-age expiration
- .htaccess not applying: Verify mod_headers and mod_rewrite are enabled in Apache configuration
- Headers not showing in curl: Check that .htaccess is in correct directory and has proper file permissions
Network Forensics & Analysis
IRC Protocol Analysis and Traffic Reconstruction
Problem
Internet Relay Chat (IRC) remains a communication protocol used in various contexts including legitimate technical communities and potential security incidents. When investigating network traffic, analysts need to extract and reconstruct IRC conversations from packet captures to understand communication patterns, identify participants, and analyze message content. Raw packet captures contain fragmented TCP streams that must be reassembled and parsed according to IRC protocol specifications. Without proper analysis techniques, critical communication details remain hidden in binary network data.
Solution
Use Wireshark's protocol analysis capabilities to filter, follow, and extract IRC traffic from packet captures. Apply display filters to isolate IRC traffic (typically port 6667 for plaintext, 6697 for SSL), use TCP stream following to reconstruct complete conversations, and parse IRC protocol commands (JOIN, PRIVMSG, NICK, MODE) to identify participants and message content. Extract server information from connection handshakes and MOTD sequences. Use Wireshark's statistics features to identify communication patterns and timeline analysis.
Implementation
Filter IRC Traffic:
# Standard IRC ports
tcp.port == 6667
# SSL/TLS IRC
tcp.port == 6697
# Follow TCP stream for complete conversation
Right-click packet → Follow → TCP Stream
Extract User Information:
# IRC user format: :nickname!username@hostname
# Extract from JOIN, PRIVMSG, NICK commands
# Example IRC protocol messages:
:user!~user@host.example.com JOIN #channel
:user!~user@host.example.com PRIVMSG #channel :message content
:user!~user@host.example.com NICK newnickname
Python Script for IRC Log Parsing:
import re

def extract_irc_users(log_content):
    """Extract unique usernames from IRC log"""
    users = set()
    # Match IRC user format: :nickname!username@hostname
    pattern = r'^:([^!]+)!'
    for line in log_content.split('\n'):
        match = re.match(pattern, line)
        if match:
            users.add(match.group(1))
    return sorted(users)

def parse_irc_messages(log_content):
    """Parse PRIVMSG commands from IRC log"""
    messages = []
    for line in log_content.split('\n'):
        if 'PRIVMSG' in line:
            # Extract sender, channel/target, and message
            match = re.match(r':([^!]+)!.*PRIVMSG ([^ ]+) :(.*)', line)
            if match:
                messages.append({
                    'sender': match.group(1),
                    'target': match.group(2),
                    'message': match.group(3)
                })
    return messages
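Example usage of the two helpers above, reading a saved "Follow TCP Stream" export (the filename is a placeholder):
# Summarize a saved TCP stream export
with open('irc_stream.txt', 'r', errors='ignore') as f:
    log = f.read()
print('Participants:', extract_irc_users(log))
for msg in parse_irc_messages(log):
    print(f"{msg['sender']} -> {msg['target']}: {msg['message']}")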
Identify IRC Server:
# Look for server identification in numeric replies:
# 001 RPL_WELCOME - Welcome message with server name
# 004 RPL_MYINFO - Server name, version, user/channel modes
# 375-376 - MOTD (Message of the Day)
# Wireshark filter for connection handshake:
irc and (frame contains "001" or frame contains "004")
Encrypted Traffic Analysis and Decryption
Problem
Network traffic often contains encrypted messages using PGP/GPG, SSL/TLS, or other cryptographic protocols. During security investigations or forensic analysis, encrypted content must be identified, extracted, and potentially decrypted to understand communication content. PGP-encrypted messages appear in network streams as Base64-encoded blocks that require proper key material for decryption. Without identifying encryption methods and locating corresponding keys, encrypted communications remain opaque to analysis.
Solution
Identify encrypted content by recognizing PGP message headers, certificate exchanges, and encryption protocol patterns in packet captures. Extract encrypted message blocks from network streams or file transfers. Use GPG/OpenPGP tools to decrypt messages when private keys are available. For SSL/TLS traffic, use session keys or private certificates to decrypt HTTPS communications in Wireshark. Analyze certificate chains to understand trust relationships and identify certificate authorities.
Implementation
Identify PGP Encrypted Content:
# PGP message markers in traffic:
-----BEGIN PGP MESSAGE-----
-----BEGIN PGP PRIVATE KEY BLOCK-----
-----BEGIN PGP PUBLIC KEY BLOCK-----
# Wireshark search for PGP content:
frame contains "BEGIN PGP"
# Extract from HTTP, FTP, or other protocols
http.file_data contains "PGP MESSAGE"
Extract and Decrypt PGP Messages:
# Save encrypted message to file
echo "-----BEGIN PGP MESSAGE-----
...encrypted content...
-----END PGP MESSAGE-----" > encrypted.asc
# Import private key
gpg --import private_key.asc
# Decrypt message
gpg --decrypt encrypted.asc > decrypted.txt
# Or using Python
import gnupg
gpg = gnupg.GPG()
gpg.import_keys(key_data)
decrypted = gpg.decrypt(encrypted_data)
print(decrypted.data.decode('utf-8'))
SSL/TLS Decryption in Wireshark:
# Using pre-master secret (modern approach):
# 1. Set SSLKEYLOGFILE environment variable before running browser
# 2. In Wireshark: Edit → Preferences → Protocols → TLS
# 3. Set "(Pre)-Master-Secret log filename" to the key log file
# Using RSA private key (legacy):
# Edit → Preferences → Protocols → TLS → RSA keys list
# Add: IP, Port, Protocol, Key file path
Certificate Chain Analysis:
# View TLS handshake details:
tls.handshake.type == 11 # Certificate message
# Extract certificate from packet:
# Expand: Transport Layer Security → Handshake Protocol → Certificate
# Right-click → Export Packet Bytes
# Analyze certificate with OpenSSL:
openssl x509 -in cert.der -inform DER -text -noout
# Shows: Issuer, Subject, Validity dates, Key usage, etc.
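When the server is still reachable, the certificate it currently presents can also be pulled with Python's standard ssl module and compared against what appears in the capture; the hostname is a placeholder:
import socket
import ssl

host, port = 'www.example.com', 443
context = ssl.create_default_context()
with socket.create_connection((host, port), timeout=10) as sock:
    with context.wrap_socket(sock, server_hostname=host) as tls:
        cert = tls.getpeercert()  # parsed dict; chain validation already performed
        print('Subject:', cert.get('subject'))
        print('Issuer:', cert.get('issuer'))
        print('Valid until:', cert.get('notAfter'))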
DNS Analysis and Domain Investigation
Problem
DNS queries and responses contain critical infrastructure information including domain-to-IP mappings, nameserver configurations, and service discovery details. During network investigations, analysts must identify what domains were queried, what IP addresses were returned, and what nameservers are authoritative for specific domains. DNS traffic analysis reveals communication patterns, identifies malicious domains, and maps network infrastructure. Without proper DNS analysis, understanding of network topology and external connections remains incomplete.
Solution
Use Wireshark to filter and analyze DNS traffic, examining both queries and responses. Extract A records (IPv4 addresses), AAAA records (IPv6 addresses), NS records (nameservers), SOA records (zone authority), and other DNS record types. Correlate DNS responses with subsequent network connections to verify IP address usage. Use external tools (dig, nslookup) to verify current DNS configurations and compare with captured traffic. Identify DNS anomalies including fast-flux domains, DGA patterns, or DNS tunneling attempts.
Implementation
Filter DNS Traffic:
# All DNS traffic
dns
# DNS queries only
dns.flags.response == 0
# DNS responses only
dns.flags.response == 1
# Query for specific domain
dns.qry.name contains "example.com"
# Specific record types
dns.qry.type == 1 # A record (IPv4)
dns.qry.type == 2 # NS record (nameserver)
dns.qry.type == 6 # SOA record (Start of Authority)
dns.qry.type == 28 # AAAA record (IPv6)
Extract Domain-to-IP Mappings:
# Wireshark Statistics → Resolved Addresses
# Shows all DNS name resolutions in capture
# Export DNS queries and responses:
tshark -r capture.pcap -Y "dns" -T fields \
-e dns.qry.name -e dns.a -e dns.aaaa \
| sort -u
# Output format: domain, IPv4, IPv6
Identify Authoritative Nameservers:
# Filter for SOA records in responses:
dns.resp.type == 6
# SOA record contains:
# - Primary nameserver (MNAME)
# - Responsible party email
# - Serial number, refresh, retry, expire timers
# Extract primary nameserver from packet details:
# Domain Name System → Answers → Start of Authority
# Primary name server: ns1.example.com
DNS Query Analysis Script:
#!/usr/bin/env python3
import pyshark

def analyze_dns_traffic(pcap_file):
    """Extract DNS queries and responses from pcap"""
    capture = pyshark.FileCapture(
        pcap_file,
        display_filter='dns'
    )
    dns_records = []
    for packet in capture:
        if hasattr(packet, 'dns'):
            dns = packet.dns
            # Extract query information
            if hasattr(dns, 'qry_name'):
                query = {
                    'query': dns.qry_name,
                    'type': dns.qry_type if hasattr(dns, 'qry_type') else None,
                    'response_ip': dns.a if hasattr(dns, 'a') else None,
                    'timestamp': packet.sniff_time
                }
                dns_records.append(query)
    capture.close()
    return dns_records

# Usage
records = analyze_dns_traffic('capture.pcap')
for record in records:
    print(f"{record['query']} → {record['response_ip']}")
Verify DNS Configuration with External Tools:
# Query A record
dig example.com A
# Query specific nameserver
dig @8.8.8.8 example.com
# Query SOA record for authoritative server
dig example.com SOA
# Trace DNS resolution path
dig +trace example.com
# Check all record types
dig example.com ANY
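To cross-check a captured answer against what the analysis host resolves today, a minimal sketch using Python's standard socket module (the captured IP below is a placeholder noted from the pcap):
import socket

def resolve(domain):
    """Return the unique addresses the local resolver currently returns for a domain."""
    infos = socket.getaddrinfo(domain, None)
    return sorted({info[4][0] for info in infos})

captured_ip = '192.0.2.10'  # placeholder value noted from the capture
current_ips = resolve('example.com')
print('Current answers:', current_ips)
print('Captured IP still resolves:', captured_ip in current_ips)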
Network Connection Pattern Analysis
Problem
Understanding which hosts communicated with each other, traffic volumes, and connection patterns is essential for network forensics. Raw packet captures contain thousands of individual packets that must be aggregated into meaningful conversations and statistics. Identifying high-volume connections, unusual traffic patterns, or connections to suspicious IP addresses requires analysis tools that summarize packet-level data into connection-level insights. Manual packet inspection is impractical for captures containing millions of packets.
Solution
Use Wireshark's Statistics features to aggregate packet data into conversation summaries, protocol hierarchies, and endpoint statistics. Analyze conversations to identify source-destination pairs, traffic volumes, and connection durations. Use protocol hierarchy statistics to understand traffic composition. Export conversation data to CSV for additional analysis in spreadsheet tools or scripts. Apply GeoIP databases to identify geographic locations of IP addresses. Correlate connection patterns with timestamps to build activity timelines.
Implementation
View Network Conversations:
# Wireshark: Statistics → Conversations
# Shows all source-destination pairs with:
# - Packet count
# - Byte count
# - Duration
# - Relative start time
# Tabs available:
# - Ethernet (MAC addresses)
# - IPv4 (IP addresses)
# - IPv6
# - TCP (includes ports)
# - UDP (includes ports)
# Export to CSV for analysis:
# Copy button → Paste into spreadsheet or text file
Identify High-Traffic Connections:
# Filter by IP address after identifying in conversations:
ip.addr == 192.168.1.100
# Filter specific conversation:
ip.addr == 192.168.1.100 && ip.addr == 10.0.0.1
# Sort conversations by:
# - Packets (click Packets column header)
# - Bytes (click Bytes column header)
# - Duration (click Duration column header)
Protocol Distribution Analysis:
# Wireshark: Statistics → Protocol Hierarchy
# Shows percentage breakdown of protocols:
# - Ethernet
# - IPv4
# - TCP (80% of traffic)
# - HTTP
# - HTTPS
# - UDP (20% of traffic)
# - DNS
# - DHCP
# Identify unusual protocols or unexpected traffic
Extract IP Addresses for Bulk Analysis:
# Export unique destination IPs from specific source:
tshark -r capture.pcap \
-Y "ip.src == 10.0.0.7" \
-T fields -e ip.dst \
| sort -u > destination_ips.txt
# Bulk GeoIP lookup (using online service or local database)
# Identify country, city, organization for each IP
# Example Python script for bulk IP analysis:
import requests

def geolocate_ips(ip_list):
    """Get geographic location for list of IPs"""
    results = []
    for ip in ip_list:
        # Using ipapi.co as example (rate limited)
        response = requests.get(f'https://ipapi.co/{ip}/json/')
        if response.status_code == 200:
            data = response.json()
            results.append({
                'ip': ip,
                'country': data.get('country_name'),
                'city': data.get('city'),
                'org': data.get('org')
            })
    return results
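Example usage with the destination_ips.txt file produced by the tshark command above; mind the lookup service's rate limits:
# Load the exported IP list and geolocate each address
with open('destination_ips.txt') as f:
    ips = [line.strip() for line in f if line.strip()]
for entry in geolocate_ips(ips):
    print(f"{entry['ip']}: {entry['country']}, {entry['city']} ({entry['org']})")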
Timeline Analysis:
# Statistics → I/O Graph
# Visualize traffic over time
# X-axis: Time
# Y-axis: Packets, bytes, or advanced metrics
# Create filters for different traffic types:
# Graph 1: tcp (TCP traffic - blue)
# Graph 2: udp (UDP traffic - red)
# Graph 3: dns (DNS traffic - green)
# Identify traffic spikes, patterns, or anomalies
Endpoint Statistics:
# Statistics → Endpoints → IPv4
# Lists all IP addresses in capture with:
# - Total packets sent/received
# - Total bytes sent/received
# - Geographic location (if GeoIP database configured)
# Identify most active hosts
# Sort by Packets or Bytes columns
File Extraction from Network Traffic
Problem
Network traffic often includes file transfers via HTTP, FTP, SMB, or other protocols. During forensic investigations, analysts need to extract these files from packet captures to analyze their content, identify malware, or reconstruct incident timelines. Files are fragmented across multiple packets and protocols, requiring reassembly of TCP streams and protocol-specific parsing. Binary file data mixed with protocol headers must be separated and reconstructed into usable files.
Solution
Use Wireshark's file export capabilities to extract transferred files from HTTP, SMB, and other protocols. Follow TCP streams to manually extract file content from FTP data connections. Use Wireshark's "Export Objects" feature for automatic file extraction from HTTP traffic. For encrypted transfers, decrypt the traffic first before attempting file extraction. Verify extracted files using hash values and file headers. Use protocol-specific tools (like NetworkMiner) for automated file extraction across multiple protocols.
Implementation
Extract HTTP Objects:
# File → Export Objects → HTTP
# Shows all HTTP file transfers:
# - Hostname
# - Content Type
# - Size
# - Filename
# Select files to save
# Click "Save" or "Save All"
# Filter exported objects:
# - Text filter box to search by filename
# - Content Type column to filter by MIME type
Extract Files from FTP:
# FTP uses separate control (port 21) and data connections
# Control connection shows commands:
ftp
# Find RETR (retrieve) or STOR (store) commands
# Note the filename being transferred
# Filter for FTP data:
ftp-data
# Follow TCP Stream of data connection
# Save raw data (Show and save data as: Raw)
# Save to file with appropriate extension
Extract Files from SMB/CIFS:
# File → Export Objects → SMB
# Shows files transferred via Windows file sharing
# Protocol hierarchy shows SMB traffic:
# Statistics → Protocol Hierarchy → SMB2/SMB
Verify Extracted Files:
# Calculate hash of extracted file
sha256sum extracted_file.bin
# Check file type (Linux/Mac)
file extracted_file.bin
# Check for malware (if suspicious)
# Upload to VirusTotal or scan with antivirus
# Examine file headers
hexdump -C extracted_file.bin | head -20
# Common file signatures:
# PDF: %PDF
# PNG: 89 50 4E 47
# ZIP: 50 4B 03 04
# JPEG: FF D8 FF
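The hash and signature checks can be automated for a folder of extracted objects. A small Python sketch whose signature table mirrors the list above (the folder name is a placeholder):
import hashlib
from pathlib import Path

SIGNATURES = {
    b'%PDF': 'PDF',
    b'\x89PNG': 'PNG',
    b'PK\x03\x04': 'ZIP',
    b'\xff\xd8\xff': 'JPEG',
}

def identify(path):
    """Return (sha256, guessed type) based on the file's magic bytes."""
    data = Path(path).read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return digest, name
    return digest, 'unknown'

for item in Path('exported_objects').iterdir():  # placeholder folder of extracted files
    if item.is_file():
        sha, kind = identify(item)
        print(f'{item.name}: {kind} sha256={sha[:16]}...')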
Database & Development
SQL Metadata Queries for Dynamic Schema Discovery
Problem
Applications often need to discover database schema information programmatically for dynamic form generation, data validation, or automated documentation. Hard-coding table and column names throughout application code creates maintenance overhead when schema changes occur. Without metadata queries, developers manually track column data types, constraints, and relationships, leading to errors when schema evolves. Applications cannot adapt to schema changes without code modifications. Dynamic report builders, data import tools, and admin interfaces require runtime access to table structures but lack mechanisms to query database metadata efficiently.
Solution
Use Oracle data dictionary views (USER_TAB_COLUMNS, ALL_TAB_COLUMNS, USER_CONSTRAINTS, USER_INDEXES) to query schema metadata programmatically. Build reusable PL/SQL functions that return table structures, column definitions, and constraint information. Create dynamic SQL generators that adapt to schema changes automatically. Implement column data type validation using metadata rather than hard-coded type checks. Use metadata queries to generate interactive reports, form fields, and API documentation. Cache frequently accessed metadata to minimize dictionary query overhead while refreshing when DDL changes occur.
Implementation
Query Table Columns:
-- Get all columns for a table with data types
SELECT
column_name,
data_type,
data_length,
data_precision,
data_scale,
nullable,
data_default
FROM user_tab_columns
WHERE table_name = 'MY_TABLE'
ORDER BY column_id;
-- Get columns across schemas
SELECT
owner,
table_name,
column_name,
data_type
FROM all_tab_columns
WHERE table_name = 'EMPLOYEES'
AND owner IN ('HR_SCHEMA', 'PAYROLL_SCHEMA');
Dynamic Column List Function:
CREATE OR REPLACE FUNCTION get_table_columns (
p_table_name IN VARCHAR2,
p_delimiter IN VARCHAR2 DEFAULT ','
) RETURN VARCHAR2
IS
l_columns VARCHAR2(4000);
BEGIN
SELECT LISTAGG(column_name, p_delimiter)
WITHIN GROUP (ORDER BY column_id)
INTO l_columns
FROM user_tab_columns
WHERE table_name = UPPER(p_table_name);
RETURN l_columns;
END get_table_columns;
-- Usage: Generate and open a SELECT statement dynamically
DECLARE
l_sql VARCHAR2(4000);
l_cursor SYS_REFCURSOR;
BEGIN
l_sql := 'SELECT ' || get_table_columns('USERS') ||
' FROM users WHERE status = ''ACTIVE''';
OPEN l_cursor FOR l_sql;
-- Fetch and process rows as needed, then close the cursor
CLOSE l_cursor;
END;
Column Data Type Validation:
CREATE OR REPLACE FUNCTION validate_column_type (
p_table_name IN VARCHAR2,
p_column_name IN VARCHAR2,
p_expected_type IN VARCHAR2
) RETURN BOOLEAN
IS
l_actual_type VARCHAR2(106);
BEGIN
SELECT data_type INTO l_actual_type
FROM user_tab_columns
WHERE table_name = UPPER(p_table_name)
AND column_name = UPPER(p_column_name);
RETURN UPPER(l_actual_type) = UPPER(p_expected_type);
EXCEPTION
WHEN NO_DATA_FOUND THEN
RETURN FALSE;
END validate_column_type;
-- Usage in dynamic form validation
IF NOT validate_column_type('RECORDS', 'START_DATE', 'DATE') THEN
RAISE_APPLICATION_ERROR(-20001, 'Invalid column type for START_DATE');
END IF;
Query Primary Key Constraints:
-- Find primary key columns for a table
SELECT
acc.column_name,
acc.position
FROM all_constraints ac
JOIN all_cons_columns acc ON ac.constraint_name = acc.constraint_name
WHERE ac.table_name = 'USER_OPERATIONS'
AND ac.constraint_type = 'P'
ORDER BY acc.position;
Query Foreign Key Relationships:
-- Find all foreign keys referencing a table
SELECT
ac.constraint_name,
ac.table_name as child_table,
acc.column_name as child_column,
ac_pk.table_name as parent_table,
acc_pk.column_name as parent_column
FROM all_constraints ac
JOIN all_cons_columns acc ON ac.constraint_name = acc.constraint_name
JOIN all_constraints ac_pk ON ac.r_constraint_name = ac_pk.constraint_name
JOIN all_cons_columns acc_pk ON ac_pk.constraint_name = acc_pk.constraint_name
WHERE ac_pk.table_name = 'MAIN_RECORDS'
AND ac.constraint_type = 'R';
Generate CREATE TABLE Statement:
CREATE OR REPLACE FUNCTION generate_create_table (
p_table_name IN VARCHAR2
) RETURN CLOB
IS
l_sql CLOB;
BEGIN
l_sql := 'CREATE TABLE ' || p_table_name || ' (' || CHR(10);
FOR col IN (
SELECT
column_name,
data_type,
data_length,
data_precision,
data_scale,
nullable,
column_id
FROM user_tab_columns
WHERE table_name = UPPER(p_table_name)
ORDER BY column_id
) LOOP
IF col.column_id > 1 THEN
l_sql := l_sql || ',' || CHR(10);
END IF;
l_sql := l_sql || ' ' || col.column_name || ' ';
-- Add data type
IF col.data_type IN ('VARCHAR2', 'CHAR') THEN
l_sql := l_sql || col.data_type || '(' || col.data_length || ')';
ELSIF col.data_type = 'NUMBER' AND col.data_precision IS NOT NULL THEN
l_sql := l_sql || col.data_type || '(' || col.data_precision;
IF col.data_scale IS NOT NULL AND col.data_scale > 0 THEN
l_sql := l_sql || ',' || col.data_scale;
END IF;
l_sql := l_sql || ')';
ELSE
l_sql := l_sql || col.data_type;
END IF;
-- Add nullable constraint
IF col.nullable = 'N' THEN
l_sql := l_sql || ' NOT NULL';
END IF;
END LOOP;
l_sql := l_sql || CHR(10) || ');';
RETURN l_sql;
END generate_create_table;
Dynamic APEX LOV Query:
-- Create list of values dynamically from any table
CREATE OR REPLACE FUNCTION get_lov_query (
p_table_name IN VARCHAR2,
p_display_column IN VARCHAR2,
p_return_column IN VARCHAR2
) RETURN VARCHAR2
IS
BEGIN
RETURN 'SELECT ' || p_display_column || ' d, ' ||
p_return_column || ' r FROM ' || p_table_name ||
' ORDER BY ' || p_display_column;
END get_lov_query;
-- Use in APEX LOV definition
-- Type: PL/SQL Function Body returning SQL Query
-- Function Body: RETURN get_lov_query('DEPARTMENTS', 'DEPT_NAME', 'DEPT_ID');
Metadata Cache Table:
CREATE TABLE metadata_cache (
table_name VARCHAR2(128),
column_name VARCHAR2(128),
data_type VARCHAR2(106),
is_nullable VARCHAR2(1),
cached_date TIMESTAMP DEFAULT SYSTIMESTAMP,
PRIMARY KEY (table_name, column_name)
);
-- Refresh procedure
CREATE OR REPLACE PROCEDURE refresh_metadata_cache
IS
BEGIN
DELETE FROM metadata_cache;
INSERT INTO metadata_cache (
table_name, column_name, data_type, is_nullable
)
SELECT
table_name,
column_name,
data_type,
nullable
FROM user_tab_columns;
COMMIT;
END refresh_metadata_cache;
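The same dictionary views can be read from application code outside the database, for example to feed an external documentation or code-generation tool. A sketch using the python-oracledb driver; the connection details and table name are placeholders:
import oracledb

# Thin-mode connection; user, password, and DSN are placeholders
conn = oracledb.connect(user='app_user', password='app_password', dsn='dbhost/ORCLPDB1')
with conn.cursor() as cur:
    cur.execute(
        """SELECT column_name, data_type, nullable
           FROM user_tab_columns
           WHERE table_name = :t
           ORDER BY column_id""",
        {'t': 'MY_TABLE'},
    )
    for column_name, data_type, nullable in cur:
        print(f'{column_name}: {data_type} (nullable={nullable})')
conn.close()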
Migration & DevOps Automation
Move Files Exceeding Path Length Limit (Version Control Migration)
Problem
Windows imposes a 260-character MAX_PATH limit on full file paths, and Git tooling on Windows inherits it. During version control system migrations, files with deeply nested directory structures or long filenames exceed this limit, causing checkout and push failures on Windows clients and blocking repository migration. Manual identification and reorganization of these files is time-consuming and error-prone across repositories containing thousands of files. Without automated path length detection, migration projects stall on edge cases discovered late in the process.
Solution
Create a PowerShell script that recursively scans the repository, identifies files exceeding the 260-character path limit, and moves them to a designated folder while preserving their original directory structure. This isolates problematic files for manual review and restructuring without blocking the main migration workflow. The script provides detailed logging of moved files including their original path lengths. After migration, relocated files can be restructured with shorter paths and reintegrated into the repository.
Implementation
# Move files with long paths while preserving directory structure
# Windows MAX_PATH limit: 260 characters for the full path
param(
[string]$SourcePath = ".",
[string]$DestPath = ".\LongPaths",
[int]$MaxPathLength = 260
)
# Get absolute source path for relative path calculation
$sourceFullPath = (Resolve-Path $SourcePath).Path
# Find and move files exceeding path limit
Get-ChildItem -Path $SourcePath -Recurse -File | Where-Object {
$_.FullName.Length -gt $MaxPathLength
} | ForEach-Object {
$file = $_
$relativePath = $file.FullName.Substring($sourceFullPath.Length + 1)
$destFile = Join-Path $DestPath $relativePath
$destDir = Split-Path $destFile -Parent
# Create destination directory structure
if (-not (Test-Path $destDir)) {
New-Item -ItemType Directory -Path $destDir -Force | Out-Null
}
Write-Host "Moving: $relativePath (Length: $($file.FullName.Length))"
Move-Item -Path $file.FullName -Destination $destFile -Force
}
Write-Host "Migration complete. Long-path files moved to: $DestPath"
Usage:
# Run from repository root
.\Move-LongPaths.ps1
# Custom parameters
.\Move-LongPaths.ps1 -SourcePath "C:\Projects\MyApp" -DestPath "C:\LongPaths" -MaxPathLength 250
Remove Archive Extensions from Filenames
Problem
Legacy version control systems sometimes append archive suffixes to file revisions. When migrating to Git, these archive suffixes persist in filenames, creating confusion about which files are current versus historical. The archive suffix serves no purpose in Git's version control model and clutters the repository namespace. Manually renaming hundreds or thousands of files across multiple repositories is impractical and introduces human error risk.
Solution
Implement a PowerShell script that recursively identifies all files with archive suffixes in their names and removes these suffixes while preserving the rest of the filename and extension. The script operates on the file system before Git commits, ensuring clean filenames enter the repository from the initial import. Logging output shows each renamed file for audit purposes. The script handles edge cases including files with multiple hyphens and validates that removals don't create filename collisions.
Implementation
# Remove archive suffix from filenames
# Handles files like "document-arc.txt" -> "document.txt"
param(
[string]$Path = ".",
[string]$Suffix = "-arc",
[switch]$WhatIf
)
Get-ChildItem -Path $Path -Recurse -File | Where-Object {
$_.BaseName.EndsWith($Suffix)
} | ForEach-Object {
$file = $_
# Strip the suffix from the end of the base name only, keeping the extension intact
$newName = $file.BaseName.Substring(0, $file.BaseName.Length - $Suffix.Length) + $file.Extension
$newPath = Join-Path $file.DirectoryName $newName
# Check for naming collision
if (Test-Path $newPath) {
Write-Warning "Collision detected: $newPath already exists. Skipping $($file.Name)"
} else {
Write-Host "Renaming: $($file.Name) -> $newName"
if (-not $WhatIf) {
Rename-Item -Path $file.FullName -NewName $newName
}
}
}
if ($WhatIf) {
Write-Host "`nDry run complete. Use without -WhatIf to apply changes."
}
Usage:
# Preview changes without modifying files
.\Remove-ArchiveSuffix.ps1 -WhatIf
# Execute renaming
.\Remove-ArchiveSuffix.ps1
# Target specific directory with custom suffix
.\Remove-ArchiveSuffix.ps1 -Path "C:\Projects\Legacy" -Suffix "-old"
Move Large Files for Git LFS Processing
Problem
GitHub repositories have a 100MB file size limit. Files exceeding this limit cause push failures and block repository migration. Large binary files (videos, datasets, compiled artifacts, installers) are common in legacy repositories but incompatible with standard Git workflows. Git Large File Storage (LFS) is required for files over 100MB, but identifying which files need LFS treatment across thousands of repository files is time-intensive. Manual inspection misses edge cases and delays migration timelines.
Solution
Create a PowerShell script that scans the repository recursively, identifies files exceeding 100MB, and moves them to a designated folder for Git LFS configuration or alternate storage strategies. The script preserves directory structure and provides detailed size reporting. After relocation, administrators can evaluate whether files should use Git LFS, remain in external storage, or be excluded from version control. This separates the large file problem from the main migration workflow, preventing 100MB+ files from blocking entire repository imports.
Implementation
# Move files larger than 100MB (GitHub limit)
# Preserves directory structure for Git LFS setup
param(
[string]$SourcePath = ".",
[string]$DestPath = ".\LargeFiles",
[int]$MaxSizeMB = 100
)
$maxSizeBytes = $MaxSizeMB * 1MB
$sourceFullPath = (Resolve-Path $SourcePath).Path
Get-ChildItem -Path $SourcePath -Recurse -File | Where-Object {
$_.Length -gt $maxSizeBytes
} | ForEach-Object {
$file = $_
$relativePath = $file.FullName.Substring($sourceFullPath.Length + 1)
$destFile = Join-Path $DestPath $relativePath
$destDir = Split-Path $destFile -Parent
# Create destination directory structure
if (-not (Test-Path $destDir)) {
New-Item -ItemType Directory -Path $destDir -Force | Out-Null
}
$sizeMB = [math]::Round($file.Length / 1MB, 2)
Write-Host "Moving: $relativePath (Size: $sizeMB MB)"
Move-Item -Path $file.FullName -Destination $destFile -Force
}
Write-Host "`nLarge files moved to: $DestPath"
Write-Host "Configure Git LFS or alternate storage for these files."
Usage:
# Run from repository root
.\Move-LargeFiles.ps1
# Custom size threshold (e.g., 50MB)
.\Move-LargeFiles.ps1 -MaxSizeMB 50
# Custom paths
.\Move-LargeFiles.ps1 -SourcePath "C:\VCS-Export" -DestPath "C:\LFS-Staging"
Post-Migration Git LFS Setup:
# Install Git LFS
git lfs install
# Track large file types
git lfs track "*.iso"
git lfs track "*.mp4"
git lfs track "*.zip"
# Add .gitattributes and large files back to repository
git add .gitattributes
git add path/to/large/files/
git commit -m "Add large files via Git LFS"
GitHub Enterprise Pull Request Template Configuration
Problem
Development teams working with mixed technology stacks require different pull request templates for different types of changes. Standard code changes need basic testing checklists and change descriptions, while specialized application changes require validation covering page modifications, database objects, SQL security, session state management, and deployment procedures. Using a single generic template forces developers to delete irrelevant sections or ignore inapplicable checklist items, reducing template effectiveness and creating inconsistent PR documentation. GitHub Enterprise supports multiple PR templates through a dropdown selector, but this feature requires specific directory structure and may not function correctly in all GHE versions or configurations.
Solution
Implement a combined pull request template that includes conditional sections for both standard code changes and specialized application testing. Structure the template with clear section headers and instructions directing developers to delete the section they're not using. This approach works universally across GitHub Enterprise versions without requiring special directory structures or dropdown functionality. The combined template provides comprehensive checklists for both workflow types while maintaining simplicity—developers simply delete the irrelevant section before submitting their PR.
Implementation
Combined Template Structure:
---
# STANDARD PULL REQUEST
## Description
## Type of Change
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that causes existing functionality to not work as expected)
- [ ] Documentation update
- [ ] Performance improvement
- [ ] Code refactoring
- [ ] Dependency update
## Related Issues
## Changes Made
-
-
## Testing
### Test Configuration
- OS:
- Browser (if applicable):
- Database (if applicable):
### Test Cases
- [ ] Unit tests pass
- [ ] Integration tests pass
- [ ] Manual testing completed
## Checklist
- [ ] Code follows project style guidelines
- [ ] Self-review completed
- [ ] Comments added to complex code sections
- [ ] Documentation updated
- [ ] No new warnings generated
- [ ] Tests added/updated for changes
- [ ] All tests pass locally
---
# APPLICATION TESTING
## Application Details
**Application ID:**
**Application Name:**
**Target Environment:**
**Version:**
## Description
## Type of Change
- [ ] New page/region
- [ ] Page item modification
- [ ] Dynamic action change
- [ ] SQL/PLSQL code update
- [ ] Database object change (table, view, package, procedure)
- [ ] Authorization/authentication modification
- [ ] Bug fix
- [ ] Performance optimization
- [ ] Report/Grid modification
- [ ] Process/Computation change
- [ ] Validation/Error handling
## Pages Modified
| Page ID | Page Name | Changes |
|---------|-----------|---------|
| | | |
## Database Objects Modified
- [ ] Tables
- [ ] Views
- [ ] Packages
- [ ] Procedures/Functions
- [ ] Triggers
- [ ] Sequences
### DDL/DML Scripts Required
- [ ] Yes (include migration script path/details)
- [ ] No
**Script Location:**
## Security Impact
- [ ] Authorization scheme changes
- [ ] Session state modifications
- [ ] Authentication changes
- [ ] Page access control
- [ ] Item-level security
## Testing Performed
### Functional Testing
- [ ] All page items render correctly
- [ ] Dynamic actions fire as expected
- [ ] Validations trigger appropriately
- [ ] Processes execute successfully
- [ ] Branches navigate correctly
- [ ] Modal dialogs function properly
- [ ] Interactive Grid/Report operations work
- [ ] File uploads/downloads (if applicable)
### Data Testing
- [ ] Insert operations
- [ ] Update operations
- [ ] Delete operations
- [ ] Query performance acceptable
- [ ] Data validation rules enforced
- [ ] Foreign key constraints respected
### Cross-Browser Testing
- [ ] Chrome
- [ ] Firefox
- [ ] Edge
- [ ] Safari (if applicable)
### Error Handling
- [ ] Invalid input handled gracefully
- [ ] Error messages display correctly
- [ ] Exception handling tested
- [ ] Constraint violations caught
## SQL/PLSQL Code Review
- [ ] Bind variables used (no SQL injection risk)
- [ ] Proper exception handling
- [ ] No hard-coded values
- [ ] Efficient queries (explain plan reviewed if complex)
- [ ] Proper commit/rollback handling
- [ ] No SELECT * usage
## Performance Considerations
## Dependencies
- [ ] New plugins required
- [ ] External JavaScript libraries
- [ ] New database objects
- [ ] Supporting Objects installation needed
- [ ] Configuration changes required
## Rollback Plan
## Deployment Notes
- [ ] Run pre-deployment scripts
- [ ] Import application (replace/upgrade)
- [ ] Run post-deployment scripts
- [ ] Clear cache
- [ ] Test in target environment
## Checklist
- [ ] Code reviewed for security vulnerabilities
- [ ] Session state properly managed
- [ ] No exposed SQL injection vectors
- [ ] Authorization checks on all protected pages
- [ ] Error messages don't expose sensitive information
- [ ] All substitution strings properly escaped
- [ ] Region/item conditions verified
- [ ] Application settings documented
---
## Screenshots
## Additional Notes
## Reviewer Notes
File Location:
.github/
└── pull_request_template.md
Why Combined Template Works:
- No special directory structure required (PULL_REQUEST_TEMPLATE/ folder unnecessary)
- Compatible with all GitHub Enterprise versions
- Works with GitHub Desktop workflow where developers may not have Git CLI installed
- Simple developer experience: open PR, delete irrelevant section, fill out applicable checklist
- Maintains comprehensive coverage for both standard and application-specific validation requirements
Alternative: Multiple Template Directory:
# Directory structure for dropdown selector
.github/
└── PULL_REQUEST_TEMPLATE/
├── standard.md
└── application_testing.md
# This approach shows a template dropdown in the PR creation UI
# However, a single root-level template (pull_request_template.md) takes precedence if both exist
# Dropdown feature requires GHE 2.19+ and may not render correctly in all cases
# If dropdown doesn't appear despite correct configuration, use combined template approach instead
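If the dropdown does not appear, individual templates in the PULL_REQUEST_TEMPLATE/ directory can usually still be opened explicitly with the template query parameter on the compare URL (the hostname, org, repo, and branch names below are placeholders):
# Pre-fill a PR with a specific template via URL query parameters
https://github.example.com/org/repo/compare/main...feature-branch?quick_pull=1&template=application_testing.md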
Troubleshooting Template Selection:
- Verify no pull_request_template.md file exists in .github/ root when using PULL_REQUEST_TEMPLATE/ directory
- Check repository default branch—templates only load from default branch
- Clear browser cache (Ctrl+F5) when testing template changes
- Confirm GitHub Enterprise version supports multiple templates (2.19+)
- Check organization-level .github repository isn't overriding repo-level templates
Password Recovery & Cryptography
Dictionary-Based Password Cracking with John the Ripper
Problem
Security assessments and forensic investigations often require recovering passwords from hash files to access encrypted systems, validate password policies, or understand security postures. Manual password guessing is impractical for strong passwords. Brute force attacks take exponential time as password complexity increases. Without efficient password recovery tools, analysts cannot assess whether passwords meet security requirements or recover access to critical systems during incident response. Password hashes from various systems (Unix shadow files, Windows SAM, application databases) require format-specific cracking approaches.
Solution
Use John the Ripper with dictionary-based attacks leveraging wordlists like rockyou.txt to efficiently test millions of common passwords against hash files. Dictionary attacks prioritize likely passwords based on real-world password usage patterns, dramatically reducing time to crack weak passwords compared to brute force. John the Ripper supports multiple hash formats and can auto-detect hash types. The tool can process single hash files or entire directories of hashes simultaneously. Results are cached, allowing recovery of previously cracked passwords without re-running attacks. This approach balances effectiveness against weak passwords with computational efficiency.
Implementation
Basic Dictionary Attack:
# Single hash file
john --wordlist=/usr/share/wordlists/rockyou.txt hash_file.txt
# Multiple hash files in directory
john --wordlist=/usr/share/wordlists/rockyou.txt /path/to/hash_folder/*
# Specific hash format
john --wordlist=/usr/share/wordlists/rockyou.txt --format=Raw-MD5 hash_file.txt
# Common hash formats:
# Raw-MD5, Raw-SHA1, Raw-SHA256, Raw-SHA512
# NT (Windows NTLM), LM (Windows LAN Manager)
# bcrypt, scrypt, PBKDF2
# descrypt (traditional Unix crypt)
View Cracked Passwords:
# Show all cracked passwords from previous runs
john --show hash_file.txt
# Show with specific format
john --show --format=Raw-MD5 hash_file.txt
# Output format: username:password:uid:gid:gecos:home:shell
Wordlist Locations:
# Kali Linux / Debian-based
/usr/share/wordlists/rockyou.txt
/usr/share/wordlists/rockyou.txt.gz # Compressed version
# Extract compressed wordlist
gunzip /usr/share/wordlists/rockyou.txt.gz
# Other common wordlists
/usr/share/wordlists/metasploit/
/usr/share/seclists/Passwords/
Processing Multiple Files:
# Crack all hash files in directory
john --wordlist=/usr/share/wordlists/rockyou.txt hash1.txt hash2.txt hash3.txt
# Using wildcard
john --wordlist=/usr/share/wordlists/rockyou.txt /evidence/*.hash
# Recursive directory processing
find /path/to/hashes -type f -name "*.txt" -exec john --wordlist=/usr/share/wordlists/rockyou.txt {} \;
Performance Optimization:
# Use multiple cores
john --wordlist=/usr/share/wordlists/rockyou.txt --fork=4 hash_file.txt
# Session management for long-running cracks
john --wordlist=/usr/share/wordlists/rockyou.txt --session=my_session hash_file.txt
# Restore interrupted session
john --restore=my_session
# Show status of running session
john --status=my_session
Hash Format Detection:
# Let John detect hash format automatically
john --wordlist=/usr/share/wordlists/rockyou.txt hash_file.txt
# List supported formats
john --list=formats
# List subformats for specific type
john --list=subformats
# Review details for formats matching a hash family (here, MD5 variants)
john --list=format-details | grep -i md5
Advanced Wordlist Techniques:
# Use rules to generate variations
john --wordlist=/usr/share/wordlists/rockyou.txt --rules hash_file.txt
# Combine multiple wordlists
cat wordlist1.txt wordlist2.txt > combined.txt
john --wordlist=combined.txt hash_file.txt
# Incremental mode (brute force with character sets)
john --incremental hash_file.txt
# Incremental with specific charset
john --incremental=Digits hash_file.txt # Numbers only
john --incremental=Alpha hash_file.txt # Letters only
Output and Reporting:
# Save cracked passwords to file
john --show hash_file.txt > cracked_passwords.txt
# Show only usernames and passwords
john --show --format=Raw-MD5 hash_file.txt | cut -d: -f1,2
# Count cracked vs total (--show appends a summary line, so grep for ':' instead of counting every line)
john --show hash_file.txt | grep -c ':' # Cracked count
grep -c ':' hash_file.txt # Total hash count (assumes username:hash lines)
Common Hash File Formats:
# Unix /etc/shadow format
username:$6$salt$hash:18000:0:99999:7:::
# Simple hash list (one per line)
5f4dcc3b5aa765d61d8327deb882cf99
098f6bcd4621d373cade4e832627b4f6
# Username:hash pairs
admin:5f4dcc3b5aa765d61d8327deb882cf99
user:098f6bcd4621d373cade4e832627b4f6
# NTLM hashes (Windows)
Administrator:500:aad3b435b51404eeaad3b435b51404ee:31d6cfe0d16ae931b73c59d7e0c089c0:::
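When starting from raw /etc/passwd and /etc/shadow files rather than a prepared hash list, the unshadow utility bundled with John merges them into the username:hash layout shown above:
# Combine passwd and shadow into a John-compatible file (needs read access to both)
unshadow /etc/passwd /etc/shadow > unshadowed.txt
john --wordlist=/usr/share/wordlists/rockyou.txt unshadowed.txt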
Best Practices:
- Always specify hash format when known to avoid misdetection and improve performance
- Use session names for long-running cracks to enable pause/resume
- Check the john.pot file (~/.john/john.pot) for previously cracked passwords before re-running (see the sketch after this list)
- Combine dictionary attacks with rules for better coverage of password variations
- Monitor system resources—large wordlists consume significant RAM and CPU
- Keep wordlists updated with recently leaked password databases for improved success rates
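As referenced in the john.pot point above, the pot file is plain text, so earlier results can be inspected directly; a small sketch, assuming the default per-user pot file location:
# Inspect previously cracked entries directly (lines are hash:password)
cat "$HOME/.john/john.pot"
# Re-display cracked results for a hash file against an explicit pot file
john --show --pot="$HOME/.john/john.pot" hash_file.txt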
Query Languages & Data Analysis
Network Flow Analysis with Kusto Query Language (KQL)
Problem
Network flow logs contain vast amounts of connection data including source/destination IPs, ports, byte counts, and packet information. Security analysts need to identify top talkers, suspicious connections, unusual traffic patterns, and encoded data hidden in flow metadata. Manual analysis of thousands or millions of flow records is impractical. SQL-based tools lack the aggregation and time-series analysis capabilities needed for effective network forensics. Traditional packet analysis tools focus on individual packets rather than connection-level patterns. Without proper query capabilities, critical security insights remain buried in raw flow log data.
Solution
Use Kusto Query Language (KQL) to analyze network flow logs with powerful aggregation, filtering, and pattern matching capabilities. KQL excels at summarizing large datasets, identifying top values, and extracting encoded data from text fields. Import flow logs into Azure Data Explorer or use local Kusto tools for analysis. Write queries that aggregate traffic by source IP to find top talkers, filter by destination port to identify protocol usage, and decode Base64-encoded packet data to reveal hidden messages. KQL's pipe-based syntax enables building complex analysis workflows by chaining simple operations. Results can be visualized directly or exported for further analysis.
Implementation
Setup and Data Import:
// Azure Data Explorer (online)
// 1. Navigate to https://dataexplorer.azure.com/
// 2. Create free cluster or use existing
// 3. Create database
// 4. Import data: Home → Ingest data
// 5. Select file format (CSV, JSON, etc.)
// Kusto.Explorer (desktop tool)
// 1. Download from https://aka.ms/ke
// 2. Connect to cluster
// 3. Right-click database → Import data
// 4. Select source file
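For a quick local test without the ingestion wizard, a table can also be created and sample rows loaded with Kusto management commands. This sketch covers only a subset of the columns used in the queries below, and the schema is an assumption about the flow log layout:
// Create a minimal table with a subset of the flow log columns
.create table NetworkFlowTable (FlowStartTime: datetime, SrcPublicIps: string, DestIp: string, DestPort: int, BytesSrcToDest: long, BytesDestToSrc: long, packet_data: string)
// Ingest two sample rows inline (values are CSV in schema order)
.ingest inline into table NetworkFlowTable <|
2025-01-05T10:00:00Z,203.0.113.7,10.0.0.5,443,1500000,20000,
2025-01-05T10:05:00Z,198.51.100.9,10.0.0.8,80,2048,512,SGVsbG8=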
Query Top Talker by Source IP:
// Find source IP with highest traffic volume
NetworkFlowTable
| summarize TotalBytes = sum(BytesSrcToDest + BytesDestToSrc) by SrcPublicIps
| top 1 by TotalBytes desc
| project SrcPublicIps
// Alternative: Count flows instead of bytes
NetworkFlowTable
| summarize FlowCount = count() by SrcPublicIps
| top 1 by FlowCount desc
| project SrcPublicIps
// Top 10 talkers with traffic statistics
NetworkFlowTable
| summarize
TotalBytes = sum(BytesSrcToDest + BytesDestToSrc),
FlowCount = count(),
UniqueDestinations = dcount(DestIp)
by SrcPublicIps
| top 10 by TotalBytes desc
Identify IPs Using Specific Ports:
// Find all source IPs using ports 80 or 443
NetworkFlowTable
| where DestPort in (80, 443)
| distinct SrcPublicIps
// Find IP using both ports
NetworkFlowTable
| where DestPort in (80, 443)
| summarize Ports = make_set(DestPort) by SrcPublicIps
| where array_length(Ports) == 2
| project SrcPublicIps
// Traffic breakdown by port
NetworkFlowTable
| where DestPort in (80, 443)
| summarize FlowCount = count(), TotalBytes = sum(BytesSrcToDest) by SrcPublicIps, DestPort
| order by TotalBytes desc
Decode Base64 Packet Data:
// Decode packet_data field for specific source IP
NetworkFlowTable
| where SrcPublicIps == "top_talker_ip_from_query"
| where isnotempty(packet_data)
| project DecodedData = base64_decode_tostring(packet_data)
| distinct DecodedData
// Search for encoded data containing specific pattern
NetworkFlowTable
| where isnotempty(packet_data)
| extend DecodedData = base64_decode_tostring(packet_data)
| where DecodedData contains "flag" or DecodedData contains "password"
| project SrcPublicIps, DestIp, DecodedData
// Decode and analyze packet sizes
NetworkFlowTable
| where SrcPublicIps == "suspicious_ip"
| where isnotempty(packet_data)
| extend DecodedData = base64_decode_tostring(packet_data)
| project SrcPublicIps, PacketSize = strlen(DecodedData), DecodedData
Advanced Analysis Patterns:
// Identify unusual port usage
NetworkFlowTable
| summarize ConnectionCount = count() by DestPort
| where ConnectionCount < 10 // Rarely used ports
| order by ConnectionCount desc
// Find connections to specific subnets
NetworkFlowTable
| where DestIp startswith "10.0." or DestIp startswith "192.168."
| summarize InternalConnections = count() by SrcPublicIps
| top 20 by InternalConnections desc
// Time-based analysis
NetworkFlowTable
| where FlowStartTime between (datetime(2025-01-01) .. datetime(2025-01-31))
| summarize HourlyFlows = count() by bin(FlowStartTime, 1h)
| render timechart
// Protocol distribution
NetworkFlowTable
| summarize FlowCount = count() by L4Protocol
| extend Percentage = FlowCount * 100.0 / toscalar(NetworkFlowTable | count)
| order by FlowCount desc
Data Filtering and Aggregation:
// Combine multiple filters
NetworkFlowTable
| where BytesSrcToDest > 1000000 // Large uploads
| where DestPort == 443 // HTTPS
| where FlowDirection == "O" // Outbound
| summarize
TotalUploaded = sum(BytesSrcToDest),
FlowCount = count()
by SrcPublicIps
| order by TotalUploaded desc
// Group by multiple dimensions
NetworkFlowTable
| summarize
TotalBytes = sum(BytesSrcToDest + BytesDestToSrc),
AvgPackets = avg(PacketsSrcToDest + PacketsDestToSrc)
by SrcPublicIps, DestRegion, L4Protocol
| where TotalBytes > 10000000
Common KQL Operators:
// Filtering
| where DestPort == 80
| where SrcPublicIps in ("1.2.3.4", "5.6.7.8")
| where BytesSrcToDest > 1000
| where FlowStartTime > ago(7d)
// Aggregation
| summarize count() by SrcPublicIps
| summarize sum(TotalBytes), avg(PacketCount) by Protocol
| summarize make_list(DestIp) by SrcPublicIps
// Projection
| project SrcPublicIps, DestIp, TotalBytes
| project-away InternalField1, InternalField2
// Sorting
| sort by TotalBytes desc
| top 10 by FlowCount
| order by SrcPublicIps asc
// Distinct values
| distinct SrcPublicIps
| distinct DestIp, DestPort
// String operations
| where SrcPublicIps contains "192.168"
| where DestIp startswith "10."
| where packet_data matches regex "[A-Za-z0-9+/=]+"
// Type conversions
| extend BytesMB = BytesSrcToDest / 1048576.0
| extend FlowDuration = FlowEndTime - FlowStartTime
| extend DecodedPacket = base64_decode_tostring(packet_data)
Exporting Results:
// Export to CSV
NetworkFlowTable
| where DestPort == 443
| summarize TotalBytes = sum(BytesSrcToDest) by SrcPublicIps
| order by TotalBytes desc
// Click "Export to CSV" in Azure Data Explorer UI
// Save results to variable for reuse
let TopTalkers = NetworkFlowTable
| summarize TotalBytes = sum(BytesSrcToDest + BytesDestToSrc) by SrcPublicIps
| top 100 by TotalBytes desc;
TopTalkers
| join kind=inner (
NetworkFlowTable
| where DestPort in (80, 443)
) on SrcPublicIps
| summarize WebTraffic = sum(BytesSrcToDest) by SrcPublicIps
Best Practices:
- Always filter data early in the query pipeline to reduce processing overhead
- Use specific time ranges when working with large datasets to improve performance
- Test queries on small data samples before running on full datasets
- Use let statements to define reusable query components and intermediate results
- Leverage built-in functions (make_set, make_list, dcount) for efficient aggregation
- Comment complex queries to document logic and assumptions
- Export results to CSV for further analysis in spreadsheet tools or Python
Common Use Cases:
// Data exfiltration detection
NetworkFlowTable
| where BytesSrcToDest > 10000000 // Large outbound transfers
| where DestPort not in (80, 443) // Non-standard ports
| summarize TotalExfiltrated = sum(BytesSrcToDest) by SrcPublicIps, DestIp, DestPort
| order by TotalExfiltrated desc
// Beacon detection (regular interval connections)
NetworkFlowTable
| where SrcPublicIps == "suspicious_ip"
| summarize FlowTimes = make_list(FlowStartTime) by DestIp
| extend SortedTimes = array_sort_asc(FlowTimes)
// Analyze gaps between consecutive timestamps for regularity (see the interval sketch below)
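Interval sketch for the beacon query above: sort flows per destination, compute the gap between consecutive flows with prev(), and flag destinations whose gaps are nearly constant (the thresholds are arbitrary illustrations):
NetworkFlowTable
| where SrcPublicIps == "suspicious_ip"
| sort by DestIp asc, FlowStartTime asc
| extend PrevDest = prev(DestIp, 1), PrevTime = prev(FlowStartTime, 1)
| where PrevDest == DestIp
| extend IntervalSeconds = datetime_diff('second', FlowStartTime, PrevTime)
| summarize AvgInterval = avg(IntervalSeconds), IntervalStdev = stdev(IntervalSeconds), SampleCount = count() by DestIp
| where SampleCount > 10 and IntervalStdev < 5 // Near-constant gaps suggest beaconing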
// Port scanning identification
NetworkFlowTable
| summarize UniqueDestPorts = dcount(DestPort) by SrcPublicIps
| where UniqueDestPorts > 100 // Contacted many different ports
| order by UniqueDestPorts desc