What regulatory sources does CTWiseAPI cover?

CTWiseAPI provides access to FDA Guidance Documents, ICH Guidelines (International Council for Harmonisation), EMA Regulations (European Medicines Agency), and WHO Standards (World Health Organization) for clinical trial compliance.

How does CTWiseAPI semantic search work?

CTWiseAPI uses AI-powered semantic search with AWS Bedrock Titan embeddings to match on the meaning of a regulatory query rather than on keywords alone. This lets natural language questions such as 'informed consent requirements for Phase 1 trials' return relevant results.

What are evidence chains in CTWiseAPI?

Evidence chains provide traceability from each search result back to its authoritative source document. Each chain includes the source authority, document ID, section references, and a confidence score, which supports compliance audits and regulatory submissions.

Event Taxonomy

CTWise Event Taxonomy is a curated classification system that automatically maps manufacturing events to regulatory requirements with evidence-backed confidence scores.

What is the Event Taxonomy?

The Event Taxonomy is a structured knowledge base of 39 quality event types organized across 10+ categories, designed to classify manufacturing events and map them to relevant CFR regulations and ICH guidelines.

Key Facts

Attribute	Detail
Event Types	39 curated manufacturing event types
Categories	10+ compliance domains (environmental, documentation, equipment, quality systems, etc.)
Product Types	Drug, Food, Device, Supplement, API
CFR Mappings	Product-type-specific regulatory mappings
Confidence Scoring	0.00-1.00 scale with 0.60 minimum threshold
Scoring Algorithm	Hybrid keyword matching + Bedrock Titan v2 embeddings with Platt sigmoid calibration
Compound Observations	Top-N classification for multi-issue observations (up to 5 distinct event types)
Format	JSONL stored in S3 with pointer versioning
Current Version	v3.0 (39 event types)

Event Type Structure

Each event type in the taxonomy includes:

Event ID -- unique identifier (e.g., "pest_control")
Category -- compliance domain (e.g., "environmental_controls")
Severity -- risk level (critical, high, medium, low)
Keywords -- classification terms (multi-word phrases, domain terms)
CFR Mappings -- product-type-specific regulation references
ICH Guidelines -- international guideline cross-references
Confidence Weights -- TF-IDF rarity scores for distinctive terms
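
As a rough illustration, one taxonomy record could be modeled and read from JSONL as shown below. The field names and the dataclass itself are assumptions inferred from the list above, not the published schema; only the pest_control ID, its category, and the two CFR references come from this page.

```python
import json
from dataclasses import dataclass, field


@dataclass
class EventType:
    """Hypothetical in-memory shape for one taxonomy record (field names assumed)."""
    event_id: str
    category: str
    severity: str  # critical, high, medium, or low
    keywords: list[str] = field(default_factory=list)
    # Keyed by product type, e.g. {"drug": ["21 CFR 211.56"]}
    cfr_mappings: dict[str, list[str]] = field(default_factory=dict)
    ich_guidelines: list[str] = field(default_factory=list)
    # Term -> rarity weight for distinctive terms
    confidence_weights: dict[str, float] = field(default_factory=dict)


def parse_jsonl(text: str) -> list[EventType]:
    """Parse JSONL: one JSON object per non-empty line."""
    return [EventType(**json.loads(line)) for line in text.splitlines() if line.strip()]


sample = (
    '{"event_id": "pest_control", "category": "environmental_controls", '
    '"severity": "critical", "keywords": ["pest control", "spider"], '
    '"cfr_mappings": {"drug": ["21 CFR 211.56"], "food": ["21 CFR 117.35"]}}'
)
records = parse_jsonl(sample)
print(records[0].event_id, records[0].cfr_mappings["drug"])
```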

Why It Matters

The Problem

Manufacturing organizations face these challenges when classifying quality events:

Manual classification -- Quality teams spend hours categorizing events, slowing response time
Inconsistent classification -- Different analysts classify the same event type differently
Missing regulatory context -- Events are logged without connecting them to specific CFR requirements
No confidence measurement -- Classification decisions lack quantifiable evidence scores
Product-type confusion -- Same event maps to different regulations for drugs vs. food vs. devices

The Cost of Misclassification

Issue	Typical Impact
Delayed CAPA initiation	Regulatory observation escalation
Wrong CFR cited in response	FDA Form 483 follow-up citation
Missed trend analysis	Repeated violations, Warning Letter risk
Incomplete investigation	OAI classification at next inspection
Product-type mismatch	Citing 21 CFR 211 (drug) for food facility event

The Solution

Event Taxonomy provides:

Automated classification -- Classify events in under 500ms via API
Standardized categories -- Consistent event types across all facilities
Evidence-backed scoring -- Confidence scores with transparent methodology
Regulatory mapping -- Direct connection to applicable CFR sections and ICH guidelines
Product-type awareness -- Correct regulation mapping for drug/food/device/supplement/API

How It Works -- Hybrid Entity Resolution Algorithm

The Event Taxonomy uses a hybrid 6-step entity resolution algorithm combining keyword matching with neural embeddings to classify manufacturing events:

Classification Flow

Input Text → Extract Keywords → Infer Product Type → Keyword Score → Embedding Score → Hybrid Merge → Apply Threshold → Return Match

Step-by-Step Process

Step 1: Extract Keywords

Remove 50+ domain stopwords ("the", "a", "found", "observed", etc.)
Extract multi-word phrases (bigrams, trigrams)
Preserve technical terms ("pest control", "batch record", "data integrity")
Normalize text (lowercase, punctuation removal)
Multi-word plural normalization ("audit trails" matches "audit trail", "media fills" matches "media fill")
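
The extraction step can be sketched in a few lines of Python. This is a simplified approximation, not the production tokenizer: the stopword set is a small illustrative subset, the plural rule is deliberately naive, and the function names are hypothetical.

```python
import re

# Illustrative subset; the real list is described as 50+ domain stopwords.
STOPWORDS = {"the", "a", "an", "in", "of", "during", "found", "observed", "was", "were"}


def singularize(token: str) -> str:
    """Naive plural normalization: strip a trailing 's' from longer words."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def extract_keywords(text: str) -> dict[str, list[str]]:
    # Normalize: lowercase and replace punctuation with spaces.
    tokens = re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()
    kept = [singularize(t) for t in tokens if t not in STOPWORDS]
    # Build bigrams and trigrams from the surviving tokens. Because stopwords
    # are removed first, a phrase may join words that were not adjacent.
    phrases = [
        " ".join(kept[i:i + n])
        for n in (2, 3)
        for i in range(len(kept) - n + 1)
    ]
    return {"keywords": kept, "phrases": phrases}


result = extract_keywords("Audit trails were found disabled during the media fills.")
print(result["keywords"])
print(result["phrases"])
```

With this input, "audit trails" and "media fills" normalize to the singular phrases "audit trail" and "media fill", which is the behavior the last bullet describes.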

Step 2: Infer Product Type

Detect product type from text keywords:

Product Type	Detection Keywords
drug	pharmaceutical, tablet, capsule, injectable, API
food	food, beverage, HACCP, allergen, pathogen
supplement	dietary supplement, vitamin, herbal
api	active pharmaceutical ingredient, bulk drug
device	medical device, implant, diagnostic, 510(k)

Default: drug (if no keywords match)
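
A minimal sketch of this detection step follows. The keyword lists are copied from the table above. The word-boundary matching and the tie-break (the longest matched keyword wins, so "active pharmaceutical ingredient" beats "pharmaceutical") are assumptions made for the example, since the page does not say how overlapping matches are resolved.

```python
import re

PRODUCT_KEYWORDS = {
    "drug": ["pharmaceutical", "tablet", "capsule", "injectable", "api"],
    "food": ["food", "beverage", "haccp", "allergen", "pathogen"],
    "supplement": ["dietary supplement", "vitamin", "herbal"],
    "api": ["active pharmaceutical ingredient", "bulk drug"],
    "device": ["medical device", "implant", "diagnostic", "510(k)"],
}


def infer_product_type(text: str, default: str = "drug") -> str:
    """Return the product type whose longest keyword appears in the text."""
    lowered = text.lower()
    best_type, best_len = default, 0
    for product_type, keywords in PRODUCT_KEYWORDS.items():
        for kw in keywords:
            # Require non-alphanumeric boundaries so "api" does not match "rapid".
            pattern = r"(?<![a-z0-9])" + re.escape(kw) + r"(?![a-z0-9])"
            if re.search(pattern, lowered) and len(kw) > best_len:
                best_type, best_len = product_type, len(kw)
    return best_type


print(infer_product_type("Spider found in manufacturing area"))
print(infer_product_type("Allergen cross-contact on beverage line"))
```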

Step 3: Keyword Score (Fast Path)

For each of the 39 event types, calculate a keyword-based composite score:

Scoring Formula:

base_score = (0.4 x coverage_score) + (0.6 x absolute_score)
keyword_score = base_score x (1.0 + rarity_boost x 0.4)

Where:

coverage_score = ratio of matched keywords to total event keywords
absolute_score = count of matched keywords / max possible matches
rarity_boost = TF-IDF weight for distinctive terms (e.g., "spider" = high rarity)
Multi-word fuzzy matching supports morphological variants (prefix-based, min 5-char prefix, 60% length ratio)
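
The formula can be worked through in code. The sketch below makes several assumptions that the page leaves open: "max possible matches" is taken as the smaller of the two keyword-set sizes, rarity_boost is the largest rarity weight among the matched terms, matching is exact (no fuzzy variants), and the result is clamped to 1.0 to stay on the documented 0.00-1.00 scale.

```python
def keyword_score(
    input_keywords: list[str],
    event_keywords: list[str],
    rarity_weights: dict[str, float],
) -> float:
    """Composite keyword score following the formula above (assumptions noted)."""
    input_set, event_set = set(input_keywords), set(event_keywords)
    matched = input_set & event_set
    if not matched:
        return 0.0

    coverage_score = len(matched) / len(event_set)
    # Assumption: the most matches possible is bounded by the smaller set.
    max_possible = min(len(input_set), len(event_set))
    absolute_score = len(matched) / max_possible
    # Assumption: use the most distinctive matched term as the boost.
    rarity_boost = max(rarity_weights.get(term, 0.0) for term in matched)

    base_score = (0.4 * coverage_score) + (0.6 * absolute_score)
    return min(1.0, base_score * (1.0 + rarity_boost * 0.4))


score = keyword_score(
    input_keywords=["spider", "manufacturing", "area"],
    event_keywords=["spider", "pest", "rodent", "insect"],  # hypothetical keyword list
    rarity_weights={"spider": 0.15},
)
print(round(score, 3))
```

In this toy case coverage_score is 1/4 and absolute_score is 1/3, so base_score is 0.3 and the rarity boost lifts it by 6 percent.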

Step 4: Embedding Score (Semantic Path)

For ambiguous or low-keyword-match inputs, compute a neural embedding similarity score:

Embedding Pipeline:

input_text → Amazon Bedrock Titan Text Embedding v2 (1024-dim) → cosine similarity vs taxonomy embeddings

Platt Sigmoid Calibration:

calibrated_confidence = 1 / (1 + exp(-(A x cosine_sim + B)))

Where A = 9.44 and B = -3.03 (fitted from labeled calibration data). The calibrated embedding score is then blended with the keyword score using alpha = 0.3.

Hybrid Merge:

final_score = (alpha x keyword_score) + ((1 - alpha) x embedding_score)

This hybrid approach ensures high-confidence keyword matches are preserved while allowing semantic understanding for novel phrasings.
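
The calibration and merge formulas translate directly into code. The sketch below uses the constants quoted above; the function names and the pure-Python cosine similarity are illustrative, and no Bedrock call is made (the vectors would come from the embedding model in practice).

```python
import math

A, B = 9.44, -3.03  # Platt parameters quoted above
ALPHA = 0.3         # keyword weight quoted above


def cosine_similarity(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def platt_calibrate(cosine_sim: float) -> float:
    """Map a raw cosine similarity to a calibrated confidence in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(A * cosine_sim + B)))


def hybrid_score(keyword_score: float, embedding_score: float, alpha: float = ALPHA) -> float:
    """Weighted blend of the keyword and calibrated embedding scores."""
    return (alpha * keyword_score) + ((1.0 - alpha) * embedding_score)


# Toy 3-dimensional vectors stand in for 1024-dimensional Titan embeddings.
sim = cosine_similarity([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
print(platt_calibrate(sim))
print(round(hybrid_score(0.20, 0.52), 2))  # 0.42
```

With alpha = 0.3 the embedding score carries 70 percent of the weight, so a keyword score of 0.20 and an embedding score of 0.52 blend to 0.42, which stays below the 0.60 threshold.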

Step 5: Apply Threshold

Minimum confidence: 0.60
Events scoring below 0.60 are rejected
Top-scoring event above threshold is selected
If no events exceed threshold, return "unclassified"

Step 6: Return Match

Return classified event with:

Event type ID and category
Severity level
Product-type-specific CFR mappings
ICH guideline references
Confidence score
Source provenance (taxonomy version, algorithm version)

Compound Observation Support (Top-N Classification)

Real-world 483 observations frequently describe multiple issues in a single sentence. The classifier supports top-N classification to capture all relevant event types:

Request:

{
  "event": "The firm's QU failed to ensure CGMP compliance, failed to ensure adequate investigations, and failed to establish adequate systems for document control",
  "product_type": "drug",
  "top_n": 3
}

Response includes:

Primary classification: Highest-scoring event type (e.g., quality_unit_failure)
Secondary classifications: Up to N-1 additional distinct event types above the confidence threshold, each with its own confidence score, category, severity, and applicable CFR sections

Top-N Rules:

Each returned classification is a distinct event type (no duplicates)
All secondary classifications must meet the 0.60 minimum confidence threshold
Maximum of 5 classifications per request (top_n range: 1-5)
Secondary classifications include applicable_cfr_sections for immediate regulatory context
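
The threshold and top-N rules can be expressed as a small selection function. This is a sketch under the stated rules only; the function name and the choice to signal "unclassified" with an empty list are assumptions. With top_n=1 it reduces to the Step 5 behavior.

```python
MIN_CONFIDENCE = 0.60
MAX_TOP_N = 5


def select_top_n(scores: dict[str, float], top_n: int = 1) -> list[tuple[str, float]]:
    """Return up to top_n (event_type, score) pairs at or above the threshold.

    An empty list corresponds to the "unclassified" outcome. Using a dict keyed
    by event type guarantees that every returned classification is distinct.
    """
    if not 1 <= top_n <= MAX_TOP_N:
        raise ValueError(f"top_n must be between 1 and {MAX_TOP_N}")
    passing = [(event, s) for event, s in scores.items() if s >= MIN_CONFIDENCE]
    passing.sort(key=lambda pair: pair[1], reverse=True)
    return passing[:top_n]


scores = {
    "quality_unit_failure": 0.95,
    "investigation_deficiency": 0.88,
    "change_control_failure": 0.72,
    "sop_deviation": 0.41,  # hypothetical below-threshold candidate
}
primary, *secondary = select_top_n(scores, top_n=3)
print(primary, secondary)
```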

Event Categories

The taxonomy organizes 39 event types across 10+ categories:

Category	Event Types	Example Events
environmental_controls	3	pest_control, temperature_excursion, environmental_monitoring_excursion
documentation	3	data_integrity_failure, batch_record_incomplete, sop_deviation
equipment	3	equipment_not_calibrated, equipment_maintenance, equipment_cleaning
manufacturing	2	cross_contamination, in_process_control
personnel	2	operator_training_gap, personnel_hygiene
labeling	3	label_mix_up, packaging_material, label_control
laboratory	5	laboratory_testing_failure, microbial_contamination, stability_program, laboratory_controls_deficiency, release_testing_failure
quality_systems	3	capa_failure, investigation_deficiency, quality_unit_failure
quality_management	4	complaint_handling, deviation_management, change_control, change_control_failure
food_safety	2	haccp_plan, allergen_control
validation	2	process_validation, validation_failure
medical_device	2	design_control, dhr_incomplete
facilities	1	facility_design
utilities	1	water_system_failure
supply_chain	1	supplier_qualification_gap
stability	1	stability_testing_gap
complaints	1	complaint_handling_deficiency

Product Type Coverage

CFR mappings adapt to product type, ensuring correct regulation citation:

Example: Pest Control Event

Product Type	CFR Mapping	Regulation Title
drug	21 CFR 211.56	Buildings and Facilities -- Sanitation
food	21 CFR 117.35	Sanitary Operations
supplement	21 CFR 111.20	What sanitation requirements apply to your physical plant and grounds?
api	21 CFR 211.56	Buildings and Facilities -- Sanitation
device	21 CFR 820.70	Production and Process Controls

Example: Calibration Failure Event

Product Type	CFR Mapping	Regulation Title
drug	21 CFR 211.68	Automatic, Mechanical, and Electronic Equipment
food	21 CFR 117.160	Calibration of Process Monitoring and Control Instruments
supplement	21 CFR 111.160	What requirements apply to laboratory methods, facilities, and controls?
api	21 CFR 211.68	Automatic, Mechanical, and Electronic Equipment
device	21 CFR 820.72	Inspection, Measuring, and Test Equipment
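
The two tables above amount to a lookup keyed by event type and product type. A minimal sketch, with the data copied from the tables and the calibration example keyed as equipment_not_calibrated (the event type the combined workflow on this page returns for a calibration event); the dictionary layout and function name are hypothetical:

```python
from typing import Optional

CFR_MAPPINGS = {
    "pest_control": {
        "drug": "21 CFR 211.56",
        "food": "21 CFR 117.35",
        "supplement": "21 CFR 111.20",
        "api": "21 CFR 211.56",
        "device": "21 CFR 820.70",
    },
    "equipment_not_calibrated": {
        "drug": "21 CFR 211.68",
        "food": "21 CFR 117.160",
        "supplement": "21 CFR 111.160",
        "api": "21 CFR 211.68",
        "device": "21 CFR 820.72",
    },
}


def cfr_for(event_type: str, product_type: str = "drug") -> Optional[str]:
    """Look up the CFR section for an event type and product type, or None."""
    return CFR_MAPPINGS.get(event_type, {}).get(product_type)


print(cfr_for("pest_control", "food"))
print(cfr_for("equipment_not_calibrated", "device"))
```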

Example Classification Flow

Here's a complete example that walks through the pipeline in five stages (keyword and embedding scoring are shown together as hybrid scoring):

Input Event

"Spider found in manufacturing area during routine inspection"

Step 1: Keywords Extracted

{
  "extracted_keywords": ["spider", "manufacturing", "area", "routine", "inspection"],
  "stopwords_removed": ["found", "in", "during"],
  "multi_word_phrases": ["manufacturing area", "routine inspection"]
}

Step 2: Product Type Inferred

{
  "product_type": "drug",
  "reasoning": "No specific product keywords detected, using default",
  "confidence": 0.50
}

Step 3: Hybrid Scoring

{
  "scored_events": [
    {
      "event_type": "pest_control",
      "category": "environmental_controls",
      "keyword_score": 0.85,
      "embedding_score": 0.99,
      "hybrid_score": 0.97,
      "matched_keywords": ["spider", "pest", "manufacturing", "area"],
      "rarity_boost": 0.15,
      "scoring_method": "hybrid_keyword_embedding"
    },
    {
      "event_type": "environmental_monitoring_excursion",
      "category": "environmental_controls",
      "keyword_score": 0.20,
      "embedding_score": 0.52,
      "hybrid_score": 0.42,
      "matched_keywords": ["area"],
      "rarity_boost": 0.02,
      "scoring_method": "hybrid_keyword_embedding"
    }
  ]
}

Step 4: Threshold Check

{
  "threshold": 0.60,
  "passed_events": [
    {
      "event_type": "pest_control",
      "score": 0.97,
      "status": "PASS"
    }
  ],
  "rejected_events": [
    {
      "event_type": "environmental_monitoring_excursion",
      "score": 0.42,
      "status": "FAIL (below threshold)"
    }
  ]
}

Step 5: Return Match

{
  "event_type": "pest_control",
  "category": "environmental_controls",
  "severity": "critical",
  "confidence": 0.97,
  "cfr_mappings": [
    {
      "cfr": "21 CFR 211.56",
      "title": "Buildings and Facilities -- Sanitation",
      "product_type": "drug"
    }
  ],
  "ich_mappings": [
    {
      "guideline": "ICH Q7",
      "section": "3.1",
      "title": "Buildings and Facilities"
    }
  ],
  "matched_keywords": ["spider", "pest", "manufacturing", "area"],
  "algorithm_version": "v3.0",
  "scoring_method": "hybrid_keyword_embedding"
}

APIs That Use Event Taxonomy

The Event Taxonomy powers multiple CTWise intelligence endpoints:

1. Event Classification API

Classify a single event description into a taxonomy event type:

API: POST /v1/kg/classify

curl -X POST https://api.ctwise.ai/v1/kg/classify \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event": "Spider found in manufacturing area",
    "product_type": "drug"
  }'

Response:

{
  "event_classification": {
    "event_type": "pest_control",
    "event_category": "environmental_controls",
    "severity": "critical",
    "confidence": 0.97
  },
  "applicable_cfr_sections": ["21 CFR 211.56"]
}

Compound Observation (Top-N) Example

For observations describing multiple issues, request up to 5 classifications:

curl -X POST https://api.ctwise.ai/v1/kg/classify \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event": "The QU failed to ensure CGMP compliance, failed to ensure adequate investigations, and failed to establish adequate systems for document control",
    "product_type": "drug",
    "top_n": 3
  }'

Response:

{
  "event_classification": {
    "event_type": "quality_unit_failure",
    "event_category": "quality_systems",
    "severity": "critical",
    "confidence": 0.95
  },
  "applicable_cfr_sections": ["21 CFR 211.22"],
  "top_n_classifications": [
    {
      "event_type": "investigation_deficiency",
      "event_category": "quality_systems",
      "severity": "critical",
      "confidence": 0.88,
      "applicable_cfr_sections": ["21 CFR 211.192"]
    },
    {
      "event_type": "change_control_failure",
      "event_category": "quality_management",
      "severity": "high",
      "confidence": 0.72,
      "applicable_cfr_sections": ["21 CFR 211.100"]
    }
  ]
}
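
On the client side, the primary classification and the secondary ones arrive in different places in the response. A small helper can flatten them into one list for downstream handling. The sketch below relies only on the field names shown in the response above; the helper name is hypothetical and the sample payload is abridged.

```python
def flatten_classifications(response: dict) -> list[dict]:
    """Merge the primary and top-N classifications into a single ordered list."""
    primary = dict(response["event_classification"])
    # The primary classification's CFR sections live at the top level.
    primary["applicable_cfr_sections"] = response.get("applicable_cfr_sections", [])
    return [primary] + list(response.get("top_n_classifications", []))


# Abridged copy of the response shown above.
response = {
    "event_classification": {"event_type": "quality_unit_failure", "confidence": 0.95},
    "applicable_cfr_sections": ["21 CFR 211.22"],
    "top_n_classifications": [
        {
            "event_type": "investigation_deficiency",
            "confidence": 0.88,
            "applicable_cfr_sections": ["21 CFR 211.192"],
        },
        {
            "event_type": "change_control_failure",
            "confidence": 0.72,
            "applicable_cfr_sections": ["21 CFR 211.100"],
        },
    ],
}

rows = flatten_classifications(response)
for row in rows:
    print(row["event_type"], row["confidence"], row["applicable_cfr_sections"])
```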

2. Full Investigation API

Classify an event and get complete regulatory context with similar 483 observations:

API: POST /v1/intelligence/investigate

curl -X POST https://api.ctwise.ai/v1/intelligence/investigate \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event_text": "Temperature excursion in cold storage",
    "product_type": "drug",
    "include_similar_observations": true,
    "include_regulatory_text": true
  }'

Response includes:

Event classification with confidence score
Product-type-specific CFR mappings
ICH guideline cross-references
Similar 483 observations from FDA inspections
Full eCFR regulation text
Risk assessment with evidence chain

3. Trending Event Types API

Analyze trending event types across time periods:

API: GET /v1/analytics/trends

curl -X GET "https://api.ctwise.ai/v1/analytics/trends?event_type=pest_control&period=6m" \
  -H "X-Api-Key: YOUR_API_KEY"

Response:

{
  "event_type": "pest_control",
  "category": "environmental_controls",
  "trend_data": [
    {
      "month": "2026-01",
      "occurrence_count": 23,
      "facilities_affected": 18,
      "avg_confidence": 0.94
    },
    {
      "month": "2026-02",
      "occurrence_count": 31,
      "facilities_affected": 24,
      "avg_confidence": 0.96
    }
  ],
  "trend_direction": "increasing",
  "change_percentage": 34.8
}
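
The change_percentage value in this sample is consistent with a simple percentage change between the first and last month's occurrence_count. That reading is an inference from the numbers shown, not a documented definition, and the function below is a hypothetical client-side check.

```python
def change_percentage(trend_data: list[dict]) -> float:
    """Percentage change in occurrence_count from the first to the last period."""
    first = trend_data[0]["occurrence_count"]
    last = trend_data[-1]["occurrence_count"]
    if first == 0:
        raise ValueError("cannot compute a percentage change from a zero baseline")
    return round((last - first) / first * 100, 1)


trend_data = [
    {"month": "2026-01", "occurrence_count": 23},
    {"month": "2026-02", "occurrence_count": 31},
]
print(change_percentage(trend_data))  # 34.8
```

Here (31 - 23) / 23 is roughly 0.348, which matches the 34.8 reported in the response.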

Versioning & Updates

Storage Format

Event Taxonomy uses JSONL (JSON Lines) format stored in Amazon S3:

s3://ctwise-data-lake-{env}/483-intelligence/datasets/event-taxonomy/
├── current.json              # Pointer to active version
├── v3/
│   └── event-taxonomy-v3.0.jsonl  # 39 event types
├── v2/
│   └── event-taxonomy-v2.0.jsonl  # 30 event types (archived)
└── v1/
    └── event-taxonomy.jsonl       # 24 event types (archived)

Pointer Versioning

current.json contains a version pointer:

{
  "version": "v3.0",
  "released": "2026-03-13",
  "event_count": 39,
  "s3_path": "s3://ctwise-data-lake-{env}/483-intelligence/datasets/event-taxonomy/v3/event-taxonomy-v3.0.jsonl"
}
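
A consumer of the pointer file would read current.json first and only then fetch the versioned JSONL object. The sketch below shows the resolution step without any AWS calls. It assumes the {env} placeholder is substituted by the caller, and the environment name "prod" is purely illustrative.

```python
import json
from urllib.parse import urlparse


def resolve_pointer(pointer_json: str, env: str) -> dict:
    """Turn the contents of current.json into a bucket/key pair for the active version."""
    pointer = json.loads(pointer_json)
    s3_path = pointer["s3_path"].replace("{env}", env)
    parsed = urlparse(s3_path)
    return {
        "version": pointer["version"],
        "bucket": parsed.netloc,
        "key": parsed.path.lstrip("/"),
    }


pointer_json = json.dumps({
    "version": "v3.0",
    "released": "2026-03-13",
    "event_count": 39,
    "s3_path": "s3://ctwise-data-lake-{env}/483-intelligence/datasets/"
               "event-taxonomy/v3/event-taxonomy-v3.0.jsonl",
})
location = resolve_pointer(pointer_json, env="prod")  # "prod" is a hypothetical env name
print(location)
```

Because readers always go through the pointer, publishing a new version only requires uploading the new JSONL file and then rewriting current.json.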

Version History

Version	Released	Event Types	Key Changes
v3.0	2026-03-13	39	Added 9 event types (quality_systems, supply_chain categories); hybrid keyword+embedding scoring; Platt sigmoid calibration; top-N classification for compound observations; multi-word plural keyword matching; enriched keywords for data_integrity, env_monitoring, cross_contamination, operator_training
v2.0	2025-12-01	30	Added food safety, medical device, utilities categories; enhanced scoring algorithm
v1.0	2024-06-15	24	Initial release with core GMP event types

Update Process

Quarterly Review -- CTWise team reviews FDA inspection trends
Event Type Evaluation -- Identify emerging event patterns
Validation -- Test new event types against 483 observation corpus
Release -- Deploy new version with backward-compatible versioning
API Transparency -- API responses include algorithm_version field

Relationship to 483 Intelligence

The Event Taxonomy enables automated classification of 483 observations, connecting raw inspection findings to specific regulatory requirements.

How They Work Together

Capability	483 Intelligence	Event Taxonomy
What it answers	"What violations did FDA cite?"	"What type of event is this?"
Data source	FDA inspection observations	Curated event classification system
Output	483 citations with CFR references	Event type with confidence score
Use together	Search for similar 483 observations...	...based on classified event type

Example: Combined Workflow

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.ctwise.ai/v1"
headers = {"X-Api-Key": API_KEY, "Content-Type": "application/json"}

# Step 1: Classify your manufacturing event
classification = requests.post(f"{BASE_URL}/kg/classify",
    headers=headers,
    json={
        "event": "HPLC system not calibrated for 6 months",
        "product_type": "drug"
    }
).json()
# Returns: event_type=equipment_not_calibrated, cfr_sections=["21 CFR 211.68"]

# Step 2: Find similar 483 observations
observations = requests.post(f"{BASE_URL}/483/observations/search",
    headers=headers,
    json={
        "query": "calibration failure HPLC",
        "filters": {
            "cfr": "21 CFR 211.68"
        },
        "top_k": 20
    }
).json()
# Returns: 47 matching 483 citations with similarity scores

# Step 3: Get full regulatory text
regulation = requests.get(f"{BASE_URL}/kg/regulations/21%20CFR%20211.68",
    headers=headers
).json()
# Returns: Full eCFR text, ICH cross-references, enforcement statistics

Integration Benefits

Automated root cause analysis -- Connect your event to similar FDA observations
Evidence-backed CAPA -- Reference specific 483 citations in corrective actions
Predictive compliance -- See which events frequently lead to OAI classifications
Regulatory trend awareness -- Track if your event type is trending in inspections

Getting Started

Ready to use the Event Taxonomy for automated event classification?

KG Intelligence Overview -- Understand Knowledge Graph capabilities
Event Classification API -- Classify events with confidence scores
Full Investigation API -- Get complete regulatory context
Trending Analysis API -- Track event type trends over time
483 Quickstart Guide -- Search similar 483 observations
API Reference -- Complete endpoint documentation
