What regulatory sources does CTWiseAPI cover?

CTWiseAPI provides access to FDA Guidance Documents, ICH Guidelines (International Council for Harmonisation), EMA Regulations (European Medicines Agency), and WHO Standards (World Health Organization) for clinical trial compliance.

How does CTWiseAPI semantic search work?

CTWiseAPI uses AI-powered semantic search with AWS Bedrock Titan embeddings to understand the meaning of your regulatory queries, not just keywords. This enables natural language questions like 'informed consent requirements for Phase 1 trials' to return highly relevant results.

What are evidence chains in CTWiseAPI?

Evidence chains provide complete traceability from search results back to authoritative source documents, including source authority, document ID, section references, and confidence scores - essential for compliance audits and regulatory submissions.

Semantic Search

CTWise API uses AI-powered semantic search to understand the meaning of your queries, not just keywords.

Overview

Traditional regulatory databases require exact keyword matches. CTWise uses Amazon Bedrock Titan Text Embeddings v2 and AWS S3 Vectors to understand what you're actually looking for.

The Problem with Keyword Search

Query: "informed consent pediatric"

Keyword Result: Only documents containing all three exact words
Missed: "assent procedures for minors", "parental permission requirements"

The Semantic Search Advantage

Query: "What are the requirements for informed consent in pediatric trials?"

Semantic Result:
1. FDA-INFORMED-CONSENT-2024 (score: 0.56) - Informed consent guidance
2. ICH-E11(R1) (score: 0.50) - Pediatric population guidance
3. FDA-PEDIATRIC-2023 (score: 0.44) - Pediatric study plans

Why: The embedding captures that the query's meaning combines consent, children, and trials

How It Works

1. Query Embedding

Your natural language query is converted to a 1024-dimensional vector using Amazon Bedrock Titan:

Query: "What guidance exists for adaptive trial designs?"
    │
    └─► Titan Embed → [0.12, -0.45, 0.78, ...] (1024 dimensions)
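The embedding step could look roughly like the sketch below. The document does not show CTWise's server-side code, so this is only an illustration under assumptions: the function names are hypothetical, `client` is expected to behave like a boto3 `bedrock-runtime` client, and the model ID is the one reported in the API's `query_metadata`.

```python
import json

# Model ID as reported in the API's query_metadata.
MODEL_ID = "amazon.titan-embed-text-v2:0"


def build_embedding_request(query: str, dimensions: int = 1024) -> str:
    """Serialize a request body for Titan Text Embeddings v2."""
    if not query.strip():
        raise ValueError("query must not be empty")
    return json.dumps(
        {"inputText": query, "dimensions": dimensions, "normalize": True}
    )


def embed_query(client, query: str) -> list[float]:
    """Embed a query using a Bedrock runtime client.

    `client` is assumed to expose invoke_model the way
    boto3.client("bedrock-runtime") does.
    """
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_embedding_request(query),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    return payload["embedding"]
```

Passing the client in as an argument keeps the sketch free of AWS credentials and makes it easy to substitute a stub when testing.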

2. Vector Similarity Search

AWS S3 Vectors performs approximate nearest neighbor search against pre-indexed regulatory rules:

Query Vector → S3 Vectors Index
                    │
                    ├─► FDA-ADAPTIVE-2019 → similarity: 0.7769
                    ├─► ICH-E20 → similarity: 0.5411
                    └─► FDA-DMC-2024-DRAFT → similarity: 0.3955
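To make the ranking idea concrete, here is a minimal brute-force sketch of cosine-similarity ranking over a tiny in-memory index. It is not how S3 Vectors works internally (that service performs approximate nearest neighbor search at scale); the rule IDs and 3-dimensional vectors are made up purely for illustration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_rules(query_vec, index, top_k=5, min_similarity=0.25):
    """Score every indexed vector, drop weak matches, return the best top_k."""
    scored = [
        (rule_id, cosine_similarity(query_vec, vec))
        for rule_id, vec in index.items()
    ]
    kept = [(rule_id, score) for rule_id, score in scored if score >= min_similarity]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical toy index; real vectors have 1024 dimensions.
toy_index = {
    "RULE-A": [1.0, 0.0, 0.0],
    "RULE-B": [1.0, 1.0, 0.0],
    "RULE-C": [0.0, 0.0, 1.0],
}
ranked = rank_rules([1.0, 0.0, 0.0], toy_index)
```

An exact scan like this is linear in the index size, which is why production systems typically rely on an approximate index instead.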

3. Ranked Results

Results are returned sorted by semantic similarity, highest first, each with its similarity score:

{
  "results": [
    {
      "rule_id": "FDA-ADAPTIVE-2019",
      "title": "Adaptive Designs for Clinical Trials of Drugs and Biologics",
      "similarity_score": 0.7769,
      "source": "fda"
    }
  ]
}

Natural Language Query Examples

Regulatory Concept Queries

Query	Top Result	Score
"What are the requirements for informed consent in pediatric trials?"	FDA-INFORMED-CONSENT-2024	0.56
"How should I handle adverse event reporting?"	ICH-E2A	0.44
"What statistical methods are acceptable for phase 3?"	ICH-E9(R1)	0.46
"GCP guidelines for investigator responsibilities"	ICH-E6(R3)	0.55

Process-Oriented Queries

Query	Top Result	Score
"What guidance exists for adaptive trial designs?"	FDA-ADAPTIVE-2019	0.78
"How do I establish a Data Safety Monitoring Board?"	FDA-DMC-2024-DRAFT	0.58
"Explain protocol amendment procedures"	ICH-E6(R3)	0.47
"What training is required for clinical investigators?"	ICH-E6(R2)	0.45

Domain-Specific Queries

Query	Top Result	Score
"Tell me about blinding requirements in controlled trials"	ICH-E10	0.43
"What are the monitoring requirements for multi-site studies?"	ICH-E6(R3)	0.52
"How should biomarker data be collected and analyzed?"	ICH-E16	0.41

API Usage

Semantic Search Endpoint

POST Method (Recommended):

curl -X POST https://api.ctwise.ai/v1/semantic-search \
  -H "X-Api-Key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the requirements for informed consent in pediatric trials?",
    "sources": ["fda", "ich"],
    "top_k": 5,
    "min_similarity": 0.25
  }'

GET Method (Alternative via query parameters):

curl "https://api.ctwise.ai/v1/rules/search?q=informed+consent+pediatric&sources=fda,ich&limit=5" \
  -H "X-Api-Key: YOUR_API_KEY"

Request Parameters

Parameter	Type	Required	Description
`query`	string	Yes	Natural language question
`sources`	string[]	No	Filter by source (fda, ich, ema, who)
`top_k`	integer	No	Number of results (default: 5, max: 50)
`min_similarity`	float	No	Minimum similarity threshold (default: 0.25)
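The same POST request can be assembled from Python with only the standard library. This is a sketch, not an official client: the function names are hypothetical, and the lower bound of 1 on `top_k` is an assumption (the table above only states the maximum of 50).

```python
import json
import urllib.request

API_URL = "https://api.ctwise.ai/v1/semantic-search"
VALID_SOURCES = {"fda", "ich", "ema", "who"}


def build_search_request(api_key, query, sources=None, top_k=5, min_similarity=0.25):
    """Validate the documented parameters and build the POST request."""
    if not query.strip():
        raise ValueError("query is required")
    if not 1 <= top_k <= 50:  # lower bound is an assumption
        raise ValueError("top_k must be between 1 and 50")
    payload = {"query": query, "top_k": top_k, "min_similarity": min_similarity}
    if sources is not None:
        unknown = set(sources) - VALID_SOURCES
        if unknown:
            raise ValueError(f"unknown sources: {sorted(unknown)}")
        payload["sources"] = list(sources)
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def semantic_search(api_key, query, **options):
    """Send the request and return the decoded JSON response."""
    request = build_search_request(api_key, query, **options)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Separating request construction from sending lets you inspect or log the exact payload before any network call is made.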

Response

{
  "query": "What are the requirements for informed consent in pediatric trials?",
  "results": [
    {
      "rule_id": "FDA-INFORMED-CONSENT-2024",
      "title": "Informed Consent: Guidance for IRBs, Clinical Investigators, and Sponsors",
      "source": "fda",
      "similarity_score": 0.5594,
      "effective_date": "2024-01-01"
    },
    {
      "rule_id": "ICH-E11(R1)",
      "title": "Clinical Investigation of Medicinal Products in the Pediatric Population",
      "source": "ich",
      "similarity_score": 0.5022,
      "effective_date": "2017-09-14"
    }
  ],
  "query_metadata": {
    "execution_time_ms": 380,
    "embedding_model": "amazon.titan-embed-text-v2:0",
    "indexes_searched": ["fda-tier1", "ich-tier1"],
    "total_results": 5
  }
}
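A response of this shape can be turned into typed records for downstream code. The sketch below is an assumption-laden illustration: the `RuleMatch` class and `parse_results` function are hypothetical names, and `effective_date` is treated as optional because the shorter sample earlier on this page omits it.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RuleMatch:
    rule_id: str
    title: str
    source: str
    similarity_score: float
    effective_date: Optional[str] = None


def parse_results(body: str) -> list[RuleMatch]:
    """Decode a response body and return matches, best score first."""
    data = json.loads(body)
    matches = [
        RuleMatch(
            rule_id=item["rule_id"],
            title=item["title"],
            source=item["source"],
            similarity_score=float(item["similarity_score"]),
            effective_date=item.get("effective_date"),
        )
        for item in data.get("results", [])
    ]
    return sorted(matches, key=lambda m: m.similarity_score, reverse=True)
```

Sorting again on the client is defensive; the API already returns results ranked by similarity.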

Similarity Scoring

Score Interpretation

Score Range	Meaning	Recommendation
0.70+	High confidence match	Directly relevant
0.50-0.70	Good match	Review for relevance
0.25-0.50	Partial match	May be related
< 0.25	Below threshold	Not returned at the default threshold
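If you want to label results in your own tooling, the bands above translate into a small helper. The function name is hypothetical, and assigning boundary values (0.50, 0.70) to the higher band is an assumption, since the table's ranges touch at those points.

```python
def interpret_score(score: float) -> str:
    """Map a similarity score to the band names from the table above."""
    if score >= 0.70:
        return "High confidence match"
    if score >= 0.50:
        return "Good match"
    if score >= 0.25:
        return "Partial match"
    return "Below threshold"
```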

Configuring Thresholds

For different use cases, adjust the min_similarity parameter:

Use Case	Threshold	Rationale
Broad discovery	0.20	Find loosely related rules
Standard search	0.25	Balanced precision/recall
Precise matching	0.40	High-confidence matches only
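One way to use these thresholds is to fetch once with the broadest setting and tighten the cut-off locally. The sketch below assumes results are dictionaries with a `similarity_score` key, as in the response example; the preset names and the function are hypothetical.

```python
# Thresholds taken from the table above; the key names are illustrative.
THRESHOLDS = {
    "broad_discovery": 0.20,
    "standard_search": 0.25,
    "precise_matching": 0.40,
}


def filter_for_use_case(results, use_case):
    """Keep only results at or above the threshold for the given use case."""
    threshold = THRESHOLDS[use_case]
    return [r for r in results if r["similarity_score"] >= threshold]
```

Filtering on the client avoids repeated API calls when you are experimenting with cut-offs, at the cost of transferring a few extra low-scoring results.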

Cross-Source Discovery

Semantic search excels at finding related rules across different regulatory authorities:

Example: Informed Consent

A single query about "informed consent requirements" returns:

FDA Results:
├── FDA-INFORMED-CONSENT-2024 (0.56)
└── FDA-PEDIATRIC-2023 (0.44)

ICH Results:
├── ICH-E11(R1) (0.50) - Pediatric
├── ICH-E6(R3) (0.39) - GCP
└── ICH-E8(R1) (0.29) - General Considerations

Why this matters: Traditional keyword search would require a separate query for each regulatory body. Semantic search recognizes that the concept spans multiple sources and returns them together.
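To reproduce a per-authority view like the one above from a flat result list, you can group results by their `source` field. This is a small client-side sketch with a hypothetical function name, assuming each result carries `source` and `similarity_score` as in the response example.

```python
from collections import defaultdict


def group_by_source(results):
    """Group results by regulatory source, best score first within each group."""
    grouped = defaultdict(list)
    for item in results:
        grouped[item["source"]].append(item)
    return {
        source: sorted(items, key=lambda r: r["similarity_score"], reverse=True)
        for source, items in grouped.items()
    }
```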

Performance Characteristics

Metric	Value	Notes
Average response time	380ms	Including embedding generation
P95 response time	Less than 600ms	Under load
Embedding model	Titan Text v2	1024 dimensions
Vector database	AWS S3 Vectors	Native AWS integration
Indexes available	FDA, ICH, EMA, WHO	Tier-dependent access

Best Practices

1. Ask Complete Questions

Good: "What are the requirements for informed consent in pediatric trials?"
Poor: "informed consent pediatric"

2. Include Context

Good: "What statistical methods are acceptable for phase 3 oncology trials?"
Poor: "statistics trials"

3. Use Natural Language

Good: "How should adverse events be reported to the FDA?"
Poor: "AE reporting FDA"

4. Specify Domains When Known

# If you know you want FDA guidance specifically.
# `search` stands in for your own client wrapper around POST /v1/semantic-search.
response = search(
    query="adaptive trial design requirements",
    sources=["fda"],  # Limits the search to the FDA index
    top_k=10
)

Technology Stack

Component	Technology	Purpose
Embedding Model	Amazon Bedrock Titan Text Embeddings v2	1024-dimensional semantic encoding
Vector Database	AWS S3 Vectors	Cosine similarity search
Query Processing	AWS Lambda (ARM64)	Cost-optimized inference
Indexes	FDA-tier1, ICH-tier1, EMA-tier1, WHO-tier1	Pre-computed regulatory rule vectors

Verified Performance (2025-12-18)

Metric	Result
Tests executed	20 natural language queries
Success rate	95% (19/20 returned results)
Highest score	0.7769 ("adaptive trial designs")
Average score	0.41
Average response	380ms

Evidence: See /aws_mp_set_up/products/ctwise/nlp_evidence/NLP_EVIDENCE_SUMMARY.md

Related Documentation

Requirements Search Endpoint - Keyword-based search
Getting Started Guide - First API call
Authentication - API key setup
