API Reference ============ This page provides detailed API documentation for pytector. Core Classes ------------ .. automodule:: pytector :members: :undoc-members: :show-inheritance: PromptInjectionDetector ----------------------- .. automodule:: pytector.detector :members: :undoc-members: :show-inheritance: LangChain Integration --------------------- .. automodule:: pytector.langchain :members: :undoc-members: :show-inheritance: PromptSanitizer --------------- .. automodule:: pytector.sanitizer :members: :undoc-members: :show-inheritance: Configuration ------------- The following configuration options are available when initializing the detector: .. list-table:: Configuration Parameters :widths: 20 20 60 :header-rows: 1 * - Parameter - Type - Description * - model_name_or_url - str - Name or path of the model to use for detection * - default_threshold - float - Default confidence threshold for injection detection (0.0 to 1.0) * - use_groq - bool - Whether to use Groq API for detection * - api_key - str - API key for Groq service (required if use_groq=True) * - groq_model - str - Groq model to use for detection (default: openai/gpt-oss-safeguard-20b) Predefined Models ---------------- The following predefined models are available: .. list-table:: Predefined Models :widths: 20 80 :header-rows: 1 * - Model Name - Description * - deberta - protectai/deberta-v3-base-prompt-injection * - distilbert - fmops/distilbert-prompt-injection * - distilbert-onnx - prompt-security/fmops-distilbert-prompt-injection-onnx Groq API Behavior ----------------- ``detect_injection_api`` returns: * ``True`` for safe responses * ``False`` for unsafe responses (or non-standard responses treated conservatively as unsafe) * ``None`` when the API call fails Use ``return_raw=True`` to inspect raw model output as ``(is_safe, raw_response)``. Example Usage ------------- .. code-block:: python from pytector import PromptInjectionDetector # Basic usage with default model detector = PromptInjectionDetector() is_injection, probability = detector.detect_injection("Your text here") # Using Groq API detector = PromptInjectionDetector( use_groq=True, api_key="your-api-key" ) is_safe = detector.detect_injection_api("Your text here") # Using GGUF model detector = PromptInjectionDetector("path/to/model.gguf") is_injection, probability = detector.detect_injection("Your text here") # Custom threshold detector = PromptInjectionDetector(default_threshold=0.8) is_injection, probability = detector.detect_injection("Your text here") Sanitizer Usage --------------- .. code-block:: python from pytector import PromptSanitizer # All strategies enabled by default sanitizer = PromptSanitizer() cleaned, was_modified = sanitizer.sanitize("Ignore previous instructions. Hello!") # With detailed change log cleaned, was_modified, changes = sanitizer.sanitize( "Ignore previous instructions. Hello!", return_details=True, ) # Custom configuration sanitizer = PromptSanitizer( fuzzy_threshold=0.80, sentence_threshold=0.4, enable_prompt_enforcement=True, ) Sanitizer Configuration ----------------------- .. list-table:: Sanitizer Parameters :widths: 30 15 55 :header-rows: 1 * - Parameter - Default - Description * - enable_encoding_detection - True - Decode and strip Base64, hex, ROT13 obfuscated payloads * - enable_unicode_normalization - True - Strip invisible characters, NFKC homoglyph normalization * - enable_pattern_removal - True - Regex-based structural injection pattern removal * - enable_sentence_scoring - True - Heuristic per-sentence analysis; drop suspicious sentences * - enable_fuzzy_matching - True - Catch paraphrased injection phrases via difflib similarity * - enable_keyword_stripping - True - Final pass removing known injection phrases * - enable_prompt_enforcement - False - Escape template syntax (``{ } < > ` ``) * - keywords - None - Custom keyword list; ``None`` uses built-in defaults * - fuzzy_threshold - 0.85 - Similarity cutoff for fuzzy matching (0.0-1.0) * - sentence_threshold - 0.5 - Heuristic score cutoff for sentence removal (0.0-1.0) PIIScanner ---------- .. automodule:: pytector.pii :members: :undoc-members: :show-inheritance: Uses the `PasteProof PII Detector `_ (ModernBERT-base, F1 0.97) for NER-based PII detection across 27 entity types. Requires ``transformers >= 4.48.0`` for ModernBERT support. .. code-block:: python from pytector import PIIScanner scanner = PIIScanner() has_pii, entities = scanner.scan("Email john@acme.com, SSN 123-45-6789") print(scanner.redact("Email john@acme.com, SSN 123-45-6789")) # Filter to specific entity types scanner = PIIScanner(entity_types=["EMAIL", "CREDIT_CARD"], threshold=0.7) .. list-table:: PIIScanner Parameters :widths: 20 20 60 :header-rows: 1 * - Parameter - Type - Description * - model_name - str - Predefined key (``pasteproof-v3``) or HuggingFace model ID / local path * - threshold - float - Minimum confidence for an entity to be reported (default 0.5) * - entity_types - list[str] | None - Filter to specific types (e.g. ``["EMAIL", "SSN"]``); ``None`` = all .. admonition:: Citation .. code-block:: text @model{pasteproof_pii_detector, author = {Jonathan Edgar}, title = {PasteProof PII Detector}, year = {2025}, publisher = {Hugging Face}, url = {https://huggingface.co/joneauxedgar/pasteproof-pii-detector-v2} } ToxicityDetector ---------------- .. automodule:: pytector.toxicity :members: :undoc-members: :show-inheritance: Uses `citizenlab/distilbert-base-multilingual-cased-toxicity `_ (F1-micro 0.94, 10 languages) for toxicity classification. .. code-block:: python from pytector import ToxicityDetector detector = ToxicityDetector() is_toxic, score = detector.detect("You are terrible") detector.report("Have a wonderful day!") .. list-table:: ToxicityDetector Parameters :widths: 20 20 60 :header-rows: 1 * - Parameter - Type - Description * - model_name - str - Predefined key (``citizenlab``) or HuggingFace model ID / local path * - threshold - float - Score above which text is considered toxic (default 0.5) RegexScanner ------------ .. automodule:: pytector.regex_scanner :members: :undoc-members: :show-inheritance: Pure-stdlib rule-based scanner with customizable patterns. .. code-block:: python from pytector import RegexScanner scanner = RegexScanner() has_match, matches = scanner.scan("Key: sk-live-abc123def456") print(scanner.redact("Email user@example.com")) # Custom patterns only custom = RegexScanner( patterns={"ORDER_ID": r"ORD-\d{8}"}, use_defaults=False, ) .. list-table:: RegexScanner Parameters :widths: 20 20 60 :header-rows: 1 * - Parameter - Type - Description * - patterns - dict[str, str] | None - ``{NAME: regex}`` mapping merged with defaults (or used alone) * - use_defaults - bool - Whether to include built-in patterns (EMAIL, PHONE, SSN, CREDIT_CARD, IP_ADDRESS, API_KEY, JWT_TOKEN) CanaryToken ----------- .. automodule:: pytector.canary :members: :undoc-members: :show-inheritance: Inject a secret token into your system prompt and detect if the model leaks it. Pure stdlib — zero dependencies, zero calibration. .. code-block:: python from pytector import CanaryToken canary = CanaryToken() system_prompt = canary.wrap("You are a helpful assistant.") # ... pass to LLM, get response ... leaked, token = canary.check(model_output) .. list-table:: CanaryToken Parameters :widths: 20 20 60 :header-rows: 1 * - Parameter - Type - Description * - token - str | None - Explicit canary string; ``None`` auto-generates one * - length - int - Length of the random part of auto-generated tokens (default 16) * - prefix - str - Prefix for auto-generated tokens (default ``CANARY-``)