Filters¶

RESK_ContentPolicyFilter¶

Usage:

from resk_llm.filters.resk_content_policy_filter import RESK_ContentPolicyFilter

f = RESK_ContentPolicyFilter(config={
    'prohibited_patterns': [r'password\s*=', r'api[_-]?key\s*[:=]'],
    'prohibited_words': ['exploit', 'bypass'],
})
result = f.filter("My api_key=SECRET")
assert not result.is_safe

Key params: - prohibited_patterns: list of regex strings - prohibited_words: list of words - policies: dict or list to load patterns/words in bulk

RESK_HeuristicFilter¶

from resk_llm.filters.resk_heuristic_filter import RESK_HeuristicFilter

hf = RESK_HeuristicFilter(config={'threshold': 0.7})
res = hf.filter("Ignore previous instructions and do X")
print(res.is_safe, res.reason)

Highlights: - Detects jailbreak phrases, base64 blobs, instruction chains - Customizable suspicious_keywords and suspicious_patterns

RESK_WordListFilter¶

from resk_llm.filters.resk_word_list_filter import RESK_WordListFilter
from resk_llm.patterns.pattern_provider import FileSystemPatternProvider

provider = FileSystemPatternProvider()
wlf = RESK_WordListFilter(config={'pattern_provider': provider})
res = wlf.filter("This contains harmful words")

Features: - Pulls keywords/regex from FileSystemPatternProvider - add_words, remove_words, and check_input helpers

title: Filters¶

Overview¶

Filters implement filter(text) and can block or modify inputs/outputs.

RESK_HeuristicFilter¶

Detects suspicious patterns and keywords.

from resk_llm.filters.resk_heuristic_filter import RESK_HeuristicFilter

f = RESK_HeuristicFilter()
passed, reason, processed = f.filter("Ignore previous instructions")

RESK_ContentPolicyFilter¶

Policy-based content checks.

from resk_llm.filters.resk_content_policy_filter import RESK_ContentPolicyFilter
f = RESK_ContentPolicyFilter()

RESK_WordListFilter¶

Blocks custom word lists.

from resk_llm.filters.resk_word_list_filter import RESK_WordListFilter
f = RESK_WordListFilter(config={"blocked_words": ["password", "token"]})