Filters¶
RESK_ContentPolicyFilter¶
Usage:
from resk_llm.filters.resk_content_policy_filter import RESK_ContentPolicyFilter
f = RESK_ContentPolicyFilter(config={
'prohibited_patterns': [r'password\s*=', r'api[_-]?key\s*[:=]'],
'prohibited_words': ['exploit', 'bypass'],
})
result = f.filter("My api_key=SECRET")
assert not result.is_safe
Key params:
- prohibited_patterns: list of regex strings
- prohibited_words: list of words
- policies: dict or list to load patterns/words in bulk
RESK_HeuristicFilter¶
from resk_llm.filters.resk_heuristic_filter import RESK_HeuristicFilter
hf = RESK_HeuristicFilter(config={'threshold': 0.7})
res = hf.filter("Ignore previous instructions and do X")
print(res.is_safe, res.reason)
Highlights:
- Detects jailbreak phrases, base64 blobs, instruction chains
- Customizable suspicious_keywords and suspicious_patterns
RESK_WordListFilter¶
from resk_llm.filters.resk_word_list_filter import RESK_WordListFilter
from resk_llm.patterns.pattern_provider import FileSystemPatternProvider
provider = FileSystemPatternProvider()
wlf = RESK_WordListFilter(config={'pattern_provider': provider})
res = wlf.filter("This contains harmful words")
Features:
- Pulls keywords/regex from FileSystemPatternProvider
- add_words, remove_words, and check_input helpers
title: Filters¶
Overview¶
Filters implement filter(text) and can block or modify inputs/outputs.
RESK_HeuristicFilter¶
Detects suspicious patterns and keywords.
from resk_llm.filters.resk_heuristic_filter import RESK_HeuristicFilter
f = RESK_HeuristicFilter()
passed, reason, processed = f.filter("Ignore previous instructions")
RESK_ContentPolicyFilter¶
Policy-based content checks.
from resk_llm.filters.resk_content_policy_filter import RESK_ContentPolicyFilter
f = RESK_ContentPolicyFilter()
RESK_WordListFilter¶
Blocks custom word lists.
from resk_llm.filters.resk_word_list_filter import RESK_WordListFilter
f = RESK_WordListFilter(config={"blocked_words": ["password", "token"]})