Modalities · fromm-m · Nov 6, 2025 · Nov 11, 2025 · AbasKhan · Dec 17, 2025
diff --git a/data/prompts/code/code_prompt.yaml b/data/prompts/code/code_prompt.yaml
@@ -0,0 +1,16 @@
+prompt: |
+  Below is an extract from a web page. Evaluate the instructional value of the extract for reasoning about programming / code using the additive 5‑point scoring system described below. Points are accumulated based on the depth and quality of reasoning support:
+
+  - Add 1 point if the extract contains source code relevant to programming topics, even if it offers no substantial reasoning (e.g., raw or messy code with minimal/no comments, promotional snippet dumps, or copy‑pasted examples without context).
+  - Add another point if the code is cleanly structured and/or uses meaningful names or inline comments that reveal some abstraction or intent, even if no comprehensive narrative explanation is present.
+  - Award a third point if the extract provides a natural‑language explanation of what the code does—such as “explain in plain English” sections, problem–solution framing, or high‑level summaries clarifying purpose and behavior.
+  - Grant a fourth point if the extract includes a step‑by‑step/line‑by‑line trace or analysis that walks through variable states, control flow, or algorithmic steps, helping readers mentally execute the code.
+  - Bestow a fifth point if the extract delivers meta‑level or developmental reasoning—discussing design decisions, refactor motivations, trade‑offs, performance considerations, or describing the goals/strategies guiding the implementation process.
+
+  The extract:
+  {placeholder}
+  After examining the extract:
+  - Briefly justify your score, up to 100 words.
+  - Conclude with the score using the format: 'Code score: <total points>'
+
+prompt_name: code_content_filter
diff --git a/data/prompts/reasoning/reasoning_prompt.yaml b/data/prompts/reasoning/reasoning_prompt.yaml
@@ -0,0 +1,16 @@
+prompt: |
+  Below is an extract from a document. Evaluate the quality of its deliberative reasoning about a normative conflict using the additive 5‑point scoring system described below. Points are accumulated based on the depth and coherence of reflective‑equilibrium reasoning:
+
+  - Add 1 point if the extract explicitly identifies at least two norms, rules, or values in tension and describes the situation that creates the conflict.
+  - Add another point if it articulates the competing principles clearly and acknowledges the perspectives or interests of affected stakeholders.
+  - Award a third point if it supplies reason‑giving or empirical/contextual evidence that supports or challenges each principle, going beyond mere assertion.
+  - Grant a fourth point if it shows mutual adjustment toward coherence—weighing principles against each other, revising or limiting them, and addressing at least one objection.
+  - Bestow a fifth point if the extract reaches a reflective equilibrium: a balanced, publicly defensible resolution integrating principles, evidence, and stakeholder views.
+
+  The extract:
+  {placeholder}
+  After examining the extract:
+  - Briefly justify your score, up to 100 words.
+  - Conclude with the score using the format: 'Reasoning score: <total points>'
+
+prompt_name: reasoning_content_filter
diff --git a/data/prompts/safety/harmful_toxic_racist_sexist_adult.yaml b/data/prompts/safety/harmful_toxic_racist_sexist_adult.yaml
diff --git a/data/prompts/toxic/toxic_prompt.yaml b/data/prompts/toxic/toxic_prompt.yaml
@@ -0,0 +1,22 @@
+prompt: |
+  Below is an extract from a web page. Evaluate the toxicity level of the content using the additive 5-point scoring system described below. 
+  Points are accumulated based on the severity of harmful content:
+
+  - Add 1 point if the text contains mild profanity or rudeness that doesn't target individuals or groups. This includes casual swearing in non-aggressive contexts ("This is fucking brilliant!"), mild frustration ("damn it"), or slightly rude language without malicious intent.
+  - Add another point if the text includes moderate personal attacks, hostile language, or aggressive behavior directed at individuals. Examples include calling someone stupid, telling someone to "shut up", or other insults that attack a person's character or intelligence.
+  - Award a third point if the text contains identity-based attacks, discrimination, or harassment targeting protected groups (race, religion, gender, sexuality, nationality, etc.). This includes stereotypes ("All X are Y"), slurs, or content promoting discrimination. Also includes persistent harassment or bullying.
+  - Grant a fourth point if the text includes severe harmful content such as death wishes, suicide encouragement, or graphic descriptions of violence. Examples include "Kill yourself", "I hope you die", or content that could cause serious psychological harm.
+  - Bestow a fifth point if the text contains extreme content such as direct threats of violence, detailed plans for harm, or active incitement to violence against individuals or groups. This represents immediate danger requiring intervention.
+
+  Special considerations:
+  - Quoted or reported speech should reduce the score by 1 point.
+  - Profanity alone without harmful intent should not score above 1.
+  - Context matters; academic or news discussions of harmful topics should score lower.
+
+  The extract: {placeholder}
+
+  After examining the extract:
+  - Briefly justify your score, up to 100 words.
+  - Conclude with the score using the format: 'Toxicity score: <total points>'
+
+prompt_name: toxicity_content_filter
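
Review note: all three prompts share the same contract, a `{placeholder}` slot for the extract and a final line of the form `'<Label> score: <total points>'`. The sketch below shows one way a consumer might fill the slot and read the score back. It is only an illustration: `fill_prompt` and `parse_score` are hypothetical names that do not appear in this PR, and the assumption that valid scores lie in 0 to 5 is inferred from the 5-point rubric.

```python
import re
from typing import Optional

# The slot name used by all three prompt templates in this PR.
PLACEHOLDER = "{placeholder}"


def fill_prompt(template: str, extract: str) -> str:
    """Insert the extract into the template (hypothetical helper).

    str.replace is used instead of str.format so that curly braces inside
    the extract (very common in source code) are left untouched.
    """
    return template.replace(PLACEHOLDER, extract)


def parse_score(response: str, label: str) -> Optional[int]:
    """Read '<label> score: N' from a model response (hypothetical helper).

    Returns None when no score line is found or the value is outside 0..5
    (range assumed from the additive 5-point rubric).
    """
    pattern = re.compile(rf"{re.escape(label)} score:\s*(\d+)", re.IGNORECASE)
    matches = pattern.findall(response)
    if not matches:
        return None
    # The prompts ask the model to conclude with the score, so take the last match.
    score = int(matches[-1])
    return score if 0 <= score <= 5 else None
```

Taking the last match rather than the first is a design choice: the justification text may mention intermediate scores before the concluding line.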