16 changes: 16 additions & 0 deletions data/prompts/code/code_prompt.yaml
@@ -0,0 +1,16 @@
prompt: |
Below is an extract from a web page. Evaluate the instructional value of the extract for reasoning about programming / code using the additive 5‑point scoring system described below. Points are accumulated based on the depth and quality of reasoning support:
Collaborator

This line contains a non-ASCII hyphen (in "5‑point"). To avoid unexpected issues, I would replace it with the regular ASCII `-`.


- Add 1 point if the extract contains source code relevant to programming topics but no substantial reasoning—e.g., raw or messy code with minimal/no comments, promotional snippet dumps, or copy‑pasted examples without context.
Collaborator

Another minor suggestion: we currently use em dashes to elaborate in some places, and lists with `-` in others. I don't think this is critical, but for consistency I would suggest we pick one standard style and always use it.

- Add another point if the code is cleanly structured and/or uses meaningful names or inline comments that reveal some abstraction or intent, even if no comprehensive narrative explanation is present.
- Award a third point if the extract provides a natural‑language explanation of what the code does—such as “explain in plain English” sections, problem–solution framing, or high‑level summaries clarifying purpose and behavior.
- Grant a fourth point if the extract includes a step‑by‑step/line‑by‑line trace or analysis that walks through variable states, control flow, or algorithmic steps, helping readers mentally execute the code.
- Bestow a fifth point if the extract delivers meta‑level or developmental reasoning—discussing design decisions, refactor motivations, trade‑offs, performance considerations, or describing the goals/strategies guiding the implementation process.

The extract:
{placeholder}
After examining the extract:
- Briefly justify your score, up to 100 words.
- Conclude with the score using the format: 'Code score: <total points>'

prompt_name: code_content_filter
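
For context, a prompt like this would typically be filled in and its reply parsed along the following lines. This is a minimal sketch, not code from this PR: `fill_prompt` and `parse_score` are hypothetical helper names, and it assumes that `{placeholder}` is substituted literally and that the model ends its reply with the `Code score: <total points>` line requested above.

```python
import re
from typing import Optional


def fill_prompt(template: str, extract: str) -> str:
    """Substitute the extract for the literal `{placeholder}` token.

    str.replace is used instead of str.format so that any other braces in
    the template or in the extract are left untouched.
    """
    return template.replace("{placeholder}", extract)


def parse_score(reply: str, label: str = "Code score") -> Optional[int]:
    """Return the last `<label>: <n>` value (0-5) found in the reply, or None."""
    matches = re.findall(rf"{re.escape(label)}:\s*([0-5])\b", reply)
    return int(matches[-1]) if matches else None
```

The same `parse_score` could be reused for the other prompts by passing a different `label`, for example `"Reasoning score"` or `"Toxicity score"`.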
16 changes: 16 additions & 0 deletions data/prompts/reasoning/reasoning_prompt.yaml
@@ -0,0 +1,16 @@
prompt: |
Below is an extract from a document. Evaluate the quality of its deliberative reasoning about a normative conflict using the additive 5‑point scoring system described below. Points are accumulated based on the depth and coherence of reflective‑equilibrium reasoning:

- Add 1 point if the extract explicitly identifies at least two norms, rules, or values in tension and describes the situation that creates the conflict.
- Add another point if it articulates the competing principles clearly and acknowledges the perspectives or interests of affected stakeholders.
- Award a third point if it supplies reason‑giving or empirical/contextual evidence that supports or challenges each principle, going beyond mere assertion.
- Grant a fourth point if it shows mutual adjustment toward coherence—weighing principles against each other, revising or limiting them, and addressing at least one objection.
- Bestow a fifth point if the extract reaches a reflective equilibrium: a balanced, publicly defensible resolution integrating principles, evidence, and stakeholder views.

The extract:
{placeholder}
After examining the extract:
- Briefly justify your score, up to 100 words.
- Conclude with the score using the format: 'Reasoning score: <total points>'

prompt_name: reasoning_content_filter
Comment on lines +1 to +16
Collaborator

This file contains a non-ASCII hyphen (in "5‑point"). To avoid unexpected issues, I would replace it with the regular ASCII `-`.
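
A check like the one suggested here could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and the set of characters treated as hyphen look-alikes (U+2010 and U+2011) is an illustrative choice. Any other non-ASCII character is only reported, not rewritten.

```python
from typing import List, Tuple

# Hyphen look-alikes to rewrite as ASCII "-" (an assumed, non-exhaustive set).
HYPHEN_LOOKALIKES = {"\u2010": "-", "\u2011": "-"}


def normalize_hyphens(text: str) -> str:
    """Replace Unicode hyphen look-alikes with the ASCII hyphen-minus."""
    return text.translate(str.maketrans(HYPHEN_LOOKALIKES))


def find_non_ascii(text: str) -> List[Tuple[int, int, str]]:
    """List (line, column, codepoint) for every non-ASCII character, 1-indexed."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits
```

Running `find_non_ascii` over each prompt file before and after `normalize_hyphens` would show which characters remain and need a manual decision.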

28 changes: 0 additions & 28 deletions data/prompts/safety/harmful_toxic_racist_sexist_adult.yaml

This file was deleted.

22 changes: 22 additions & 0 deletions data/prompts/toxic/toxic_prompt.yaml
@@ -0,0 +1,22 @@
prompt:
Collaborator

I think we should have a pipe symbol here, like `prompt: |` (as is the case with the other prompts); otherwise the parser will complain about some of the content inside `The extract:`.
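
One way to catch this before merging is a small lint over the prompt files. The sketch below is a text-level heuristic using only the standard library, not a YAML parser. `uses_block_scalar` is a hypothetical helper, and it only checks that a top-level `prompt:` key is followed by a block scalar indicator (`|` or `>`, optionally with a chomping or indentation indicator).

```python
import re

# Matches e.g. "prompt: |", "prompt: |-", "prompt: >+", "prompt: |2".
_BLOCK_SCALAR = re.compile(r"^prompt:\s*[|>][+-]?\d?[+-]?\s*(#.*)?$")


def uses_block_scalar(yaml_text: str) -> bool:
    """True if the first top-level `prompt:` key opens a block scalar."""
    for line in yaml_text.splitlines():
        # "prompt_name:" does not match, since it does not start with "prompt:".
        if line.startswith("prompt:"):
            return bool(_BLOCK_SCALAR.match(line))
    return False  # no top-level `prompt:` key found
```

With this check, `code_prompt.yaml` and `reasoning_prompt.yaml` as shown in this diff would pass, while the bare `prompt:` line in `toxic_prompt.yaml` would be flagged.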

Below is an extract from a web page. Evaluate the toxicity level of the content using the additive 5-point scoring system described below.
Points are accumulated based on the severity of harmful content:

- Add 1 point if the text contains mild profanity or rudeness that doesn't target individuals or groups. This includes casual swearing in non-aggressive contexts ("This is fucking brilliant!"), mild frustration ("damn it"), or slightly rude language without malicious intent.
- Add another point if the text includes moderate personal attacks, hostile language, or aggressive behavior directed at individuals. Examples include calling someone stupid, telling someone to "shut up", or other insults that attack a person's character or intelligence.
- Award a third point if the text contains identity-based attacks, discrimination, or harassment targeting protected groups (race, religion, gender, sexuality, nationality, etc.). This includes stereotypes ("All X are Y"), slurs, or content promoting discrimination. Also includes persistent harassment or bullying.
- Grant a fourth point if the text includes severe harmful content such as death wishes, suicide encouragement, or graphic descriptions of violence. Examples include "Kill yourself", "I hope you die", or content that could cause serious psychological harm.
- Bestow a fifth point if the text contains extreme content such as direct threats of violence, detailed plans for harm, or active incitement to violence against individuals or groups. This represents immediate danger requiring intervention.

Special considerations:
• Quoted or reported speech should reduce the score by 1 point
Collaborator

This one is minor, but I think we consistently use `-` for list-like instructions.

• Profanity alone without harmful intent should not score above 1
• Context matters - academic or news discussions of harmful topics score lower

The extract: {placeholder}

After examining the extract:
- Briefly justify your score, up to 100 words.
- Conclude with the score using the format: 'Toxicity score: <total points>'

prompt_name: toxicity_content_filter