Text Similarity Calculator

A Python implementation of cosine similarity to measure how similar two texts are to each other.

What is Cosine Similarity?

Cosine similarity is a metric used to determine how similar two documents are regardless of their length. Mathematically, it measures the cosine of the angle between two vectors in a multi-dimensional space, so it depends on the direction of the vectors and not on how long they are.

The mathematical formula is:

Cosine Similarity = (A·B) / (||A|| × ||B||)

Where:

A·B is the dot product of vectors A and B
||A|| and ||B|| are the magnitudes (or norms) of vectors A and B
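
As a minimal sketch of the formula above, the function below computes the cosine similarity of two plain numeric vectors. The name `cosine_similarity` and the choice to return 0.0 for an all-zero vector are assumptions made for illustration and are not necessarily how cosine.py handles these cases.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    # A·B: sum of the element-wise products
    dot = sum(x * y for x, y in zip(a, b))
    # ||A|| and ||B||: square root of the sum of squared components
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: an all-zero vector is treated as sharing nothing
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity([1, 2, 0], [2, 4, 0]))  # parallel vectors: very close to 1
print(cosine_similarity([1, 0], [0, 1]))        # orthogonal vectors: 0.0
```

Scaling either vector leaves the result unchanged, which is why the metric ignores document length.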

How This Program Works

Input: The program takes two text strings as input.
Text Processing:
- Converts the texts to lowercase
- Splits them into individual words
- Counts how often each word occurs in each text
Vector Creation:
- Each unique word becomes a dimension in our vector space
- The frequency of each word becomes the vector's component along that dimension (0 if the word does not occur in that text)
Similarity Calculation:
- Calculates the dot product of the two word frequency vectors
- Calculates the magnitudes of both vectors
- Computes the cosine similarity using the formula above
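Output: Returns a similarity score between 0 and 1 (word frequencies are never negative, so the score cannot go below 0)
- 1 means the texts use the same words in the same proportions (for example, identical texts, or the same words in a different order)
- 0 means they have no words in common
- Values in between indicate partial overlap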
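
The steps above could be sketched roughly as follows. This is an illustrative outline, not the code in cosine.py: the function names `text_to_vectors` and `text_similarity` are hypothetical, splitting is assumed to be on whitespace only (punctuation is not stripped), and an empty text is assumed to score 0.0.

```python
import math
from collections import Counter


def text_to_vectors(text1, text2):
    """Return the shared vocabulary and one frequency vector per text."""
    # Text processing: lowercase, split on whitespace, count each word
    counts1 = Counter(text1.lower().split())
    counts2 = Counter(text2.lower().split())
    # Vector creation: every unique word across both texts is one dimension
    vocabulary = sorted(set(counts1) | set(counts2))
    vec1 = [counts1[word] for word in vocabulary]  # Counter gives 0 if absent
    vec2 = [counts2[word] for word in vocabulary]
    return vocabulary, vec1, vec2


def text_similarity(text1, text2):
    """Cosine similarity of the word-frequency vectors of two texts."""
    _, vec1, vec2 = text_to_vectors(text1, text2)
    dot = sum(x * y for x, y in zip(vec1, vec2))
    norm1 = math.sqrt(sum(x * x for x in vec1))
    norm2 = math.sqrt(sum(y * y for y in vec2))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # assumption: an empty text shares nothing with any other
    return dot / (norm1 * norm2)


print(text_to_vectors("The cat sat", "the cat ran"))
print(text_similarity("The cat sat", "the cat ran"))  # about 0.667
```

Sorting the vocabulary is only there to make the dimension order reproducible; any fixed order gives the same score.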

Usage

Run the program and input two texts when prompted:

python cosine.py

Example:

Text 1: Julie loves me more than Linda loves me
Text 2: Jane likes me more than Julie loves me

The program will display:

The words and their frequencies in each text
The dot product calculation
The final cosine similarity score
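
For the two example texts, the three displayed quantities can be reproduced with the short sketch below. The variable names and output layout are assumptions for illustration; the actual format printed by cosine.py may differ.

```python
import math
from collections import Counter

text1 = "Julie loves me more than Linda loves me"
text2 = "Jane likes me more than Julie loves me"

# Word frequencies after lowercasing and splitting on whitespace
freq1 = Counter(text1.lower().split())
freq2 = Counter(text2.lower().split())

# Only words present in both texts contribute to the dot product:
# julie 1*1 + loves 2*1 + me 2*2 + more 1*1 + than 1*1 = 9
dot = sum(freq1[w] * freq2[w] for w in freq1.keys() & freq2.keys())

norm1 = math.sqrt(sum(c * c for c in freq1.values()))  # sqrt(12)
norm2 = math.sqrt(sum(c * c for c in freq2.values()))  # sqrt(10)
score = dot / (norm1 * norm2)

print("Text 1 frequencies:", dict(freq1))
print("Text 2 frequencies:", dict(freq2))
print("Dot product:", dot)                    # 9
print("Cosine similarity:", round(score, 3))  # 0.822
```

The score is high because the texts share five of their words ("julie", "loves", "me", "more", "than"), even though the sentences mean different things. Cosine similarity over word counts ignores word order.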
