38 changes: 38 additions & 0 deletions README.md
@@ -1,2 +1,40 @@
# Home-Insurance-Study-Project-

A case study investigating hotspots of home burglaries, floods, and earthquakes to determine where in Spain home insurance premiums should be increased.

<p align="center">
  <img src="tiburon.gif" alt="Hungry shark">
</p>


## Team Members

| Name | LinkedIn Profile | Brief Description |
|------------------|------------------|-------------------|
| David Moreno | In progress | Economic Statistics & Data Analysis |
| Luis H. Rodríguez | [Link](https://www.linkedin.com/in/luis-h-rodr%C3%ADguez-fuentes/) | Data Analyst |
| Greta Galeana | [Link](https://www.linkedin.com/in/gretagaleana?) | Marketing Business & Data Analyst |

### Business Problem
As a leading insurance company in the Spanish market, we aim to leverage big data to improve our competitiveness and the accuracy of our policies, particularly those covering homes, vehicles, and stores. To achieve this, we have decided to integrate a comprehensive analysis of crime statistics provided by official sources, such as the Ministry of the Interior.

### Relevant links

- Portal estadístico de criminalidad (crime statistics portal of the Ministry of the Interior): [Link to source](https://estadisticasdecriminalidad.ses.mir.es/publico/portalestadistico/datos.html?type=pcaxis&path=/Datos1/&file=pcaxis)
- Streamlit dashboard: [Link to dashboard](https://crimesspain20102023.streamlit.app/)

## Methodology



## Tools & technology

- **Programming Languages**: SQL & Python
- **Libraries**: pandas, Folium, NumPy, Streamlit, Pillow, seaborn
- **Visualization**: Streamlit
- **Data Storage**: CSV, XLSX

## Problems we have faced

The first issue we found was in the original file. The years, instead of being stored as a separate value, shared the same column as the provinces, so each year was followed by an entire row of nulls. We solved this by applying pattern matching.
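A minimal sketch of this kind of fix with pandas, under the assumption that the year rows can be recognized by a four-digit pattern (the column names `provincia` and `robos` here are hypothetical, not the repo's real ones):

```python
import pandas as pd

# Hypothetical raw data: year "headers" are interleaved with province rows,
# and the year rows carry nulls in every data column.
raw = pd.DataFrame({
    "provincia": ["2019", "Madrid", "Barcelona", "2020", "Madrid", "Barcelona"],
    "robos": [None, 120, 95, None, 130, 101],
})

# Mark rows whose "provincia" value is actually a year (exactly four digits)
is_year = raw["provincia"].str.fullmatch(r"\d{4}")

# Lift the year into its own column, forward-fill it over the province rows,
# then drop the year rows themselves
raw["year"] = raw["provincia"].where(is_year).ffill()
clean = raw[~is_year].reset_index(drop=True)
```

After this, `clean` is a tidy table with one province per row and the year as an ordinary column.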
Binary file modified __pycache__/functions.cpython-312.pyc
37 changes: 37 additions & 0 deletions functions.py
@@ -10,3 +10,40 @@ def defi_years_per_block(data_frame, year_range):

    return data_frame_cp

def cleaning_columns_replace(data_frame):
    # Normalize column names: lowercase and replace spaces with underscores
    data_frame.columns = data_frame.columns.str.lower().str.replace(" ", "_")
    # The originally unnamed province column ends up as "_"; give it a proper name
    data_frame = data_frame.rename(columns={"_": "Province"})

    return data_frame


def traducir_columnas(columns_titles, translations):
    # Translate each column title via the mapping; keep the original if no translation exists
    return [translations.get(col, col) for col in columns_titles]


def cleaning_rows_dataframe(data_frame, values_to_remove=None):
    # If no values to remove are specified, use the defaults
    if values_to_remove is None:
        values_to_remove = ['Total Nacional', 'En el extranjero', 'Desconocida']

    # Keep only the rows that do not contain any of the values to remove
    cleaned_df = data_frame[~data_frame['columna_de_interes'].isin(values_to_remove)]

    return cleaned_df

def reset_index(data_frame):
    # Rebuild the index, then drop the first row (index 0)
    data_frame.reset_index(drop=True, inplace=True)
    data_frame = data_frame.drop(index=0)

    return data_frame


def convert_floats_to_ints(data_frame):
    # Iterate over the columns and convert float64 columns to int
    # (note: astype(int) raises ValueError if a column contains NaN values)
    for column in data_frame.columns:
        if data_frame[column].dtype == 'float64':  # check whether the column is float
            data_frame[column] = data_frame[column].astype(int)  # convert to int
    return data_frame
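A sketch of how two of these helpers might be chained on sample data (the helpers are restated inline so the snippet runs standalone; the sample values are hypothetical):

```python
import pandas as pd

# Inline copies of the repo helpers so this sketch is self-contained
def cleaning_rows_dataframe(data_frame, values_to_remove=None):
    if values_to_remove is None:
        values_to_remove = ['Total Nacional', 'En el extranjero', 'Desconocida']
    return data_frame[~data_frame['columna_de_interes'].isin(values_to_remove)]

def convert_floats_to_ints(data_frame):
    for column in data_frame.columns:
        if data_frame[column].dtype == 'float64':
            data_frame[column] = data_frame[column].astype(int)
    return data_frame

df = pd.DataFrame({
    "columna_de_interes": ["Total Nacional", "Madrid", "Sevilla"],
    "robos_2020": [300.0, 120.0, 80.0],
})

df = cleaning_rows_dataframe(df)   # drops the national aggregate row
df = convert_floats_to_ints(df)    # 120.0 becomes 120, etc.
```

The order matters: dropping the aggregate rows first avoids converting values that would be discarded anyway.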
