38 changes: 38 additions & 0 deletions README.md
@@ -1,2 +1,40 @@
# Home-Insurance-Study-Project-

A case study investigating hotspots of home burglaries, floods, and earthquakes to determine where in Spain home insurance premiums should be increased.

<p align="center">
  <img src="tiburon.gif" alt="Hungry shark">
</p>


## Team Members

| Name | LinkedIn Profile | Brief Description |
|------------------|------------------|-------------------|
| David Moreno | In progress | Economic Statistics & Data Analysis |
| Luis H. Rodríguez | [Link](https://www.linkedin.com/in/luis-h-rodr%C3%ADguez-fuentes/) | Data Analyst |
| Greta Galeana | [Link](https://www.linkedin.com/in/gretagaleana?) | Marketing Business & Data Analyst |

### Business Problem
As a leading insurance company in the Spanish market, we aim to leverage big data to improve our competitiveness and the accuracy of our policies, particularly those covering homes, vehicles, and stores. To achieve this, we have decided to integrate a comprehensive analysis of crime statistics provided by official sources, such as the Ministry of the Interior.

### Relevant links

- Portal estadístico de criminalidad (crime statistics portal of the Ministry of the Interior): [Link to source](https://estadisticasdecriminalidad.ses.mir.es/publico/portalestadistico/datos.html?type=pcaxis&path=/Datos1/&file=pcaxis)
- Streamlit dashboard: [Link to dashboard](https://crimesspain20102023.streamlit.app/)

## Methodology



## Tools & technology

- **Programming Languages**: SQL & Python
- **Libraries**: pandas, Folium, NumPy, Streamlit, Pillow, seaborn
- **Visualization**: Streamlit
- **Data Storage**: CSV, XLSX

## Problems we have faced

The first issue we found was in the original file. The years, instead of being stored as a separate value, shared the same column as the provinces, so each year was followed by an entire row of nulls. We solved this by applying pattern matching.
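A minimal sketch of this kind of fix with pandas, under the assumption that the year rows can be recognized by a four-digit pattern (the column names `provincia` and `robos` here are hypothetical, not the repo's real ones):

```python
import pandas as pd

# Hypothetical raw data: year "headers" are interleaved with province rows,
# and the year rows carry nulls in every data column.
raw = pd.DataFrame({
    "provincia": ["2019", "Madrid", "Barcelona", "2020", "Madrid", "Barcelona"],
    "robos": [None, 120, 95, None, 130, 101],
})

# Mark rows whose "provincia" value is actually a year (exactly four digits)
is_year = raw["provincia"].str.fullmatch(r"\d{4}")

# Lift the year into its own column, forward-fill it over the province rows,
# then drop the year rows themselves
raw["year"] = raw["provincia"].where(is_year).ffill()
clean = raw[~is_year].reset_index(drop=True)
```

After this, `clean` is a tidy table with one province per row and the year as an ordinary column.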
Binary file modified __pycache__/functions.cpython-312.pyc
37 changes: 37 additions & 0 deletions functions.py
@@ -10,3 +10,40 @@ def defi_years_per_block(data_frame, year_range):

    return data_frame_cp

def cleaning_columns_replace(data_frame):
    # Normalize column names: lowercase and replace spaces with underscores
    data_frame.columns = data_frame.columns.str.lower().str.replace(" ", "_")
    # The originally unnamed province column ends up as "_"; give it a proper name
    data_frame = data_frame.rename(columns={"_": "Province"})

    return data_frame


def traducir_columnas(columns_titles, translations):
    # Translate each column title via the mapping; keep the original if no translation exists
    return [translations.get(col, col) for col in columns_titles]


def cleaning_rows_dataframe(data_frame, values_to_remove=None):
    # If no values to remove are specified, use the defaults
    if values_to_remove is None:
        values_to_remove = ['Total Nacional', 'En el extranjero', 'Desconocida']

    # Keep only the rows that do not contain any of the values to remove
    cleaned_df = data_frame[~data_frame['columna_de_interes'].isin(values_to_remove)]

    return cleaned_df

def reset_index(data_frame):
    # Rebuild the index, then drop the first row (index 0)
    data_frame.reset_index(drop=True, inplace=True)
    data_frame = data_frame.drop(index=0)

    return data_frame


def convert_floats_to_ints(data_frame):
    # Iterate over the columns and convert float64 columns to int
    # (note: astype(int) raises ValueError if a column contains NaN values)
    for column in data_frame.columns:
        if data_frame[column].dtype == 'float64':  # check whether the column is float
            data_frame[column] = data_frame[column].astype(int)  # convert to int
    return data_frame
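A sketch of how two of these helpers might be chained on sample data (the helpers are restated inline so the snippet runs standalone; the sample values are hypothetical):

```python
import pandas as pd

# Inline copies of the repo helpers so this sketch is self-contained
def cleaning_rows_dataframe(data_frame, values_to_remove=None):
    if values_to_remove is None:
        values_to_remove = ['Total Nacional', 'En el extranjero', 'Desconocida']
    return data_frame[~data_frame['columna_de_interes'].isin(values_to_remove)]

def convert_floats_to_ints(data_frame):
    for column in data_frame.columns:
        if data_frame[column].dtype == 'float64':
            data_frame[column] = data_frame[column].astype(int)
    return data_frame

df = pd.DataFrame({
    "columna_de_interes": ["Total Nacional", "Madrid", "Sevilla"],
    "robos_2020": [300.0, 120.0, 80.0],
})

df = cleaning_rows_dataframe(df)   # drops the national aggregate row
df = convert_floats_to_ints(df)    # 120.0 becomes 120, etc.
```

The order matters: dropping the aggregate rows first avoids converting values that would be discarded anyway.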
