5tev3G/ctfa

Cloud Training Friction Analysis: A Data-Driven Post-Mortem

Executive Summary

This project replaces anecdotal learner feedback with structured error telemetry. Using a 'Sidecar Ledger' built in SQLite, I quantified where learners get blocked in technical training labs and found that 40% of all logged friction incidents originated in a single service domain (IAM).

Using methodologies refined during my experience at AWS, I treated the learning environment as a distributed system, logging error signatures to drive targeted engineering refactors.


1. Technical Implementation (Data Collection)

To move beyond subjective feedback, I built a local relational database to capture high-fidelity student error logs. This allowed for the normalization of "blocker" tickets into a queryable format.

  • Database: SQLite3
  • Schema Design: Implemented Foreign Key constraints and Lookup Tables to ensure standardized categorization across AWS Service Domains (e.g., S3, IAM, Lambda).
  • Normalization: Mapped disparate error strings (e.g., "Access Denied" vs. "403 Forbidden") to specific architectural failure modes.
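The schema and normalization ideas above can be sketched as follows. This is a minimal illustration, not the contents of /sql/schema.sql: every table name, column name, and the `authorization_failure` failure mode are hypothetical, and lower-casing the raw string before lookup is an assumed normalization rule.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from /sql/schema.sql.
SCHEMA = """
CREATE TABLE service_domain (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE failure_mode (
    id        INTEGER PRIMARY KEY,
    domain_id INTEGER NOT NULL REFERENCES service_domain(id),
    name      TEXT NOT NULL UNIQUE
);
-- Lookup table: one row per known raw error string (stored lower-cased).
CREATE TABLE error_signature (
    raw_text        TEXT PRIMARY KEY,
    failure_mode_id INTEGER NOT NULL REFERENCES failure_mode(id)
);
CREATE TABLE incident (
    id              INTEGER PRIMARY KEY,
    failure_mode_id INTEGER NOT NULL REFERENCES failure_mode(id),
    raw_error       TEXT NOT NULL
);
"""


def open_ledger(path=":memory:"):
    """Create the ledger tables and return the connection."""
    conn = sqlite3.connect(path)
    # SQLite does not enforce foreign keys unless this pragma is set per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def seed_example(conn):
    """Insert illustrative lookup rows (assumed values, not project data)."""
    conn.execute("INSERT INTO service_domain (id, name) VALUES (1, 'IAM')")
    conn.execute(
        "INSERT INTO failure_mode (id, domain_id, name) "
        "VALUES (1, 1, 'authorization_failure')"
    )
    conn.executemany(
        "INSERT INTO error_signature (raw_text, failure_mode_id) VALUES (?, ?)",
        [("access denied", 1), ("403 forbidden", 1)],
    )


def log_incident(conn, raw_error):
    """Map a raw error string to its failure mode and record the incident."""
    row = conn.execute(
        "SELECT failure_mode_id FROM error_signature WHERE raw_text = ?",
        (raw_error.strip().lower(),),
    ).fetchone()
    if row is None:
        raise LookupError(f"unmapped error signature: {raw_error!r}")
    conn.execute(
        "INSERT INTO incident (failure_mode_id, raw_error) VALUES (?, ?)",
        (row[0], raw_error),
    )
    return row[0]
```

With this layout, "Access Denied" and "403 Forbidden" both resolve to the same failure mode, and an incident that points at a nonexistent failure mode is rejected by the foreign key rather than silently stored.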

2. Quantitative Analysis (Identifying the Magnitude)

I used SQL aggregation to measure how friction was distributed across the lab architecture. The data revealed that while students often reported difficulty with code-heavy services (Boto3/Lambda), the primary bottleneck was actually upstream in the identity layer.

| Infrastructure Component | Total Logged Incidents | % of Total Friction |
| --- | --- | --- |
| Identity & Access (IAM) | 64 | 40.0% |
| Storage Services (S3) | 32 | 20.0% |
| Lambda / Boto3 | 42 | 26.3% |
| EC2 Compute | 11 | 6.9% |
| CloudWatch / Logs | 11 | 6.9% |
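A frequency breakdown of this kind can be produced with a single `GROUP BY` query. The sketch below is an assumption about the general shape of such a query, not a copy of /sql/bottleneck_analysis.sql: the single-table layout is simplified, and the sample rows are made up for illustration rather than drawn from the project's ledger.

```python
import sqlite3

# Hypothetical, simplified table; the real layout lives in /sql/schema.sql.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incident (id INTEGER PRIMARY KEY, domain TEXT NOT NULL)"
)

# Made-up sample rows for illustration only (not the project's data).
sample = ["IAM"] * 4 + ["Lambda"] * 3 + ["S3"] * 2 + ["EC2"]
conn.executemany(
    "INSERT INTO incident (domain) VALUES (?)", [(d,) for d in sample]
)

# Count incidents per domain and express each count as a share of all rows.
# Multiplying by 100.0 forces floating-point division in SQLite.
FRICTION_BY_DOMAIN = """
SELECT domain,
       COUNT(*) AS incidents,
       ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM incident), 1) AS pct_of_total
FROM incident
GROUP BY domain
ORDER BY incidents DESC, domain;
"""

rows = conn.execute(FRICTION_BY_DOMAIN).fetchall()
for domain, incidents, pct in rows:
    print(f"{domain:<8} {incidents:>3} {pct:>5.1f}%")
```

Computing the percentage in the query, from the same rows that produce the counts, keeps the two columns consistent with each other.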

3. Root Cause Discovery

Isolating the IAM domain exposed a specific "Critical Path" failure:

  • Primary Failure Mode: iam:PassRole and Principal policy misconfigurations.
  • The Magnitude: This single failure type accounted for 83% of all incidents in the IAM domain and caused a 30% decrease in student lab velocity.
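Drilling into one domain amounts to the same aggregation with a `WHERE` filter and a per-domain denominator. The function below is a hedged sketch: `failure_mode_share`, the `domain` and `failure_mode` columns, and the expected table layout are all hypothetical names chosen for illustration.

```python
import sqlite3


def failure_mode_share(conn, domain):
    """Return (failure_mode, incidents, pct_of_domain) rows for one domain.

    Assumes a hypothetical table incident(domain TEXT, failure_mode TEXT).
    """
    return conn.execute(
        """
        SELECT failure_mode,
               COUNT(*) AS incidents,
               ROUND(100.0 * COUNT(*) /
                     (SELECT COUNT(*) FROM incident WHERE domain = :d), 1)
                   AS pct_of_domain
        FROM incident
        WHERE domain = :d
        GROUP BY failure_mode
        ORDER BY incidents DESC, failure_mode
        """,
        {"d": domain},
    ).fetchall()
```

Because the denominator is restricted to the chosen domain, the percentages describe each failure mode's share of that domain's incidents, not of the whole ledger.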

4. Solutions Engineering (Remediation)

Based on the data, I implemented two targeted interventions to neutralize the identified friction points:

  1. Automated Pre-Flight Validation: Developed a bash-based validator script that checks for the existence of required IAM Roles and permissions via the AWS CLI before a student attempts deployment.
  2. Documentation Refactor: Reallocated 50% of technical writing cycles to the Identity Management module, introducing visual policy-logic flowcharts to clarify complex permission inheritance.

Projected Outcome: These refactors are projected to increase student throughput by 25% for future cohorts.
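The project's validator is a bash script; the sketch below shows the same pre-flight idea in Python for illustration. It relies only on the AWS CLI command `aws iam get-role --role-name <name>`, which exits non-zero when the role cannot be retrieved. The function names, the injectable `run` parameter, and the role name in the usage note are assumptions, and permission checks are omitted.

```python
import subprocess


def role_exists(role_name, run=subprocess.run):
    """Return True if `aws iam get-role` succeeds for role_name.

    `run` is injectable so the check can be exercised without AWS access.
    """
    result = run(
        ["aws", "iam", "get-role", "--role-name", role_name],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def preflight(required_roles, run=subprocess.run):
    """Return the required roles that could not be found, in input order."""
    return [name for name in required_roles if not role_exists(name, run=run)]


def main(argv, run=subprocess.run):
    """Print a short report and return a shell-style exit code."""
    missing = preflight(argv, run=run)
    for name in missing:
        print(f"MISSING IAM role: {name}")
    if not missing:
        print("Pre-flight check passed.")
    return 1 if missing else 0
```

A lab deployment step could call `main(["LabExecutionRole"])` (a hypothetical role name) and stop if it returns a non-zero code. The sketch assumes the AWS CLI is installed and configured with credentials allowed to call `iam:GetRole`.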

5. Repository Structure

  • /database/error_telemetry.db: SQLite database containing normalized, anonymized friction logs.
  • /sql/schema.sql: DDL scripts including table constraints and relational mapping.
  • /sql/bottleneck_analysis.sql: SQL queries used for frequency analysis and impact cross-tabulations.
