Within the execution for Yellow Taxi data for the year 2020 and month 12: what is the uncompressed file size?
wget https://github.com/DataTalksClub/nyc-tlc-data/releases/download/yellow/yellow_tripdata_2020-12.csv.gz
gunzip yellow_tripdata_2020-12.csv.gz
ls -lh yellow_tripdata_2020-12.csvAnswer: 128.3 MiB
What is the rendered value of the variable file when taxi=green, year=2020, month=04?
The Kestra template {{inputs.taxi}}_tripdata_{{inputs.year}}-{{inputs.month}}.csv renders the input values directly.
Answer: green_tripdata_2020-04.csv
How many rows are there for the Yellow Taxi data for all CSV files in the year 2020?
total=0
for month in 01 02 03 04 05 06 07 08 09 10 11 12; do
count=$(zcat "yellow_tripdata_2020-${month}.csv.gz" | wc -l)
count=$((count - 1)) # Subtract header
total=$((total + count))
done
echo "Total: ${total}"Monthly breakdown:
| Month | Rows |
|---|---|
| 01 | 6,405,008 |
| 02 | 6,299,354 |
| 03 | 3,007,292 |
| 04 | 237,993 |
| 05 | 348,371 |
| 06 | 549,760 |
| 07 | 800,412 |
| 08 | 1,007,284 |
| 09 | 1,341,012 |
| 10 | 1,681,131 |
| 11 | 1,508,985 |
| 12 | 1,461,897 |
Answer: 24,648,499
How many rows are there for the Green Taxi data for all CSV files in the year 2020?
Monthly breakdown:
| Month | Rows |
|---|---|
| 01 | 447,770 |
| 02 | 398,632 |
| 03 | 223,406 |
| 04 | 35,612 |
| 05 | 57,360 |
| 06 | 63,109 |
| 07 | 72,257 |
| 08 | 81,063 |
| 09 | 87,987 |
| 10 | 95,120 |
| 11 | 88,605 |
| 12 | 83,130 |
Answer: 1,734,051
How many rows are there for the Yellow Taxi data for the March 2021 CSV file?
wget https://github.com/DataTalksClub/nyc-tlc-data/releases/download/yellow/yellow_tripdata_2021-03.csv.gz
zcat yellow_tripdata_2021-03.csv.gz | wc -l
# Subtract 1 for headerAnswer: 1,925,152
How would you configure the timezone to New York in a Schedule trigger?
In Kestra, timezones must be specified using IANA timezone format (e.g., America/New_York), not abbreviations like EST or offsets like UTC-5.
Example configuration:
triggers:
- id: schedule
type: io.kestra.plugin.core.trigger.Schedule
cron: "0 9 * * *"
timezone: America/New_YorkAnswer: Add a timezone property set to America/New_York in the Schedule trigger configuration
| Question | Answer |
|---|---|
| Q1 | 128.3 MiB |
| Q2 | green_tripdata_2020-04.csv |
| Q3 | 24,648,499 |
| Q4 | 1,734,051 |
| Q5 | 1,925,152 |
| Q6 | timezone property set to America/New_York |