🌈 Roadmap to becoming a data engineer
- This is the 2021 edition of the roadmap to becoming a data engineer
- It is a GitHub link, and the document currently has 6,800 stars :)
- The material is in English, but it is very useful as a source of keywords!

🎁 Why I recommend this post
- It lists the skills a data engineer should ideally have, organized as a tech stack
- Knowing these keywords is very useful!
- I also found it useful for checking which keywords I already know and which I don't!
- It is laid out as a chart, so it is easy to scan :)

👍 Who should read it
- People who want to become data engineers
- People who already work as data engineers and are thinking about their careers

📋 Contents
- CS fundamentals
  - Terminal usage
  - Data structures and algorithms
  - APIs
  - REST
  - Structured vs Unstructured data
  - Serialization
  - Linux
  - How does the computer work?
  - How does the internet work?
  - Git usage
  - Math and statistics basics
- Programming languages
  - Python
  - Java
  - Scala
  - Go
- Test
  - Unit tests
  - Integration tests
  - Functional tests
- Database fundamentals
  - SQL
  - Normalization
  - ACID Transaction
  - CAP theorem
  - OLTP vs OLAP
  - Horizontal vs vertical scaling
  - Dimensional modeling: a method used for data warehouse design
- Relational databases
  - MySQL
  - PostgreSQL
  - MariaDB
  - AWS Aurora
- Non-relational databases
  - Document databases
    - MongoDB
    - Elasticsearch
    - Apache CouchDB
    - Azure CosmosDB
  - Wide Column databases: each row stores keys and values, and different rows can hold a different number of columns, so each row can have its own schema
    - Apache Cassandra
    - Apache HBase
    - Google Cloud BigTable
  - Graph databases
    - Neo4J
    - Amazon Neptune
  - Key Value Store
    - Redis
    - Memcached
    - Amazon DynamoDB
- Data warehouses
  - Snowflake
  - Presto
  - Apache Hive
  - Apache Impala
  - Amazon Redshift
  - Google BigQuery
  - Azure Synapse
  - ClickHouse
- Object storage
  - AWS S3
  - Azure Blob Storage
  - Google Cloud Storage
- Cluster computing fundamentals
  - Apache Hadoop
  - HDFS
  - MapReduce
  - Lambda & Kappa architecture
  - Managed Hadoop
    - Amazon EMR
    - Google Dataproc
    - Azure Data Lake
- Data processing
  - Batch
    - Apache Pig
    - Apache Arrow
    - Data build tool (dbt)
  - Hybrid: handles both batch and streaming
    - Apache Spark
    - Apache Beam
    - Apache Flink
    - Apache NiFi
  - Streaming
    - Apache Kafka [personal recommendation]
    - Apache Storm [general recommendation]
    - Apache Samza
    - Amazon Kinesis
- Messaging
  - RabbitMQ [general recommendation]
  - Apache ActiveMQ
  - Amazon SNS & SQS
  - Google PubSub
  - Azure Service Bus
- Workflow scheduling
  - Apache Airflow
  - Google Composer
  - Apache Oozie
  - Luigi
- Monitoring data pipelines
  - Prometheus
  - Datadog
  - Sentry
  - StatsD
- Networking
  - Protocols
    - HTTP / HTTPS
    - TCP
    - SSH
    - IP
    - DNS
  - Firewalls
  - VPN
  - VPC
- Infrastructure as Code
  - Containers
    - Docker
    - LXC
  - Container orchestration
    - Kubernetes
    - Docker Swarm
    - Apache Mesos
    - Google Kubernetes Engine (GKE)
  - Infrastructure provisioning
    - Terraform
    - Pulumi
    - AWS CDK
- CI/CD
  - GitHub Actions
  - Jenkins
- Identity and access management
  - Active Directory
  - Azure Active Directory
- Data security & privacy
  - Legal compliance
  - Encryption
  - Key management
  - Data governance & integrity
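
The note on wide-column databases in the list above can be made concrete with a few lines of Python. This is only a toy sketch, assuming plain dictionaries in place of a real store such as Apache Cassandra or Apache HBase; the names `table`, `columns_of`, and `read` are made up for this illustration and are not part of any library.

```python
# Hypothetical wide-column "table": row key -> {column name: value}.
# Plain dicts stand in for a real wide-column store (an assumption for illustration).
table = {
    "user:1": {"name": "Kim", "email": "kim@example.com"},
    "user:2": {"name": "Lee", "last_login": "2021-02-27", "plan": "pro"},
}


def columns_of(row_key):
    """Return the sorted column names stored for one row."""
    return sorted(table[row_key])


def read(row_key, column, default=None):
    """Read one cell; a column that a row does not store yields the default."""
    return table[row_key].get(column, default)


if __name__ == "__main__":
    # The two rows carry different sets (and numbers) of columns.
    print(columns_of("user:1"))
    print(columns_of("user:2"))
    # "plan" was never written for user:1, so the default comes back.
    print(read("user:1", "plan"))
```

The point is only that no table-wide column list is enforced: each row carries whatever columns were written to it, which is the property the list describes.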

datastacktv/data-engineer-roadmap

GitHub

February 27, 2021, 1:38 PM
