
🙌 Improving the performance of Lyft's streaming pipelines

When you build a streaming pipeline, performance and scalability need careful thought. That means knowing where the bottlenecks occur and fixing them based on that evidence. Sharing an article that explains well how Lyft profiles its streaming pipelines to optimize performance, which performance problems commonly show up in streaming pipelines, and how to resolve them.

👉 Pipeline profiling
1. Memory & CPU profiler. A single small piece of code can become a big bottleneck for the whole pipeline, so a profiler helps you understand the full picture of the pipeline. Lyft uses Apache Beam and Python heavily, so they reportedly use Pyflame, which is now deprecated.
2. Flink Dashboard - You can also monitor each operator's CPU utilization, which shows which tasks could potentially cause problems.
3. Flink Metric System - Exposes throughput, latency (watermark), and JVM metrics (heap, etc.).

👉 Common performance problems
1. Data skewness (hot shard)
2. Large window size
3. Integration with other external services
4. Serialization & deserialization

👉 General guidelines
1. Avoid duplicate work
2. Avoid unnecessary shuffling
3. Enable Cython when using Python
4. Drop unnecessary data at the start of the pipeline
5. Use Protobuf
6. Avoid data skewness
7. Tune checkpoint size and frequency
8. Keep state size small
9. Network latency

https://eng.lyft.com/gotchas-of-streaming-pipelines-profiling-performance-improvements-301439f46412
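
To make the "data skewness (hot shard)" item concrete, here is a minimal sketch of key salting, one common way to spread a hot key over several shards. It is plain Python rather than Beam or Flink code, and it is not taken from the Lyft article. The names `salted_partial_counts`, `merge_partial_counts`, and `NUM_SALTS` are hypothetical, and the round-robin salt is an assumption chosen to keep the example deterministic.

```python
from collections import Counter

NUM_SALTS = 4  # hypothetical fan-out: how many shards one key may be spread over


def salted_partial_counts(events, num_salts=NUM_SALTS):
    """Stage 1: count per (key, salt) so a hot key is split across shards.

    The salt is assigned round-robin by event position (an assumption made
    for determinism; a random salt is also common).
    """
    partial = Counter()
    for index, key in enumerate(events):
        partial[(key, index % num_salts)] += 1
    return partial


def merge_partial_counts(partial):
    """Stage 2: drop the salt and combine the partial counts per original key."""
    totals = Counter()
    for (key, _salt), count in partial.items():
        totals[key] += count
    return totals


# One hot key dominates the stream; the other keys are rare.
events = ["hot"] * 1000 + ["a", "b", "c"] * 10

partial = salted_partial_counts(events)
totals = merge_partial_counts(partial)

# Without salting, one shard would own all 1000 "hot" events.
largest_shard = max(partial.values())
```

The final totals match a plain per-key count, but no single (key, salt) shard has to process the whole hot key. The trade-off is the extra merge stage, so this only pays off when the skew is real.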
