
🚌 Want to ride the Vector DB bus? Milvus!

(First, my apologies for the dad-joke title. 🙇🏻‍♂️)

A vector DB is a database that stores, indexes, and queries vector embeddings rather than scalar data. A value that an embedding model has vectorized to a specific dimension is stored together with its related metadata as a single entity. To trade off speed against accuracy, the database builds an index and then answers queries with Approximate Nearest Neighbor (ANN) search.

We recently started using a vector DB called Milvus while working on a RAG-related project. We chose Milvus for the following reasons:

* Milvus ranks near the top for performance in the ANN benchmark (https://ann-benchmarks.com/index.html).
* The community is active.
* The SDKs are well made (Python, Go, Java, etc.).
* It is well integrated into the langchain-community package. (langchain is a framework that supports building LLM-based applications.)
* Scale out, scale up, and failover are possible in a k8s environment.
* It supports a variety of indexes, including on-disk indexes. (It was the first to support DiskANN.)
* It is continuously released and maintained.
* It is open source. It is free.
* The admin tool attu is officially provided as open source, which makes Milvus easy to manage.

So far in the project it has not caused any particular trouble(?) and has delivered solid performance, so we are very satisfied. If you are weighing vector DB options, Milvus is worth a look as well.

📚 Further reading

* Milvus Document: https://milvus.io/docs/v2.3.x
* Devocean Milvus Quick Start: https://devocean.sk.com/blog/techBoardDetail.do?ID=165368
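As a rough illustration of the "build an index, then run ANN search" idea from the opening paragraph, here is a minimal pure-Python sketch. It is not the Milvus API and not how Milvus is implemented. It is a toy inverted-file (IVF) style index, one common family of ANN indexes, and every name in it (`build_ivf_index`, `ann_search`, `nprobe`, the `meta` field, the random data) is a hypothetical choice made for this example.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_search(entities, query, k):
    """Brute force: compare the query against every stored vector."""
    return sorted(entities, key=lambda e: l2(e["vector"], query))[:k]


def assign(entities, centroids):
    """Put each entity into the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for e in entities:
        nearest = min(range(len(centroids)),
                      key=lambda i: l2(centroids[i], e["vector"]))
        buckets[nearest].append(e)
    return buckets


def build_ivf_index(entities, n_lists, iters=5, seed=0):
    """Toy IVF index: a few k-means rounds, then one bucket per centroid."""
    rng = random.Random(seed)
    centroids = [list(e["vector"]) for e in rng.sample(entities, n_lists)]
    for _ in range(iters):
        for i, bucket in enumerate(assign(entities, centroids)):
            if bucket:  # keep the old centroid if its bucket is empty
                dim = len(bucket[0]["vector"])
                centroids[i] = [sum(e["vector"][d] for e in bucket) / len(bucket)
                                for d in range(dim)]
    return centroids, assign(entities, centroids)


def ann_search(index, query, k, nprobe):
    """Approximate search: scan only the nprobe buckets closest to the query."""
    centroids, buckets = index
    order = sorted(range(len(centroids)), key=lambda i: l2(centroids[i], query))
    candidates = [e for i in order[:nprobe] for e in buckets[i]]
    return exact_search(candidates, query, k)


# Each entity is a vector plus its metadata, stored as one record.
rng = random.Random(42)
entities = [
    {"id": i, "vector": [rng.random() for _ in range(8)], "meta": {"source": f"doc-{i}"}}
    for i in range(2000)
]
index = build_ivf_index(entities, n_lists=16)
query = [rng.random() for _ in range(8)]

exact = exact_search(entities, query, k=5)
approx = ann_search(index, query, k=5, nprobe=4)  # scans 4 of 16 buckets
recall = len({e["id"] for e in exact} & {e["id"] for e in approx}) / len(exact)
print(f"recall@5 with nprobe=4: {recall:.2f}")
```

Raising `nprobe` scans more buckets, which costs more time but recovers more of the true nearest neighbors. Setting it to `n_lists` degenerates into the exact brute-force search, which is the speed versus accuracy trade-off in miniature.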
