
🔥 LLM "Evaluation" tutorial update complete! (14 files in total) 🔥

I'm so happy to finally share that the most-requested topic, evaluation, has been updated today. Evaluation may be a less familiar area than the other topics, so I covered it in a very approachable yet in-depth way. I put real effort into this, enough that I'd be proud to set it beside any Korean-language book or video on the subject.

⭐️ Main contents
- Synthetic dataset generation with RAGAS
- Evaluation with RAGAS (context precision, recall, faithfulness, relevancy, etc.)
- Uploading the evaluation dataset (HuggingFace)
- Creating a LangSmith dataset
- LLM-as-judge by case: question-answer, context-answer (cot_qa), criteria (misogyny, criminality, coherence, etc.), labeled_criteria, and scoring
- Answer evaluation based on embedding distance
- Custom LLM evaluation
- Heuristic evaluation (ROUGE, BLEU, METEOR, SemScore)
- Experiment comparison
- Summary evaluation
- Hallucination (Groundedness) evaluation
- Pairwise experiment evaluation
- Repeated evaluation
- Evaluation automation using online evaluation

✅ Tutorial link: https://wikidocs.net/259208
💻 GitHub source code: https://github.com/teddylee777/langchain-kr/tree/main/16-Evaluations

While working on this, I really got hooked on how fun "evaluation" is. Have a great week! Thank you.

#ragas #rag #evaluation #langsmith
