June 5, 2024
Guiding a Diffusion Model with a Bad Version of Itself
(Tero Karras, Miika Aittala, Tuomas Kynkäänniemi, Jaakko Lehtinen, Timo Aila, Samuli Laine)
The primary axes of interest in image-generating diffusion models are image quality, the amount of variation in the results, and how well the results align with a given condition, e.g., a class label or a text prompt. The popular classifier-free guidance approach uses an unconditional model to guide a conditional model, leading to simultaneously better prompt alignment and higher-quality images at the cost of reduced variation. These effects seem inherently entangled, and thus hard to control. We make the surprising observation that it is possible to obtain disentangled control over image quality without compromising the amount of variation by guiding generation using a smaller, less-trained version of the model itself rather than an unconditional model. This leads to significant improvements in ImageNet generation, setting record FIDs of 1.01 for 64x64 and 1.25 for 512x512, using publicly available networks. Furthermore, the method is also applicable to unconditional diffusion models, drastically improving their quality.
Karras and colleagues have done it again. Classifier-free guidance improves quality and prompt alignment at the cost of reduced variation. The result here is that if the guiding model is not an unconditional model but the same conditional model in a weaker form, i.e., a smaller, less-trained version of itself, you can improve quality while also suppressing the loss of variation.
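A minimal sketch of the idea, assuming two conditional denoisers: the main model and a smaller, less-trained copy of it. The function and variable names (guided_denoise, d_main, d_weak, w) are placeholders, not the paper's code; the update mirrors the usual CFG extrapolation but replaces the unconditional model with the weak conditional one.

```python
import torch

def guided_denoise(x_t, sigma, cond, d_main, d_weak, w=2.0):
    """One guided denoising step (autoguidance-style sketch).

    d_main: the full conditional denoiser
    d_weak: a smaller / less-trained version of the same conditional model
    w:      guidance weight; w = 1.0 recovers the main model's prediction
    """
    pred_main = d_main(x_t, sigma, cond)
    pred_weak = d_weak(x_t, sigma, cond)
    # Extrapolate away from the weak model's prediction, in the same way
    # CFG extrapolates away from the unconditional prediction.
    return pred_weak + w * (pred_main - pred_weak)

# Toy usage with stand-in denoisers, just to make the sketch executable.
x = torch.randn(1, 3, 64, 64)
d1 = lambda x, sigma, c: 0.9 * x   # stand-in for the large, well-trained model
d0 = lambda x, sigma, c: 0.7 * x   # stand-in for the small, less-trained model
out = guided_denoise(x, sigma=1.0, cond=None, d_main=d1, d_weak=d0, w=2.0)
```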
#diffusion
To Believe or Not to Believe Your LLM
(Yasin Abbasi Yadkori, Ilja Kuzborskij, András György, Csaba Szepesvári)
We explore uncertainty quantification in large language models (LLMs), with the goal to identify when uncertainty in responses given a query is large. We simultaneously consider both epistemic and aleatoric uncertainties, where the former comes from the lack of knowledge about the ground truth (such as about facts or the language), and the latter comes from irreducible randomness (such as multiple possible answers). In particular, we derive an information-theoretic metric that allows to reliably detect when only epistemic uncertainty is large, in which case the output of the model is unreliable. This condition can be computed based solely on the output of the model obtained simply by some special iterative prompting based on the previous responses. Such quantification, for instance, allows to detect hallucinations (cases when epistemic uncertainty is high) in both single- and multi-answer responses. This is in contrast to many standard uncertainty quantification strategies (such as thresholding the log-likelihood of a response) where hallucinations in the multi-answer case cannot be detected. We conduct a series of experiments which demonstrate the advantage of our formulation. Further, our investigations shed some light on how the probabilities assigned to a given output by an LLM can be amplified by iterative prompting, which might be of independent interest.
An approach to hallucination through the lens of epistemic uncertainty. Briefly summarized: give the model a question and generate an answer, append that answer to the prompt and generate again, and repeat. If the answers do not depend on the previous answers, the model is likely reflecting the ground truth; if the answers are strongly influenced by the previous answers, it is likely hallucinating.
The actual hallucination detection is done by measuring the mutual information of the answer distributions produced by the procedure above.
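A rough sketch of that procedure under simplifying assumptions: treat answers as categorical strings and use a plug-in mutual information estimate between the first answer and the answer produced after it is injected back into the prompt. This is a simplification of the paper's information-theoretic metric; the prompt wording, the number of samples, and sample_answer / toy_llm are placeholders.

```python
import math
import random
from collections import Counter

def estimate_mi(pairs):
    """Plug-in estimate of the mutual information I(A1; A2) from answer pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        mi += (c / n) * math.log(c * n / (first[a] * second[b]))
    return mi

def hallucination_score(sample_answer, query, n_samples=64):
    """Sample an answer, feed it back as a 'previous response', sample again.

    sample_answer(prompt) -> str stands in for one LLM call. If the second
    answer is strongly swayed by the injected first answer, the MI is large,
    which signals high epistemic uncertainty, i.e., a likely hallucination.
    """
    pairs = []
    for _ in range(n_samples):
        a1 = sample_answer(query)
        followup = (f"{query}\nOne possible answer given previously: {a1}\n"
                    f"Answer the question again.")
        a2 = sample_answer(followup)
        pairs.append((a1, a2))
    return estimate_mi(pairs)

# Toy stand-in for an LLM: it parrots any answer already in the prompt,
# i.e., it behaves like a model with high epistemic uncertainty.
def toy_llm(prompt):
    for candidate in ("Paris", "Lyon"):
        if candidate in prompt:
            return candidate
    return random.choice(["Paris", "Lyon"])

print(hallucination_score(toy_llm, "Which city is the capital of France?"))
```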
#hallucination
Process-Driven Autoformalization in Lean 4
(Jianqiao Lu, Zhengying Liu, Yingjia Wan, Yinya Huang, Haiming Wang, Zhicheng Yang, Jing Tang, Zhijiang Guo)
Autoformalization, the conversion of natural language mathematics into formal languages, offers significant potential for advancing mathematical reasoning. However, existing efforts are limited to formal languages with substantial online corpora and struggle to keep pace with rapidly evolving languages like Lean 4. To bridge this gap, we propose a new benchmark Formalization for Lean 4 (FORML4) designed to evaluate the autoformalization capabilities of large language models (LLMs). This benchmark encompasses a comprehensive assessment of questions, answers, formal statements, and proofs. Additionally, we introduce a Process-Supervised Verifier (PSV) model that leverages the precise feedback from Lean 4 compilers to enhance autoformalization. Our experiments demonstrate that the PSV method improves autoformalization, enabling higher accuracy using less filtered training data. Furthermore, when fine-tuned with data containing detailed process information, PSV can leverage the data more effectively, leading to more significant improvements in autoformalization for Lean 4. Our dataset and code are available at https://github.com/rookie-joe/PDA.
An autoformalization study targeting Lean. It also corresponds to the first stage of the recently published work on building a proof model with Lean (https://arxiv.org/abs/2405.14333).
The approach uses the Lean compiler's output to obtain outcome/process supervision. It makes me think that anyone planning to do LLM + math research going forward will probably need to learn Lean.
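To make the autoformalization target concrete, here is a toy example of my own, not taken from the FORML4 benchmark: an informal statement paired with a Lean 4 statement and proof. Whether code like this elaborates without errors in the Lean 4 compiler is the kind of feedback signal the process-supervised verifier relies on.

```lean
-- Informal statement: "For every natural number n, n + 0 = n."
-- A Lean 4 formalization that the compiler can check mechanically.
theorem my_add_zero (n : Nat) : n + 0 = n := rfl
```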
#math #rlaif #llm