LLaMA: Open and Efficient Foundation Language Models|Hugo Touvron, Thibaut Lavril, Gautier Izacard et al.|arXiv (Cornell University)|2023 We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.
Affordance-Compiled Intelligence: Observable-Only Cognitive Impedance Matching for No-Meta LLM-Integrated Systems|Patrick Lewis, Ethan Perez, Aleksandara Piktus et al.|arXiv (Cornell University)|2025 Affordance-Compiled Intelligence develops Cognitive Impedance Matching Theory (CIMT), an observable-only and no-meta protected compiler theory for LLM-integrated systems. The paper studies how a fixed model-policy can exhibit different operational capability when the surrounding world is redesigned through observations, typed action handles, validators, repair paths, rollback modes, authority scopes, context summaries, and auditable receipts. CIMT treats system-level capability amplification as a world-side compilation problem rather than a model-weight improvement problem. It defines operational claims through explicit claim objects and evidence objects, using committed observable ledgers, target-evaluation channels, deterministic reducers, validity budget ledgers, evidence dependency graphs, artifact I/O manifests, conformance envelopes, and finite-sample or sequential certificates. Human reviewers, LLM judges, benchmarks, and external auditors are not treated as privileged evaluators; they are modeled as named, fallible measurement channels. The theory provides a conservative certification framework for paired target-channel improvement, vector debt accounting, forbidden-coordinate zero certificates, target-firewall discipline, scope simulation, dynamic widening, runtime and model-policy conformance, macro reliability, repair contraction, distribution-shift transfer, and receipt sufficiency. It also includes worked examples for code-editing agents and retrieval-augmented generation systems. The intended contribution is a practical formal foundation for making fixed-model LLM systems more reliable through observable world-side interface, authority, validation, repair, and audit design.
Llama 2: Open Foundation and Fine-Tuned Chat Models|Hugo Touvron, Louis Martin, Kevin H. Stone et al.|arXiv (Cornell University)|2023 In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs.
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov, Luke Zettlemoyer. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.
Multilingual Denoising Pre-training for Neural Machine Translation|Yinhan Liu, Jiatao Gu, Naman Goyal et al.|Transactions of the Association for Computational Linguistics|2020 This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks. We present mBART, a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora in many languages using the BART objective (Lewis et al., 2019). mBART is the first method for pre-training a complete sequence-to-sequence model by denoising full texts in multiple languages, whereas previous approaches have focused only on the encoder, decoder, or reconstructing parts of the text. Pre-training a complete model allows it to be directly fine-tuned for supervised (both sentence-level and document-level) and unsupervised machine translation, with no task-specific modifications. We demonstrate that adding mBART initialization produces performance gains in all but the highest-resource settings, including up to 12 BLEU points for low-resource MT and over 5 BLEU points for many document-level and unsupervised models. We also show that it enables transfer to language pairs with no bi-text or that were not in the pre-training corpus, and present extensive analysis of which factors contribute the most to effective pre-training.