William Chan

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Chitwan Saharia, William Chan, Saurabh Saxena et al.|arXiv (Cornell University)|2022

Cited by 2.1k|Open Access

We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation. Our key discovery is that generic large language models (e.g. T5), pretrained on text-only corpora, are surprisingly effective at encoding text for image synthesis: increasing the size of the language model in Imagen boosts both sample fidelity and image-text alignment much more than increasing the size of the image diffusion model. Imagen achieves a new state-of-the-art FID score of 7.27 on the COCO dataset, without ever training on COCO, and human raters find Imagen samples to be on par with the COCO data itself in image-text alignment. To assess text-to-image models in greater depth, we introduce DrawBench, a comprehensive and challenging benchmark for text-to-image models. With DrawBench, we compare Imagen with recent methods including VQ-GAN+CLIP, Latent Diffusion Models, and DALL-E 2, and find that human raters prefer Imagen over other models in side-by-side comparisons, both in terms of sample quality and image-text alignment. See https://imagen.research.google/ for an overview of the results.

Imagen Video: High Definition Video Generation with Diffusion Models

Jonathan Ho, William Chan, Chitwan Saharia et al.|arXiv (Cornell University)|2022

Cited by 346|Open Access

We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models. Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models. We describe how we scale up the system as a high definition text-to-video model including design decisions such as the choice of fully-convolutional temporal and spatial super-resolution models at certain resolutions, and the choice of the v-parameterization of diffusion models. In addition, we confirm and transfer findings from previous work on diffusion-based image generation to the video generation setting. Finally, we apply progressive distillation to our video models with classifier-free guidance for fast, high quality sampling. We find Imagen Video not only capable of generating videos of high fidelity, but also having a high degree of controllability and world knowledge, including the ability to generate diverse videos and text animations in various artistic styles and with 3D object understanding. See https://imagen.research.google/video/ for samples.

Haemodynamic and Hormone Responses to Acute and Chronic Frusemide Therapy in Congestive Heart Failure

Hamid Ikram, William Chan, Eric A. Espiner et al.|Clinical Science|1980

Cited by 174

1. Since important interrelationships between haemodynamic and hormone indices are possible in cardiac failure, measurements of cardiac output, mean pulmonary artery pressure, plasma renin activity, angiotensin II and aldosterone were carried out before and during acute and chronic frusemide therapy in patients with oedematous heart failure who had been given digoxin. 2. Cardiac output fell significantly 90 min after acute frusemide injection, then returned to baseline. Mean pulmonary artery pressure declined steadily throughout the 4 h of observation. 3. These haemodynamic changes occurred in the absence of major hormonal fluctuations and related presumably to direct vascular and diuretic actions of frusemide. 4. With more chronic (8-10 days) oral frusemide therapy, reciprocal changes between haemodynamic and hormone indices were observed. As the diuretic response to frusemide diminished, cardiac output and pulmonary artery pressure declined whereas the renin-angiotensin system was activated. Statistically significant inverse correlations were observed between these haemodynamic and hormone indices. 5. In both acute and chronic phases of the study, fluctuations in aldosterone levels were regulated by the renin-angiotensin system whereas ACTH, plasma potassium and plasma sodium played, at best, supportive roles.

Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

Su Wang, Chitwan Saharia, Ceslee Montgomery et al.|Unknown|2023

Cited by 142

Text-guided image editing can have a transformative impact in supporting creative applications. A key challenge is to generate edits that are faithful to input text prompts, while consistent with input images. We present Imagen Editor, a cascaded diffusion model built by fine-tuning Imagen [36] on text-guided image inpainting. Imagen Editor's edits are faithful to the text prompts, which is accomplished by using object detectors to propose inpainting masks during training. In addition, Imagen Editor captures fine details in the input image by conditioning the cascaded pipeline on the original high resolution image. To improve qualitative and quantitative evaluation, we introduce EditBench, a systematic benchmark for text-guided image inpainting. EditBench evaluates inpainting edits on natural and generated images exploring objects, attributes, and scenes. Through extensive human evaluation on EditBench, we find that object-masking during training leads to across-the-board improvements in text-image alignment – such that Imagen Editor is preferred over DALL-E 2 [31] and Stable Diffusion [33] – and, as a cohort, these models are better at object-rendering than text-rendering, and handle material/color/size attributes better than count/shape attributes.

Acute Left Ventricular Remodeling Following Myocardial Infarction

William Chan, Stephen J. Duffy, David A. White et al.|JACC: Cardiovascular Imaging|2012

Cited by 122

Top publications by citations