Excited to attend #CVPR 2024 tomorrow (Summit 324, 14:20-15:00) and talk about the research frontier in multimodal #GenerativeAI for precision health: https://lnkd.in/gheD5Svn. The confluence of digital transformation and the GenAI revolution opens unprecedented opportunities for optimizing patient care and accelerating biomedical discovery, but challenges abound on the path forward. Generic frontier models such as GPT-4 are amazingly proficient at understanding biomedical text (e.g., MedPrompt https://lnkd.in/gmgWPiS3), but they exhibit major competency gaps in other modalities such as medical imaging and multi-omics. In this talk, I'll present some of our learnings in bridging such competency gaps:

BiomedCLIP: The study of any modality typically involves natural language, which means that modality-text pairs are often abundantly available. We explore using publicly available data for vision-language pretraining (15 million PMC image-caption pairs): https://lnkd.in/ghF55fyU. Excited to see many subsequent amazing works, e.g., Twitter (PLIP, James Zou), YouTube (QUILT, Linda Shapiro), PMC/textbook (CONCH, Faisal Mahmood).

LLaVA-Med: Standard contrastive learning treats all modalities equally. We instead explore using text as the interlingua modality and focus on learning an adapter that "translates" into the text semantic space, which is very data-efficient. To train a multimodal GenAI copilot, we leverage GPT-4 to synthesize instruction-following data from available image-text pairs. LLaVA-Med also features a modular, late-fusion design with plug-and-play modality-specific encoders/decoders, so it can easily scale to general use cases (e.g., combining X-ray, CT, MRI, digital pathology, multi-omics). As a proof of concept, LLaVA-Med 1.0 was trained on just a tiny fraction of the BiomedCLIP data: https://lnkd.in/deNH4XkR. We have since substantially improved the model and just released LLaVA-Med 1.5: https://lnkd.in/gndEMbC9.
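For readers curious what the contrastive pretraining behind CLIP-style models like BiomedCLIP looks like, here is a minimal numpy sketch of the symmetric InfoNCE objective over a batch of image-caption pairs. This is an illustrative toy, not the actual BiomedCLIP implementation; the embedding sizes and temperature are assumptions.

```python
import numpy as np

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of image-caption pairs.

    img_emb, txt_emb: (batch, dim) arrays; row i of each is a matched pair.
    Illustrative sketch of CLIP-style pretraining, not BiomedCLIP itself.
    """
    # L2-normalize so dot products are cosine similarities
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (batch, batch) similarity matrix

    # Matched pairs lie on the diagonal; each row (image->text) and each
    # column (text->image) is a softmax classification over the batch.
    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Training pushes matched image-caption embeddings together and mismatched ones apart, which is what makes abundant modality-text pairs such a useful supervision signal.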
We are also exploring specialization, such as LLaVA-Rad https://lnkd.in/g7cCZqse. Excited to see many subsequent amazing works, e.g., MAIRA (Javier Alvarez Valle), PRISM (SIQI LIU, Kristen Severson), PathChat (Faisal Mahmood).

Multimodal generation: The LLaVA-Med framework can incorporate modality-specific decoders for generating multimodal output, e.g., BiomedJourney https://lnkd.in/gFnvqFTw, BiomedParse https://lnkd.in/gzFPe5aH (Mu Wei will present in the same workshop, 10:40-11:20).

Whole-slide modeling: Digital pathology poses unique computational challenges due to the enormous size of whole-slide images. We propose the first whole-slide pathology foundation model, GigaPath, in our #Nature paper: https://lnkd.in/gHjmT7We. So much more remains to be done!
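To give a feel for why whole-slide modeling is computationally hard: a slide is far too large to encode in one pass, so pipelines in this space typically tile it into patches, encode each tile, and aggregate tile embeddings across the slide. The sketch below shows only the tiling step, with made-up sizes; it is not the GigaPath implementation.

```python
import numpy as np

def tile_slide(slide, tile=256):
    """Split a whole-slide image array into non-overlapping tiles.

    Real whole-slide images are on the order of 100,000 x 100,000 pixels,
    so they are tiled, each tile is encoded separately, and the tile
    embeddings are then aggregated slide-wide. Illustrative sketch only.
    """
    h, w = slide.shape[:2]
    tiles = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            tiles.append(slide[y:y + tile, x:x + tile])
    return np.stack(tiles)  # (num_tiles, tile, tile, channels)
```

Even a modest 100k x 100k slide yields over 150,000 tiles at this size, which is what makes slide-level aggregation a modeling problem in its own right.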
Hoifung and his team are leading the way in AI-supported technologies while being ever mindful of the complexities of real-world medicine. Ecosystem transformation will only occur when technology substitution is integrated into systems that work to improve outcomes.
Amazing results, congratulations!
Looking forward to this! Thanks for the heads up Hoifung Poon.
Amazing! Love to see this!
Great work!
We are fortunate to work with many amazing collaborators, such as Carlo Bifulco, Brian Piening, Sheng Wang, Muhao Chen, Jianfeng Gao, Tao Qin, Furu Wei, Mu Wei, Hany Awadalla, to name just a few. There are many other exciting works in health AI by amazing teams at Microsoft, e.g., Biomedical Imaging (Javier Alvarez Valle), Biomedical Signal Processing (Michael Hansen), Biomed ML (Nicolo Fusi). Please check out the whole workshop, Foundation Models for Medical Vision: https://fmv-cvpr24workshop.github.io/#about. There is an amazing array of great speakers: Shekoofeh Azizi, Sharon Xiaolei Huang, Mu Wei, Faisal Mahmood, David Ouyang, MD. Thanks Bo Wang and all for the invite and organizing! Looking forward to seeing many old friends and meeting new ones. If you want to check out the brand-new #Microsoft campus in Redmond, happy to be your tour guide if schedules align :)