Learning or Searching: Foundation Models and Information Retrieval in Digital Pathology
Presenter, Adam Shepard: So, hi everyone, and welcome to the 15th TIA Center Seminar of the academic year. My name is Adam Shepard, and I'm a postdoc here at the TIA Center. For the people online who are new to our seminars, we aim to invite researchers from across the globe to present new and exciting work. Before we get started, I'd just like to remind everyone that a few of us members from the TIA Center [unintelligible] 2024 conference in Manchester, titled Recent Advances in Computational Pathology. The full deadline is today, and it's a really great opportunity to submit and, hopefully, present some work. If you have any questions, feel free to ask. [unintelligible] would like to give a quick introduction.
Presenter 2: Hi, so I'm really honored to have Dr. Tizhoosh as our TIA Seminar speaker today. Dr. Tizhoosh has been at the forefront of developments in the area of computational pathology, and we're really fortunate that he accepted our invitation. So, thank you very much; I'm really looking forward to your talk. I think Adam will do the formal introduction, but I just want to say I'm really grateful that you accepted our invitation and took time out of your busy schedule to inform us about all the exciting things.
Just a little bit more. So, thank you very much for joining us. For everyone online and in person, I'd like to introduce Professor Hamid Tizhoosh. Professor Tizhoosh is a Professor of Biomedical Informatics in the Department of AI and Informatics at the Mayo Clinic.
From 2001 to 2021 he was a professor in the Faculty of Engineering at the University of Waterloo, where he founded the Kimia Lab. Before he joined the University of Waterloo, he was a Research Associate at the Knowledge and Intelligence Systems Laboratory at the University of Toronto, where he worked on AI methods such as reinforcement learning. Since 1993, his research activities have encompassed AI, computer vision and medical imaging. He has developed algorithms for medical image filtering, segmentation and search. He is the author of two books, 14 book chapters and more than 140 journal and conference papers.
The title of today's talk is Foundation Models in Histopathology. So, again, thank you very much for joining us, and please get started whenever you're ready.
H.R. Tizhoosh, Ph.D., Professor of Biomedical Informatics, Mayo Clinic: We can start?
Presenter: Yeah, please, thank you.
Dr. Tizhoosh: Thank you very much for the kind introduction, I appreciate that. I'm grateful for the opportunity. I look at this as just reporting back to the community. Hopefully, we can share some high-level findings and, maybe, some abstract philosophies about which direction we have to move in and what we have to do. And, hopefully, we get some questions and some feedback.
The title is more or less the same thing. The question is about Foundation models and Information Retrieval in Digital Pathology: what does it look like at the moment, and maybe where should we go? The point of departure for us is observer variability in medicine, which seems to be the source of almost all problems. Whatever we touch, triaging, diagnosis, treatment planning, seems to be subject to variability. You ask multiple physicians, multiple experts, and you get different responses. And, of course, for us in digital pathology, which is the diagnostic gold standard for many diseases, looking at whole-slide images and then coming up with a diagnosis or subtyping or grading is one of the most important aspects of variability.
And there are many, many reports on this. One, for example, shows six breast pathologists looking at 100 cases of invasive stage II carcinomas, and you get an interobserver variability of 0.5 to 0.8 and, more scary, an intraobserver variability of 0.76, which is always interesting. If you have a proper washout period between the first and second observation, why is even the intraobserver variability that high?
It's scary from a patient's perspective. And, of course, there are many such studies, and you can use Kappa statistics, or you can use something relatively new in medicine, Krippendorff's Alpha, to look at variability and measure it. Here is a little bit larger study: 149 consecutive DCIS cases, 39 pathologists, and you see the Krippendorff's Alpha. You can talk about consensus if you have at least 66%; that's a measure coming from Communication Theory that people have started using in medicine in recent years.
And you see all of them are below 66%. Of course, when we report this, it's not about the solution; people are just talking about the problem, how bad the problem is. And cytopathology is not an exception; cytopathology suffers from the same thing, in a different way, because in cytopathology we may be counting and measuring things more explicitly and quantitatively than in diagnostic histopathology.
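As a side note on the agreement statistics mentioned here, the sketch below shows how such numbers can be computed in Python: pairwise Cohen's kappa via scikit-learn, plus an optional Krippendorff's alpha call that assumes the third-party krippendorff package and its current API. The rating matrix is made-up toy data for illustration, not the studies cited in the talk.

```python
# Minimal sketch: measuring inter-observer agreement between pathologists.
# The ratings below are made-up toy data, not from the studies cited in the talk.
import itertools
import numpy as np
from sklearn.metrics import cohen_kappa_score

# rows = raters, columns = cases; values are diagnostic categories (0, 1, 2, ...)
ratings = np.array([
    [0, 1, 1, 2, 0, 1],   # pathologist A
    [0, 1, 2, 2, 0, 1],   # pathologist B
    [1, 1, 1, 2, 0, 0],   # pathologist C
])

# Pairwise Cohen's kappa for every pair of raters
for i, j in itertools.combinations(range(ratings.shape[0]), 2):
    kappa = cohen_kappa_score(ratings[i], ratings[j])
    print(f"kappa(rater {i}, rater {j}) = {kappa:.2f}")

# Krippendorff's alpha for the whole panel (assumes the third-party
# `krippendorff` package; its API may differ from this sketch).
try:
    import krippendorff
    alpha = krippendorff.alpha(reliability_data=ratings,
                               level_of_measurement="nominal")
    print(f"Krippendorff's alpha = {alpha:.2f}")
except ImportError:
    pass
```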
So the question is, can AI remove variability? I don't know what more important question we could ask. It doesn't matter what AI does, whether it is segmentation, identification, searching, providing embeddings, or detection. Whatever that is, if the ultimate goal is not removing variability, I personally would have a problem saying why we are doing this, if AI cannot help us eliminate observer variability, or at least reduce it drastically.
So, this is something that I show in many of my presentations. If I give you a piece of a whole-slide image, you may classify it to get rid of variability: you are saying malignant, yes or no; you may come up with a stage or grade; you may provide a probability or likelihood that the tissue sample belongs to a certain class. So, the question is, what is it that we are saying when we use classification?
We are saying that many physicians, or all physicians, have to accept what the machine says. Will that happen? Well, if the community at large trusts AI, then that may happen. With regular deep networks, that's not going to happen. Most likely, if you have one or multiple trustable Foundation models, that could happen eventually, as part of conversational AI. But if I go another route and, instead of classification, I search and find similar patients, and bring back the annotated information, reports, patient data, everything else, what is it that we are saying in that case?
In that case we are saying that one physician has to accept what many other physicians have said; that's evidence-based. Because, beyond imaging and molecular data, the other source of evidence that we have is the cases that we have already diagnosed and treated. And, to the extent we know they were free from variability and free from error, that's statistical evidence. So what we are expecting here is that the physician accepts what many other physicians have done and said.
That's more likely to happen. That's the fundamental difference between Retrieval and Foundation models. A Foundation model has to convince us through knowledge and knowledgeable, smart conversation, whereas Retrieval convinces us through retrieving evidence, which, of course, could be much easier if you can really find the relevant information.
So, look at a general comparison, because Foundation models basically come from the classification corner, versus search and retrieval. Classification, historically, not so much Foundation models now, has been based on supervision, whereas search was based on unsupervised learning. Now both of them are shifting to self-supervision, which is great, fantastic. And the strength of classification has been high accuracy: if you train something, naturally, you get high accuracy.
The initial papers that came out from the AI community in histopathology reported, I don't know, 98%, 99% accuracy. You can look at them and see these are easy cases; the tumor is obvious, even any young pathologist could do that. But it was the beginning, so what we were doing is acceptable. Unsupervised search, on the other hand, has generally lower accuracy but is agnostic to disease and operates on both small and large datasets. Classification is usually difficult to explain, cannot generalize easily, and needs a lot of labeled data. Again, we are talking about the classical picture, before we move to self-supervision and things like that.
And search needs expressive embeddings, which have to come from somewhere; search and retrieval historically did not deal with feature extraction. In the beginning years, we just dealt with raw data, no feature extraction. But there is also a lot of information to interpret. I put that under weaknesses, but it could be a strength, the same way as low accuracy: it could be a weakness because it's cautious, and it could be a strength because it's cautious, very conservative. So it's not easy to put classification and search in balance and compare them to each other. But if you make the transition to Foundation models...
We know that these are, basically, general-purpose deep models. They are not designed for a specific task, they are supposed to be adaptable to a large number of tasks, and they are trained on massive datasets, usually a colossal amount of unlabeled data. And here the trouble starts. Massive datasets: who has massive datasets? Well, not many hospitals, not many healthcare systems have massive datasets. And if they do, they are not accessible, for many reasons. Not all of it is digital; we have heterogeneous archives and repositories; they are not anonymized; they are not easily accessible. So it's a problem that is not there in the public domain, but in medicine we have to deal with it.
They are based on self-supervision, so it's not unsupervised and it's not supervised. We go towards self-supervision and we look for patterns and correlations to be able to generalize. And they can be fine-tuned for specific downstream tasks. However, if you have something that you call a foundation model for histopathology, this is what that means: your general domain is histopathology, and the expectations shift. It's not like bringing CLIP, which has been trained with cats and dogs and airplanes and bicycles, where zero-shot learning is not expected to do much. If you have a foundation model that is supposed to know histology and histopathology from the get-go, the expectations will be different.
A point before that. We moved from regular models to Foundation models, or extremely large models, and it seems we wanted to get rid of the overfitting problem. If you don't have much data you underfit, if you have the right size you fit well, and if your model is too large, you overfit. And then we said, you know what, give me all the data, I make the model big, and then we don't have that problem. Well, great, fantastic. But if you make the network extremely large, you get another set of new problems, which is hallucination.
So, you make things up, in an uncontrolled way; it cannot be controlled, following the trajectory of input-output relationships in a gigantic model. We basically just postponed the problem, and now we are patching; we are patching left and right to make it work such that it does not hallucinate. And you do not see everywhere the same enthusiasm that journals and publishers display for publishing results on foundation models. When you go to granting agencies, I don't know about Europe, but in the United States granting agencies, especially the NIH, are very skeptical about what foundation models can do and what dangers they may have, among others because of hallucination. So, okay. We know, for example, that for a WSI, if you have some sort of unsupervised clustering-based approach, you may end up with on average 80 patches per whole-slide-image representation. I will get to the point that you may say, okay, I will just go through all of them, but it's not feasible.
And technologies like CLIP use 400 million image-caption pairs. That means, actually, if you want to do something comparable, you need, roughly... Because you are not doing cats and giraffes and airplanes, we are just coming to one domain. So, 400 million divided by 80, roughly: you're talking at least 5 million whole-slide images. But not just whole-slide images; we need the reports, we need social determinants, we need the lab data, we need radiology, we need genomics, and so on.
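A back-of-envelope version of that arithmetic, written out as a tiny script (the 80-patches-per-WSI figure is the average mentioned above):

```python
# Back-of-envelope: how many WSIs would a CLIP-scale pathology dataset need?
clip_pairs = 400_000_000   # image-caption pairs used to train CLIP
patches_per_wsi = 80       # average patches per WSI from unsupervised clustering

wsis_needed = clip_pairs / patches_per_wsi
print(f"{wsis_needed:,.0f} whole-slide images")   # 5,000,000
```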
So that's a monumental data management project. And now you start to understand why people are experimenting with Twitter and PubMed and so on: because nobody has this. Including us, nobody has it. Well, you may have it, but if you cannot operate on it, you cannot really train something on it, and that means, practically, you don't have it. So it's a monumental task, which, even if you do it as a single hospital, will be of limited use. Even if you train it like us, with an anticipated six to seven million patients, the population diversity will still be low, so you need initiatives of multiple hospitals, probably multiple countries, to do that.
So, of course, the main thing about Foundation models is not a new topology; we are using the same workhorse, the transformer. It is just about the sheer size, and about what happens with the development and forming of linear subnetworks inside the network beyond that critical scale. There are some fantastic theoretical works pointing to that.
From our perspective, when we talk about Foundation models, it is all about: if I give you a patch, because we're not there yet, exactly, to use the entire whole-slide image. You may get a gigantic patch, maybe 7,000 by 7,000 pixels; 8,000 by 8,000 pixels is the largest that I have personally tried to put on a GPU. But, at the moment, we cannot put the entire whole-slide image through the GPU. So, patches go through the network and we get some embedding.
So, when is a foundation model a foundation model? Well, we expect two things from a model that claims to be a foundation model in histopathology, given that it has been trained for histopathology. Two things:
Zero-Shot Learning: You have to be able to classify never-seen data. If you cannot do that, and you say, look, you have to fine-tune me, no, that means you are a regular network; don't say that you are a foundation model.
And, the Quality of Embeddings: This is the toughest test, in my experience, for a network that claims to be a foundation model: check the quality of its embeddings. Get features, get embeddings, and use them for retrieval, because it's unsupervised. You do not touch anything, you do not fine-tune; just use them to see, did it capture the histological clues, the anatomic clues in the image, without any fine-tuning, without anything? If it is a foundation model that has seen histology and histopathology, it is expected to do that. Fine-tuning means I want to do a specific task, but here I am just interested in the quality of the embeddings.
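A minimal sketch of that embedding-quality test: take a frozen encoder as-is, index its embeddings with a nearest-neighbour search, and check whether the top retrieved patches share the query's held-out label. The encode function, the toy data and the labels below are placeholders, not any specific foundation model.

```python
# Sketch: judging a claimed foundation model by the quality of its embeddings,
# using them "as is" for retrieval, with no fine-tuning anywhere.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def encode(patches):
    """Placeholder for a frozen encoder (CNN/ViT) returning one vector per patch."""
    rng = np.random.default_rng(0)
    return rng.normal(size=(len(patches), 512))   # stand-in embeddings

archive_patches = [f"patch_{i}" for i in range(1000)]            # indexed archive
archive_labels = np.random.default_rng(1).integers(0, 5, size=1000)

emb = encode(archive_patches)
index = NearestNeighbors(n_neighbors=6, metric="cosine").fit(emb)

# Leave-one-out retrieval: does the majority of the top-5 neighbours
# (excluding the query itself) share the query's diagnostic label?
_, nbrs = index.kneighbors(emb)
hits = 0
for q, row in enumerate(nbrs):
    neighbours = [i for i in row if i != q][:5]
    votes = archive_labels[neighbours]
    if np.bincount(votes).argmax() == archive_labels[q]:
        hits += 1
print(f"majority-5 retrieval accuracy: {hits / len(nbrs):.2%}")
```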
Okay, so we want to do that for search, and I have been one of the colleagues who has advocated that search really is intelligence. This is not something new, because search goes back to the roots of AI, back in the 50s. Logic-based search was the beginning of AI: search for proofs of mathematical theorems. We had GPS, the General Problem Solver, which was one of the biggest claims of AI back in the 50s and 60s, an attempt to come up with something that could solve all problems. Doesn't that ring a bell?
Foundation models. They claim, okay, we can do it all. We are going back to GPS in a little bit more cautious way, but we are saying, basically, if you give me a gigantic amount of data, I can solve all problems. So we are now going back to the reason for the first AI winter and saying, now we have a solution for that.
A* Search Algorithm: the A* search algorithm in the 60s, finding the shortest path to an optimal solution in a graph.
Alpha-Beta Pruning: for game trees. And expert systems, which were probably the reason for the second AI winter. Now, sort of without talking about it, we are reviving them too: we are putting rules in place to prune and edit the responses of foundation models. If it develops in that direction, combined with retrieval, isn't that a new kind of expert system? Probably, it is.
So, the Renaissance of Information Retrieval is happening, and the biggest example of that is Retrieval-Augmented Generation (RAG). We have the LLMs; they are impressing everybody by telling jokes and writing poems and all that. Trained on massive amounts of data, they are really good at human-quality text, most of the time. But they have problems with factual accuracy and they have problems with staying up-to-date. And you cannot retrain and fine-tune a foundation model every two weeks; it costs a lot of effort.
So you need External Knowledge Sources, let's say Wikipedia and PubMed if you're working in the public domain, to take information and supplement the LLM's knowledge. People are doing that; this has been going on for some time. People realized immediately that the knowledgeable conversation can get out of hand, because of those unpredictable correlation trajectories inside the transformer. You can bound it and control it with retrieval, which is accessing evidence in your domain. Nobody can question that, because this is the evidence, this is the historic data.
So, you can prompt a RAG platform. That means you retrieve, you search for the knowledge, and you find relevant information. Then you combine that retrieved information and prompt it, feed it back to the LLM, to generate more reliable text. That means the LLM, the foundation model, can now base its response on more factual information, reduce the level of hallucination, and get rid of obsolete information by augmenting through retrieval. Fantastic news for people like me who want to stick with retrieval as well.
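A minimal sketch of the RAG loop described here, with a toy keyword retriever and a placeholder generate function standing in for whatever LLM is used; none of the documents or names below refer to a real system.

```python
# Sketch of Retrieval-Augmented Generation: retrieve evidence first,
# then ground the language model's answer in it.

KNOWLEDGE_BASE = {
    "doc1": "Lung adenocarcinoma often shows glandular differentiation ...",
    "doc2": "Squamous cell carcinoma of the lung shows keratinization ...",
}

def retrieve(query, k=2):
    """Toy retriever: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    scored = sorted(KNOWLEDGE_BASE.items(),
                    key=lambda kv: len(q & set(kv[1].lower().split())),
                    reverse=True)
    return [text for _, text in scored[:k]]

def generate(prompt):
    """Placeholder for an LLM call; returns the prompt so the sketch runs."""
    return f"[LLM answer grounded in]:\n{prompt}"

def rag_answer(question):
    evidence = retrieve(question)
    prompt = ("Answer using ONLY the sources below, and cite them.\n\n"
              + "\n".join(f"Source {i+1}: {e}" for i, e in enumerate(evidence))
              + f"\n\nQuestion: {question}")
    return generate(prompt)

print(rag_answer("What distinguishes lung squamous cell carcinoma?"))
```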
Don't give up on retrieval, because retrieval is a foundational technology in computer science. We don't want to get rid of that. Specifically in medicine, information retrieval is accessing the general wisdom, the medical wisdom, the evidence from the past. We cannot get rid of that.
So, the advantages of Retrieval-Augmented Generation (RAG): of course, increased accuracy, reliability and trust, and more transparency through source attribution, which deep networks cannot do. Ask ChatGPT, GPT-4, LLaMA 2, any of them: where did you see that, can you tell me a source? Well, unless you connect them to an information retrieval system, they cannot attribute a source to what they are saying. Which, for us in medicine, is a fundamental requirement to back things up, because somebody has to take the responsibility for the diagnostic reports that the pathologist is writing.
What about Generation-Augmented Retrieval (GAR)? Can we do that? Can we generate, and, if I get to it, I will show you a very simple example: you retrieve, and then you use generation to add value to the retrieval. Not much work has been done in that domain, but it's definitely coming. Many people will realize that.
So, how do we search? Well, the primary thing for us in digital pathology is whole-slide images. What are the requirements for creating indexed datasets in medicine, in general? You need high-quality clinical data; playing with PubMed and Twitter and even online repositories doesn't cut it. It has to be a diverse population, it has to be multimodal, though we are still mainly on whole-slide images and a little bit of text. And you need a fast search algorithm when we get there.
At the moment we don't really need a fast search algorithm, because nobody has the 10-million-patient database to search in a multimodal way. But, of course, we have to plan for it; we have to be prepared, we have to have the algorithm ready when the repositories open up. Low demand for indexing storage is something that every single paper I have read, including our own, has ignored. You cannot ask that, for indexing the archive, you additionally need 20% of the volume just for the index data (especially for whole-slide images). Who should pay for that? Hospitals cannot pay for it.
I understand that this is not a sexy topic for researchers who just want to focus on the theoretical side and say, my algorithm is fast and accurate, but is it lean in terms of storage or not? If it is not, that's a huge problem; it will not be adopted in practice. It has to be robust: many techniques that we tested failed many, many times, and that cannot happen in the clinical workflow. And they have to have a user-friendly interface; not much has been done in that regard either.
So, going back to the gigapixel whole-slide image: most of the time we just patch it, and then we take a selection of those patches; that is the Divide. Then you Conquer by putting the patches through some sort of network to get deep features. And then you have to Combine, by somehow aggregating or encoding those deep features, and you have to do this every time. But the Divide has to satisfy some conditions, because you cannot do random patch selection; it will not be reliable.
We cannot do subsetting instead of patch selection, because it's supervised and needs annotated data. You can do it for specific cases, but it will not be a general-purpose approach. And what most papers are doing is processing all patches. That means excessive memory requirements, and it will be slow. The only argument that I hear from researchers is, okay, buy some GPU subscriptions on the cloud. Well, really? Who should pay for it?
Maybe my employer can pay for it, maybe your employer can pay for it, but can a small clinic in a remote village in Congo pay for it? We are all talking about democratization and making AI accessible. I cannot take that statement seriously if you do not pay attention to the fact that processing whole-slide images is very computationally intensive. You cannot process all patches; you have to make a selection and process those, such that the memory requirements are lean.
So, the Divide of the whole-slide image has to be universal. It has to be unsupervised. It has to handle all tissue sizes and all shapes of specimen. It has to be diagnostically inclusive; it cannot miss any relevant part of the tissue. It has to be fast, yes, and many papers have been published just on that aspect, but it also has to show storage efficiency. That means you have to extract a minimum number of patches and encode them in a very efficient way, such that we can save them. The overhead cannot be much more than 1% of the whole-slide image, and that makes it really difficult.
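One possible shape of such an unsupervised Divide step, sketched under the assumption that clustering patch features and keeping a few representatives per cluster gives a pattern-diverse, storage-lean selection. The colour-histogram feature and the random toy patches are stand-ins, not the actual method used in Yottixel or at Mayo.

```python
# Sketch: unsupervised "Divide" - select a small, pattern-diverse subset of
# patches instead of processing all of them.
import numpy as np
from sklearn.cluster import KMeans

def patch_features(patches):
    """Crude stand-in feature: per-channel colour histogram of each RGB patch."""
    feats = []
    for p in patches:
        hist = [np.histogram(p[..., c], bins=8, range=(0, 255))[0] for c in range(3)]
        feats.append(np.concatenate(hist) / p[..., 0].size)
    return np.array(feats)

def select_mosaic(patches, n_clusters=10, per_cluster=2):
    """Cluster patch features and keep the patches closest to each centroid."""
    feats = patch_features(patches)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(feats)
    selected = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        d = np.linalg.norm(feats[members] - km.cluster_centers_[c], axis=1)
        selected.extend(members[np.argsort(d)[:per_cluster]])
    return sorted(selected)

# Toy "patches": random RGB tiles standing in for the tissue patches of one WSI.
rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(2000, 64, 64, 3), dtype=np.uint8)
mosaic = select_mosaic(patches)
print(f"kept {len(mosaic)} of {len(patches)} patches")   # e.g. 20 of 2000
```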
Then the rest is relatively easy. The whole-slide image comes in, you patch it, you send it to some network and you get the features, you add the metadata, and you can start matching and searching with some other details. When you send a query whole-slide image to an image search engine, you go in, you retrieve the similar ones, and you retrieve the associated metadata. Here LLMs and Foundation models can come and help. And the main thing is that the retrieved cases come, actually, from other pathologists who are not there.
So this is a sort of virtual peer review that can enable us to do, basically, a computational second opinion, consensus building, and that's extremely valuable in medicine. If we can build computational consensus and say, that's a second opinion based on historic data, and we can do it in an efficient, fast, reliable way, perhaps we can do something. So, we...
Presenter: Someone asked a question in the chat in regards to the division process. Is there a risk that, by dividing the WSI into patches, you lose some important information?
Dr. Tizhoosh: If we do patching, do we lose information?
Presenter: Yes, so the process of...
Dr. Tizhoosh: You should not, that's the challenge. You should do unsupervised patching: you get the relevant information and you should not lose information. But you should not do it with the 2,000 patches that the WSI has; do it with 50. And that's the challenge. That's why people don't do it and just do multiple instance learning and put everything in bags, which makes it much easier.
Presenter: Yeah, I think the person was just referring to potentially losing any context by taking specific patches.
Dr. Tizhoosh: If you don't want to do it in a supervised way, we don't have any other choice. And it is very difficult to do it unsupervised; that's the challenge. We have spent some time developing new methods, such that we can go in and make sure, from a pattern perspective, because if it is unsupervised you don't know what is what, and then you may also grab fat, you may grab normal epithelial tissue. That's the challenge: do it unsupervised, and don't miss anything that is diagnostically relevant.
And I agree, it's not easy, and we have not done large-scale validation to see if it is reliable, if it is doable or not. So, we looked at multiple search engines; some of them are new, some of them are not. And things that we see: some of them, for example SMILY by Google, didn't have a Divide. I asked the colleague who was presenting that work at Pathology Visions, and he said, okay, just subscribe to the cloud, and he said it with a smile. Well, I understand the business model, but that's not doable for the hospital.
Another point here, when we look at validating the search engines: some of them use post-processing or reranking, which is problematic. If your search engine is failing, you cannot just patch it by adding ranking after the fact. You may post-process for visualization, you may even post-process for a little bit more accuracy, but if your search has fundamentally failed and then you try to compensate with additional ranking, you are not doing search, you are doing classification.
So, we tested that with both internal and public datasets. The paper is under review and will hopefully come out soon; a copy is on arXiv. We used roughly 2,500 patients, 200 patches, some 38 subtypes, just to test, and we looked at multiple things. We looked at top-1 accuracy, majority-3 accuracy, majority-5 accuracy, and we looked at the [unintelligible] score, not precision. Again, some papers just use precision, because they were focused on classification.
We looked at the time for indexing, we looked at the time for search, we looked at how many times the approach failed, and we looked at the storage requirement. Then we ranked them based on that. We did not calculate any additional number; we just looked at how well they did on each of those, and these are the three major ones that we looked at.
Bag of Visual Words is still one of my favorites, but it has been ignored in the recent literature. It is not good for accuracy, but in speed and storage it can actually beat everybody else. That's interesting.
Yottixel, which I have been involved in, seems to be a really good one, but even that is not a good choice for moving forward. Although Yottixel and KimiaNet were the best overall performers, this is where KimiaNet, or any other network, DenseNet, EfficientNet, can be replaced with a good foundation model, if one is available. And we tested that with other things; we compared it with PLIP, among others, which completely failed compared to KimiaNet, a CNN, not a foundation model, a regular small network. All of the search engines that we tested have a low level of accuracy, so they cannot be used for clinical utility. That was a major finding.
And most of them do not look at speed and storage at the same time. They just focus on speed; speed seems to be more sexy from an academic perspective. And storage, who cares? Just buy some storage. Well, if that's the way, then who cares about the speed? Just buy some GPUs. We cannot just get rid of the requirement by saying that. We do not have automated whole-slide image selection. Just think about it. We went back and said, who gave us these 600 cases? Our pathologist. And then I went back and asked him, how did you select these?
Well, we did some search with text; it's basically quasi-random. So, what happens if you have 1 million whole-slide images? Do you want to index all of them, and is it necessary? If you have squamous cell carcinoma, maybe. If you have renal cell carcinoma, probably not, because it's so distinct; you don't have much redundancy, so why should I index all of them? We do not have any model, any algorithm, for automated whole-slide image selection. And everybody sets the magnification and patch size randomly, empirically: I do 1,000 by 1,000, we do 200 by 200, that one does 500 by 500, 20x, 40x; nobody knows what it should be. Probably it depends on primary site and primary diagnosis, but we have not dealt with it because we were busy with other stuff.
And, most importantly, no multimodal search. We do not have any multimodal search in histopathology. What happens if I have images, and I have text, and I have RNA sequencing, and I have a radiology image at the same time, and I want to search? That's the type of Information Retrieval that we need. So, if I look at retrieval and Foundation models, what is the difference, and what are the common points? Most of the time, retrieval works with small models and small datasets, and the computational footprint is small, whereas with Foundation models everything is gigantic. That's a major thing we have to keep in mind.
Information Retrieval convinces through evidence, whereas a Foundation model convinces through knowledgeable conversation. Definitely, these should be combined; there is no question about that, and RAG was a major step in that direction. Information Retrieval can deal with rare cases, even if you have two or three cases; the WHO has its blue books, where you may have one prototypical case, one whole-slide image. Foundation models, on the other hand, are usually perceived to be more for common diseases where you have many cases. You can tweak them so that they can also process complex cases, rare cases, but they are not meant for that.
But, most importantly, search is explicit Information Retrieval, whereas Foundation models are implicit Information Retrieval. We have to realize that we are looking at the same thing from two different perspectives, and if we realize that, then we can go back and design things in a more intelligent way.
Source attribution in Information Retrieval is visible, accessible and explainable, whereas in Foundation models it's not visible, not accessible and not easily explainable. Again, RAG is a good step in the right direction.
Maintenance. Of course, Information Retrieval is much better posed: low dependency on hardware updates, you can add and delete cases relatively easily, depending a bit on the indexing algorithm, and new models can replace old models relatively easily. With a Foundation model you cannot do that; it is heavily dependent on hardware updates, high effort for prompting and customization, and expensive re-training cycles may be necessary.
Foundation models may be for really big institutions, big companies and big corporations; Information Retrieval could be for the small guys, for the small clinics. But, again, there is no question that combining them, going back and forth, will be the way to exploit the possibilities.
So, again, on the left you see typical search and retrieval. Basically, you have a lookup table. Lung tissue comes in, there is some sort of function, and it gives you the address of the correct diagnosis; it says lung adenocarcinoma. That's how the search works; you may have an explicit hash function or not, but you have a function. On the right you have a network, and it does the same thing: lung tissue comes in, goes through a complex trajectory, and the network is the function. It gives you, again, lung adenocarcinoma. It does the same thing: implicit Information Retrieval. So we have to realize that and make sure we use these two things interchangeably, or best, in tandem.
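A toy illustration of that contrast between explicit and implicit retrieval, with a lookup table on one side and a small trained classifier on the other; both are hypothetical stand-ins for the systems on the slide.

```python
# Explicit vs. implicit information retrieval, in miniature.
import numpy as np
from sklearn.linear_model import LogisticRegression

# --- Explicit retrieval: a lookup table mapping a case key to a stored diagnosis.
lookup = {"case_0017": "lung adenocarcinoma",
          "case_0042": "lung squamous cell carcinoma"}
print(lookup["case_0017"])            # the evidence is visible and attributable

# --- Implicit retrieval: a network (here a tiny classifier) acts as the function.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                      # stand-in embeddings
y = (X[:, 0] > 0).astype(int)                       # 0 = adeno, 1 = squamous
clf = LogisticRegression().fit(X, y)
label = ["lung adenocarcinoma", "lung squamous cell carcinoma"][clf.predict(X[:1])[0]]
print(label)                          # same answer, but the "table" is implicit
```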
So, we have been working on... I don't know how much time I have, do I have time? I don't see my clock here.
Presenter: We've still got about 20 minutes left.
Dr. Tizhoosh: So, okay, we are working on what we call the Mayo Atlas. We built an atlas, which for us is a structured, indexed collection of well-curated patient data that represents the spectrum of disease diversity. For the index, we need a patient data representation that is semantically, biologically, anatomically, clinically and genetically correct, reflecting correct pattern similarities. Atlas is an overloaded term; it has been in use for almost centuries, so you have to clarify what you mean. It's a repository, an indexed repository.
So, if you have many patients, you add the first modality; for us, the primary modality is the whole-slide image. You index it, you add the index. Second modality, pathology reports: you add the index. Third modality, fourth modality: you add gene expressions, X-ray images, and so on. And then you add the output, which is an amazing aspect of Information Retrieval: you can replace or update the output at any time. Deep networks cannot do that, because you train the output into them.
And another point is that any modality may be missing. Depending on your indexing, you can still infer knowledge, even if the X-ray is not there or the gene expression is not there. You should be able to go in with incomplete information and infer new knowledge for the new patient.
Atlas Requirements
So, what characteristics should an atlas, which is an indexed repository for intelligent Information Retrieval, show?
Inclusion - It should be inclusive: an atlas should contain all manifestations of the disease. So, if you're talking about lung cancer, you should have all representative cases, which is not easy. That's why we said, look, we need an automated way of selecting whole-slide images; manual, visual inspection cannot do it.
Veracity - The atlas must be free from variability. You cannot just complain that AI makes mistakes; physicians make mistakes too, and then we put a case aside and nobody knows that the case we did two years ago was a mistake, a variability mistake. So you have to double-check the things you put in the atlas.
Semantic Equivalence - The indexing must conform to the anatomic and biologic nature of the disease, which is the quality of embedding. Here deep models and Foundation models can be very helpful. So, if you come up with multimodal indexing, which is: you go through some sort of network, you get some embeddings, you do what we call associative learning, and then you put them together into one index, then you can start building an atlas of disease.
Whole-slide images come in, molecular data comes in, clinical data comes in, and they go through certain networks. You do your associative learning, which asks: which parts of those embeddings have a common point? When I see this tissue, then I see this gene expression, then I see this point in the X-ray image. And then you can do search and matching, you can provide the top matches, provide a computational second opinion, and then the pathologist can make the decision and write the final report. It is assistive in nature. And, definitely, this should be combined with the power of large language models.
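A minimal sketch of that kind of associative multimodal index, assuming per-modality encoders already exist: encode each modality, normalize, and concatenate into one patient index vector that can be matched against the atlas. All encoders, sizes and patients here are illustrative stand-ins.

```python
# Sketch: building one multimodal patient index from per-modality embeddings.
import numpy as np

rng = np.random.default_rng(0)

def encode_wsi(wsi):        return rng.normal(size=256)   # stand-in WSI encoder
def encode_report(text):    return rng.normal(size=128)   # stand-in text encoder
def encode_expression(rna): return rng.normal(size=64)    # stand-in RNA encoder

def l2(v):
    return v / (np.linalg.norm(v) + 1e-12)

def patient_index(wsi, report, rna):
    """Concatenate normalized per-modality embeddings into one index vector."""
    return np.concatenate([l2(encode_wsi(wsi)),
                           l2(encode_report(report)),
                           l2(encode_expression(rna))])

atlas = {f"patient_{i}": patient_index(None, None, None) for i in range(100)}
query = patient_index(None, None, None)

# Simple matching: cosine similarity of the query index against the atlas.
scores = {pid: float(np.dot(query, vec) /
                     (np.linalg.norm(query) * np.linalg.norm(vec)))
          for pid, vec in atlas.items()}
top3 = sorted(scores, key=scores.get, reverse=True)[:3]
print("top matches:", top3)
```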
For the sake of public display and public demos, we chose a TCGA example, because we don't have the clearance yet to show internal data: tissue, whole-slide image, and RNA sequencing from TCGA. Look at just the tissue image: that's the accuracy you get with simple matching. You bring in RNA sequencing and it gets a lot better, of course; this is a relatively easy case for RNA sequencing. And then you combine tissue and RNA sequencing, and, naturally, you expect that the accuracy increases. Of course it does.
That may be different for different primary sites, of course. And lung: when I showed the demo to colleagues, I got criticized, you have chosen an easy one. I know, because we don't want to be too tough at the beginning of developing a new system.
So, when the query patient comes, you find the first match, the second match, and the third match, and you retrieve the information. You aggregate the metadata through a large language model; that's why I called it Generation-Augmented Retrieval. You find three reports and you combine them with an LLM into one report that describes the case. That's autocaptioning in a very different way, retrieval-based autocaptioning, basically. Then you can do region matching between the query and the matches, so we can visualize the evidence: why you are saying this patient is this and that.
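A minimal sketch of that Generation-Augmented Retrieval step: take the reports of the retrieved matches that agree with the majority vote and build a single summarization prompt for an LLM. The summarize call and the reports are placeholders, not a real API or real cases.

```python
# Sketch: Generation-Augmented Retrieval - autocaption a query case by
# summarizing the reports of the retrieved (majority-agreeing) matches.
from collections import Counter

retrieved = [  # toy top-5 retrieval results: (diagnosis label, report text)
    ("LUSC", "Report A: keratinizing squamous cell carcinoma ..."),
    ("LUSC", "Report B: poorly differentiated squamous carcinoma ..."),
    ("LUAD", "Report C: acinar-predominant adenocarcinoma ..."),
    ("LUSC", "Report D: squamous carcinoma with necrosis ..."),
    ("LUAD", "Report E: lepidic growth pattern ..."),
]

def summarize(prompt):
    """Placeholder for an LLM summarization call."""
    return "[summary of]\n" + prompt

majority, _ = Counter(label for label, _ in retrieved).most_common(1)[0]
majority_reports = [text for label, text in retrieved if label == majority]

prompt = ("Combine and summarize the following diagnostic reports of similar "
          f"patients (majority diagnosis: {majority}):\n\n"
          + "\n".join(majority_reports))
print(summarize(prompt))
```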
You can visualize the atlas by just using t-SNE or UMAP, or any other technology, to show your patients and where your patient is positioned. You combine them all into a computational second-opinion report. The vision for us is to, basically, make that accessible. I don't know when that will happen; we have to build that Foundation model and connect the Atlas, the retrieval, to it. Is it two years from now? I don't know. If we get the money, I will push to get a prototype in a year, but I'm guessing it will be more than a year. Nobody has experience with crunching numbers on six million whole-slide images, so it will be tough.
So, these are the people who have contributed to that. I don't know how much time we have. Do we have time for a short demo, or are we already over?
Presenter: Yeah, sure, we've got about 15 minutes left, so we still have time.
Dr. Tizhoosh: So, here is a very short demo. Do you see that, Welcome to Mayo Atlas?
Presenter: Yeah, we can see that.
Dr. Tizhoosh: Okay, so I sign in, and I choose my simple, easy lung cancer atlas from TCGA. Let me see, I can find the demo folder, use my easy demo image, and then I have some description related to that. It doesn't need to be the report, because for a new patient you don't have the report, and I choose the gene expression. So there are maybe some questions: the whole-slide image, some question or description, the initial visual inspection, and then the gene expression.
It would be difficult to upload raw RNA sequencing, so you have to process it and get something out of it; that will be tough. And then you start uploading. The upload here is not completely realistic; it's buffering a lot of stuff for the sake of the demo, so it's not completely real time. And then the results are there; this is a very simple prototype. So that was the query data, a pathology slide. We have three matches, and you can select the third one and say, oh, let me look at that. You can go in and compare whatever you want, and really convince yourself that it is correct. And it could be top three, top five, top seven, whatever.
After you have convinced yourself, which you have to do, you see that here we have reports, also notes, attached. That's the evidence. So, if I show the full notes: that's the first match, second match, third match. These are the results. And then, if I show the summary: this part is what I called Generation-Augmented Retrieval. You get the reports of the top three matches and you combine them into one report to, basically, autocaption the image that you found through retrieval. There is a lot you can do with that information, so we say, okay, generate a report for me.
For example, I can look at the t-SNE, and if I look at the whole-slide image, the WSI, you see it's really clear-cut. Again, this is an easy case. These are the patients for lung squamous cell carcinoma and adenocarcinoma, and here is my patient. So it gives a really nice overview: this is the population in the atlas, this is my patient. And I can select that and save it. Going back, there are a lot of things that we can change, and I can say, generate a report. This is my patient, number 123, and this is the summary: the majority vote says this is lung squamous cell carcinoma.
We have a gene expression heat map that looks at the top 20 high-variance genes, and we also provide a sample query: this is the query, and this is the first match you can look at. Everything can be put in the report. And then I say, okay, save and exit, and, if it doesn't crash, it generates a report for you. It says, okay, we looked at the top three, we looked at the top five, this is lung squamous cell carcinoma, this is what the generation over the retrieved cases gives us, these are the visualizations with respect to individual modalities, and these are some sample cases. That is, in a nutshell, what the atlas can do, combined with Foundation models, which at the moment is a very weak connection for us. We have to do a lot more work to make that bridge stronger.
So, thank you so much; I hope this was helpful.
Presenter: Well, thank you very much, Professor, that was a really interesting talk. With that, should we open the floor to questions? I noticed a couple of questions in the chat, so I think I'll start with those.
Question 1: So it says, for multimodal retrieval, what do you do in the case of mixing modalities?
Dr. Tizhoosh: Oh, yeah. So, you get individual embeddings and you don't aggregate them. Aggregating them is good for storage and speed, but if you don't aggregate them, you just encode or compress them and concatenate them. That gives you the freedom that, if one modality is missing, you know which one is missing and you just compare the others. The downside is that your patient index will be much longer than it should be. You have an embedding for the image, an embedding for RNA sequencing, an embedding for the reports, and so on. At the moment, we see no other way: you have to keep the embeddings of individual modalities separate. You may encode them, compress them, and that gives you the freedom to, basically, search separately.
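A minimal sketch of that idea: keep per-modality embeddings separate in the index and, at query time, compare only the modalities that both the query and the candidate actually have. The modality names, sizes and random embeddings are illustrative.

```python
# Sketch: multimodal matching when some modalities are missing -
# keep embeddings separate and compare only what both sides have.
import numpy as np

rng = np.random.default_rng(0)
MODALITIES = {"wsi": 256, "report": 128, "rna": 64}

def random_case(present):
    """Toy case: one embedding per present modality."""
    return {m: rng.normal(size=d) for m, d in MODALITIES.items() if m in present}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def similarity(query, candidate):
    """Average cosine similarity over the modalities shared by both cases."""
    shared = set(query) & set(candidate)
    if not shared:
        return float("-inf")
    return sum(cosine(query[m], candidate[m]) for m in shared) / len(shared)

atlas = {f"patient_{i}": random_case({"wsi", "report", "rna"}) for i in range(50)}
query = random_case({"wsi", "rna"})          # the report is missing for this patient

best = max(atlas, key=lambda pid: similarity(query, atlas[pid]))
print("best match:", best)
```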
Question 2: Thank you. And a second question here was, how do you suggest dealing with rare cases in a retrieval-based paradigm?
Dr. Tizhoosh: With rare cases it's relatively easy, and hopefully we can soon publish some results. You may have as little as one rare case among others: many common cases and several rare cases. Again, the big condition is your embedding. If your indexing has done its job and your embedding is high quality and expressive, the rare embedding could be, should be, part of the top-n retrieval.
In our experience, if we have the suspicion that it is a rare, complex case, you cannot rely on a majority vote among the top-n retrievals; it's too risky. You may come up with what we call a histogram of possibilities. And this is one of those things you have to provide when you deliver the top-n search results: among the top n, although the majority says this, let's say the majority says it's [unintelligible] carcinoma, you may have one case, which is not the top result, among the top five, it could be the fifth one, let's say, which is adenosis or papillary carcinoma, one of the rare entities for breast. You have to display that, and you cannot rely on the majority. It's one of the challenges that retrieval has to address, but you get the information. That's the minimum, provided that your embeddings are good, which is a big condition, of course. And that's where everybody is waiting for good foundation models to give us good embeddings.
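A minimal sketch of that histogram of possibilities: instead of collapsing the top-n retrieval into a single majority label, report the full label distribution and explicitly flag any rare diagnosis that appears among the matches. The labels and the rare-entity list are illustrative.

```python
# Sketch: report a histogram of possibilities over the top-n retrievals
# instead of a bare majority vote, and flag rare diagnoses explicitly.
from collections import Counter

RARE = {"adenosis", "papillary carcinoma"}      # illustrative rare entities

top_n = ["IDC", "IDC", "IDC", "DCIS", "papillary carcinoma"]   # toy top-5 labels

hist = Counter(top_n)
majority, count = hist.most_common(1)[0]

print("histogram of possibilities:", dict(hist))
print(f"majority vote: {majority} ({count}/{len(top_n)})")

flagged = [d for d in hist if d in RARE]
if flagged:
    print("note: rare entity present among matches:", ", ".join(flagged))
```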
Presenter: Thank you.
[unintelligible]
Question 3: You showed an example where you were building the autocaption from retrieval combined with generation. How would you rely on that kind of generation from the top-k cases if it were a rare case? That was really the essence of my question.
Dr. Tizhoosh: It seems that at Mayo, to my knowledge, many of our physicians are using LLMs to summarize diagnostic reports, or reports about patients, when patients come with 30, 40 pages of information. This is a regular, challenging part of the clinical workflow: somebody has to read all of it and say, what has been done for the patient, what type of treatment, what was the history?
And one of the things that is very low risk in using LLMs is summarization, because they are not generating anything, they are just summarizing, at least we think so. I have not seen large-scale validation, but the general perception is that summarization is more reliable. So, if you give it three reports that, on the condition that retrieval has done its job, are describing the same thing... What we did, the only thing we did, was look at the top five. Say the majority was three and it was squamous; we do not provide the other two reports, because the majority vote says it's squamous. So we provide the three reports for squamous cell carcinoma and say, combine and summarize. And in our limited experience, with no large-scale validation, it looks good.
Question 4: Again, if it's a rare case then all top three might not be relevant; that's the whole point, right? By definition, all of the top three cases might be totally irrelevant?
Dr. Tizhoosh: Could be. Sorry, all top three could be what?
Question 4 continued: They might be irrelevant?
Dr. Tizhoosh: Could be. Yes, then the search has failed.
[unintelligible]
Dr. Tizhoosh: If you have a really small dataset, a really small atlas, that can happen. For large atlases, what we do is combine the top three and top five; by the way, we cannot show the top 50, because who could process that? If we show the top three, we can get some additional supporting statistical evidence by looking at the top 50 or top 100, just as a confidence measure. We can do that for medium-size and large repositories.
To say: the top three say it is this, and when I look at the top 100, the top 100 says this as well. So, again, with the exception of rare, complex cases, this may work. But this, and many other aspects of information retrieval, is something we as a community have not paid enough attention to, so we still have a lot of work to do.
Audience Member: Great, thank you very much for a fascinating talk.
Presenter: A question from an audience member.
Question 5: Yes, thank you very much for the really interesting talk, I really enjoyed it. Sorry, I just want to turn on my camera so you can see who's watching. Thank you very much again. I've been following your work, and the question I have is this: I noticed that one of the first parts of the retrieval algorithm, the sampling of patches, is really important and largely affects the efficiency of the algorithm, as you mentioned. If I remember correctly, the Yottixel algorithm that you're currently using is based on the similarity of patches.
Basically, it selects the patches that have a higher number of cells in them, which is good, especially for most cancer types. But I was wondering whether [unintelligible] is also a good case, whether it's also a good algorithm or a good method when what you're looking for doesn't have many cells in it. Say you want to retrieve a [unintelligible] sample, right? Which doesn't have many cells in it. And, basically, Yottixel doesn't sample much of the [unintelligible] patches, just as it doesn't sample fat, right? So I was wondering if that is problematic, and, if it is, whether there would be any workaround for this problem?
Dr. Tizhoosh: Well, Yottixel doesn't do that; KimiaNet does it. KimiaNet, to be trained on both diagnostic slides and frozen sections of TCGA, made the assumption that we just grab and process the patches with high cellularity, making the latent assumption that this is all about carcinoma and maybe a little bit of inflammation, but nothing else. And that is a perfectly legitimate assumption for TCGA; it's all carcinomas. Yottixel does not make that assumption.
Although, as I said, Yottixel, compared to the others, is really good, it is also not good enough, because, among other things, it used DenseNet, or we used KimiaNet for it, and the embeddings are not good enough. It also has some other parameter settings that others have taken over, which makes it less practical. So, if you make any assumption of that sort in search, you will be limited; you cannot do that, you cannot make a high-cellularity assumption. And that's why, probably, if you apply KimiaNet on lymphoma, let's say, it may fail to really give you good embeddings.
And the same goes for any other method that has made that assumption for search and retrieval: the embeddings have to come from somewhere that does not make any anatomic or histologic assumption about the image. So, what we know is this: it seems we will not solve the accuracy problem of image retrieval this way. And, again, it was very eye-opening to me as well: search methods that claimed to have been designed for rare cases provided 17% accuracy for breast rare cases. Seventeen percent.
Which means, okay, this is not a solution at all. So it seems we have maybe only two ways: either you have a super, fantastic foundation model, which I don't see yet. Well, the colleagues from Mount Sinai, Dr. Fuchs' group, did it with 1 million slides, but it is not publicly available, so I would love to test that. Or you take a regular, small CNN or a small transformer and fine-tune it for every atlas, even if it has 200 patients, which is challenging, even if you use self-supervision. But computer science tells us: the no-free-lunch theorem.
You cannot just be good at everything unless you customize. So maybe we have to fine-tune for every atlas, for every organ, for every primary site, which is a challenge if the dataset is small. And everybody assumes we have a gigantic dataset, but we don't; nobody does. To my knowledge no hospital has that in a relatively accessible form. So let's work on and focus on small datasets. But most likely you will run into the problem that journals will not publish your papers, because they want to see millions and millions. But, okay, if I want to do millions, I have to go online and use online data, and hospitals don't have it yet.
Question 6: Another question. I guess you're right, though I'm also wondering about [unintelligible] sampling, because it seems that sampling is really playing an important role here. But, in the same vein, I was wondering whether you might have come across any idea like this. As you were presenting your work, I was wondering if we can use image retrieval as an approach to label a dataset, a semi- or self-supervised labeled dataset, or I don't know what to call it.
Say you just provide one patch of DCIS, or any specific type of tissue, to this image retrieval and ask it to retrieve similar patterns from a million whole-slide images. That would give you a really, really good dataset to train another model in a supervised manner. So I was wondering if anyone has looked into this kind of application of image retrieval, or whether you think it's a good idea to do that?
Dr. Tizhoosh: No, not to my knowledge, not in this way. The problem is that in image retrieval, like any other field, you have to stick around. You cannot come in, publish one paper, and have the illusion that you have solved the problem and go away. You have to stick around, invest, fail, develop solutions, test them, and realize they don't work. There are many other problems.
I'll give you another problem that has not been solved. Every patient comes with multiple whole-slide images. We are, at the moment, latently assuming that they have only one whole-slide image. No, they don't. As long as you cannot process that, a patient comes with seven, eight, ten, twelve, fifteen whole-slide images, and then patient representation and patching becomes even more complicated. Now you have seven or eight whole-slide images from one patient, and you have to select patches in a way that you do not miss anything and do not overload your selection with normal tissue, which can misguide the search and retrieval. Very difficult to do.
We have just started looking at that. Information retrieval in histopathology, in digital pathology, is very complicated. I don't think we have even really started; we have been working on the surface.
Presenter: Great. We've come to 3:00, but are we okay to just ask more questions for five minutes?
Question 7: Can I ask a quick question now? [unintelligible] Thank you for the nice talk. I really enjoyed your paper asking what a foundation model is, as well as some of the other critiques you've been publishing, so I appreciate that role of yours in the community.
Dr. Tizhoosh: I'm not making many friends with that.
Question 7 continued: Well, in science it's not my opinion or yours that matters; we just need to validate whatever we do with results, and that's what you are after, so I really appreciate that, thank you. Another thing I wanted to ask you about, or get your thoughts on: one requirement for foundation models is transferability. In one of the slides you showed, you mentioned that zero-shot learning, for example, is a way of measuring the transferability of a foundation model. I was just wondering if you had measured the transferability of this search-based foundation model approach that you are proposing for classification?
Dr. Tizhoosh: Very good point; no, we haven't. And I probably won't touch that until we really have access to the data. We have been preparing, since last August, to access our data on the cloud, which is at the moment probably a bit short of 7 million whole-slide images. It's not 7 million patients, but 7 million whole-slide images; by the end of the year we'll be approaching 9 million. I want to do tests with that. I want to see, when we go in, with rare cases and common cases, the questions that Dr. [unintelligible] would ask: what happens if the top search results are irrelevant? That means something is fundamentally going wrong. And you have to establish the base: the patching is okay, the embeddings are okay, the speed is okay, the storage is okay.
Then we go to the bigger questions. Okay, so can I transfer knowledge? Can I do so? Because if none of that is working... I'm not saying we will wait to solve all those problems perfectly, no. But we need a reasonable base in order to go after those sophisticated new bridges, so that we can say, okay, can I go from here to there? Can I do zero-shot classification or not? We have not, and the reason I don't do it, even with a small or medium-sized data set, is probably just psychological, because we are excited to get prepared to operate on the cloud data.
Question 8: Okay, makes sense. The other comment I had: one distinguishing feature of foundation models that I see emerging in the field, not only in computational pathology but in general, is that, let's say we've got ChatGPT; we never taught it to reason, for example. And it may not be able to reason perfectly, but it does show more promise than just being a next-word predictor. So it has learned this additional human-like capability of showing, somewhat, indications of being able to reason.
And I was wondering, when we talk about foundation models in a specific domain, for example computational pathology, whether it makes sense to build domain-specific foundation models, or would it be more useful to build on top of these more general-purpose foundation models, like ChatGPT, and then adapt them for a certain domain? Or, for example, if you just do retrieval-augmented generation using ChatGPT embeddings in some way, what are your thoughts on that? Should we invent a new foundation model for computational pathology for that domain, or should we just wait for these multimodal models that are being proposed by industry, more so than academia at least?
Dr. Tizhoosh: I was reading a very interesting theoretical work from a young fellow scientist who recently joined MIT. He put forward the theory that, beyond a certain scale of data, large models start developing smaller, linear models inside, and that, among other things, is the theory behind those new capabilities. At the moment, I'm thinking that, since there are a lot of things we don't know about deep models, large models, it's very risky, and regulatory bodies like the FDA will be suspicious of that.
If you take, let's say, ChatGPT and just fine-tune it for histopathology... I won't do that. That's why I have not been trying to experiment with anything else. I want to wait, get my hands on 6 million, and then I have the reports, I have X-ray, I have RNA, I have everything, I have multimodal, so six million should be enough. And histopathology is a small enough domain compared to ChatGPT, which, covering literature and politics and science and everything, is many domains, so, general purpose. When you come up with a foundation model for histopathology, it's not a general domain, so I don't understand the term "general-purpose histopathology". What do you mean, general-purpose histopathology? It's histopathology; are you giving me a specific network?
I don't know what is happening inside ChatGPT, and others, such that I would take it and fine-tune it with my gold-standard clinical data. I would rather train something from scratch, and it doesn't need to be that big, because our field is really special. Again, we have to do that; in computer science there is no free lunch, you have to specialize, you have to customize, otherwise you won't be accurate enough. But since we don't know what is happening inside, except for specific tasks, again like the summarization of text, okay, or just getting some embeddings for images, okay. But as a conversational partner, and people have started to use the term conversational information retrieval as well, that comes into the clinical workflow, at the moment, I won't do that.
I don't trust enough what is happening inside for us to use it, when the conversation we have with that model will be attached to the diagnostic report, to the treatment planning. That's how I'm looking at it. That may be too conservative a view, but I want to build something that is reliable, that we have tested for two or three years, and then we can really start using it. And it's worth it to wait, get your hands on really high-quality clinical data, and do it from scratch. That doesn't prevent us from using a fine-tuned one for less sensitive tasks.
Question 8 continued: Okay, well, thank you so much, interesting perspective. Nice meeting you.
Dr. Tizhoosh: Thank you, nice meeting you.
Presenter: Brilliant. Well, have we got any more questions from anyone? Sure, do we have time for one final question? Professor Hamid, is that okay?
Dr. Tizhoosh: Yeah, I'm available. I'm assuming we are running out of time, and everything's okay, we got some questions. Thank you.
Presenter: So, okay, so what was your question?
Question 9: Very interesting talk. Just one question. You mentioned expressive embeddings in one of your slides; I just want to know, how would you define an expressive embedding? And the second thing is, when you are generating the report based on your retrieval: one thing I noticed is that you also mentioned inter- and intra-observer variability. Normally the reports are generated by the pathologist, so how will you handle that when you are generating the final report from the LLM?
Dr. Tizhoosh: When creating an atlas, the second condition was veracity, that the atlas is free from variability. The only thing that remains is, basically, the diversity in the language; the pathologist may say the same things with different words, and large language models should be able to deal with that. But we are assuming, and we have to make sure, that the veracity is there, which means every case we have in the atlas has to be double verified. Even though it's historic data, done two years ago, five years ago, we have to check it again when we put it in the atlas. And that's another reason an atlas cannot be millions of cases, because we cannot do that for millions of cases.
I forgot, what was the first question?
Question 9 continued: Yeah, the first one was about expressive embeddings. How would you define an expressive embedding?
Dr. Tizhoosh: Our simple approach is this: if the embeddings are good, if the deep features are good, we do not push the other parts of information retrieval. We just go for a very simple comparison; we just use Euclidean distance. We don't use any sophisticated hashing, compression, variance analysis, nothing, no principal components, nothing. Just take the embeddings, compare them, and if they are good enough, that should give us reliable accuracy. And most of the time, almost all search engines failed. All of them failed, including Yottixel. They can't. We don't have good embeddings yet.
And that's why you try to say, okay, grab a small model and fine-tune it for your small atlas. That's the approach, until we get some good foundation models.
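A minimal sketch of the embedding test described here: compare embeddings with plain Euclidean distance, with no hashing or principal components, and check whether a nearest-neighbour majority vote already gives reliable accuracy. The leave-one-out protocol and array names are assumptions for illustration:

```python
# Minimal sketch of an "expressive embedding" check: plain Euclidean distance
# between embeddings, k nearest neighbours, majority vote, leave-one-out.
import numpy as np

def leave_one_out_accuracy(embeddings, labels, k=5):
    """embeddings: (N, D) float array; labels: (N,) array of diagnoses."""
    n = len(embeddings)
    correct = 0
    for i in range(n):
        d = np.linalg.norm(embeddings - embeddings[i], axis=1)  # Euclidean distances
        d[i] = np.inf                                            # exclude the query itself
        top_k = labels[np.argsort(d)[:k]]                        # k nearest neighbours
        values, counts = np.unique(top_k, return_counts=True)
        if values[np.argmax(counts)] == labels[i]:               # majority vote
            correct += 1
    return correct / n
```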
Question 9 continued: Okay, thank you.
Dr. Tizhoosh: Thank you.
Presenter: Fantastic. Well, if we don't have any more questions, then I think we'll draw the seminar to a close. Thank you very much, that's been a really interesting talk, and thank you for your time and for spending extra; I know we've gone over by about 10-15 minutes.
Dr. Tizhoosh: Thank you very much. Thank you, I appreciate the opportunity.
Presenter: And I just want to thank everyone online for joining the meeting, and everyone in person. Just to remind everyone again, our next seminar is next Monday, and we're joined in person this time by a speaker from Radboud UMC in the Netherlands, so hopefully we'll see everyone then. But, again, thank you very much, Professor Hamid, and thanks to everyone online.