The National Cancer Institute released a groundbreaking new study demonstrating the ability of an AI algorithm to detect cervical cancer or precancer in seconds, from a single image of the cervix. The automated visual evaluation algorithm holds the promise to permanently alter cervical cancer screening protocols. In this article we will explore AVE and its implications for women and healthcare providers around the world.
What is Automated Visual Evaluation (AVE)?
Automated visual evaluation (AVE) is a machine-learning-based algorithm that assesses digital images of the cervix for signs of cancer or precancer. The algorithm evaluates a single image of the cervix and returns one of two results: AVE positive, indicating lesions that are either indicative of cancer or that increase the likelihood of cancer developing in the near future, or AVE negative, indicating a cervix that is not at increased risk for cancer.
In the National Cancer Institute (NCI) study, automated visual evaluation detected signs of cancer or precancer with over 90% accuracy, significantly higher than the 71% accuracy measured in the same population for Pap cytology, the current standard of care around the world. AVE even outperformed interpretation of the same images by expert clinicians.
In the NCI study, AVE was found to have a sensitivity of 97.7% and a specificity of 85% in women of reproductive age.
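For readers less familiar with these two measures, the short Python sketch below shows how sensitivity and specificity are computed from screening outcomes. The counts are hypothetical, chosen only to make the arithmetic easy to follow; they are not the NCI study's data.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of women with disease whom the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of women without disease whom the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts for illustration only (not taken from the NCI study).
tp, fn = 90, 10     # women with precancer or cancer: flagged vs. missed
tn, fp = 850, 150   # women without disease: cleared vs. falsely flagged

print(f"sensitivity: {sensitivity(tp, fn):.1%}")
print(f"specificity: {specificity(tn, fp):.1%}")
```

A high sensitivity means few cases of disease are missed, while specificity determines how many healthy women are sent on for follow-up they do not need.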
Why is AVE important?
What the NCI study demonstrated is that automated visual evaluation has the potential to offer a much more accurate and cost-effective method of cervical cancer screening.
Today, current cervical cancer screening methods face multiple issues, and those issues lead to a greater than necessary burden of morbidity and mortality for women. The introduction of Pap cytology has reduced the incidence of cervical cancer by 70% in high-resource settings, but the test is frequently inaccurate, with up to 35% of cases producing a false negative result. To improve accuracy, HPV testing has been introduced both as a cotest with Pap and, in some areas, as a primary screen in place of cytology. Although HPV testing has greater sensitivity than Pap, it tends to overdiagnose, since not all HPV-positive patients develop precancer or cancer.
Both Pap and HPV testing rely on physical specimens collected from the patient and sent for laboratory analysis. Results can take several days and sometimes weeks to return, which contributes to the high rates of loss to follow-up as patients fail to return for secondary screening or treatment. For example, even in the highly integrated Kaiser Permanente of Northern California health system in the United States, up to 25% of women who are found to have high-grade cytology results do not return for a follow-up exam that could save their lives.
The automated visual evaluation algorithm stands to address these challenges. By providing an immediate result at the point of care, it can reduce loss to follow-up, since secondary screening could be offered immediately. The algorithm offers far greater accuracy than Pap and, because it assesses the presence of precancerous or cancerous lesions (CIN2+), it reduces the overdiagnosis that can occur with HPV testing.
Once it is made available to healthcare providers, automated visual evaluation has the potential to become the new gold standard in cervical cancer screening.
Who created the AVE algorithm?
The AVE algorithm was created in multiple stages. A number of international partner organizations, including MobileODT, worked on the initial phase of the algorithm. That research was then passed to the National Cancer Institute (NCI), which developed it further using its own cervical images. The National Library of Medicine validated the algorithm using another set of cervical images.
How was the AVE algorithm developed?
As reported in the Journal of the National Cancer Institute, the automated visual evaluation algorithm was created in multiple stages. A number of international partners, including MobileODT, worked on the initial phase of the algorithm.
That research was then passed to the National Cancer Institute (NCI), which developed it further using its own cervical images. The National Library of Medicine validated the algorithm using another set of cervical images. This formed the basis of the research reported in the NCI journal as a proof of concept.
MobileODT's research team has since rebuilt and retrained the algorithm using images taken with the EVA System. MobileODT and its research partners are now validating the MobileODT AVE algorithm in trials around the world, using biopsy results as ground truth.
How does the AVE algorithm work?
All versions of the automated visual evaluation algorithm follow the same basic design, first developed as a proof of concept using the Faster R-CNN approach to deep-learning object detection. The algorithm performs three functions before a result is recorded: it detects the cervix within the image and, at the same time, extracts features from the image; it then classifies the image, predicting whether those features are a positive match for cancer or precancerous lesions.
To achieve this, the algorithm draws on three sets of data: images of the cervix, input on the location of the cervix within the images, and ‘class ground truth’ indicating whether or not the images show high-grade abnormalities (provided by biopsy data).
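One way to picture the three kinds of data described above is as a single record per image. The Python sketch below is purely illustrative: the class names, field names, file path, and coordinates are all hypothetical and do not reflect the actual NCI or MobileODT data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CervixBox:
    """Pixel coordinates of the rectangle that contains the cervix."""
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def area(self) -> int:
        """Area of the rectangle in pixels."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


@dataclass(frozen=True)
class TrainingExample:
    image_path: str        # the cervical image itself
    cervix_box: CervixBox  # where the cervix sits within the image
    high_grade: bool       # ground-truth label derived from biopsy


# Hypothetical record; the path, coordinates, and label are made up.
example = TrainingExample(
    image_path="images/visit_0001.jpg",
    cervix_box=CervixBox(x_min=120, y_min=80, x_max=520, y_max=460),
    high_grade=True,
)
print(example.cervix_box.area(), example.high_grade)
```

In an object-detection approach such as Faster R-CNN, the bounding box teaches the model where to look, and the label teaches it what a high-grade abnormality looks like within that region.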
The NCI automated visual evaluation algorithm was trained using data from an NCI study (see below for full details). The MobileODT AVE algorithm is trained and validated using images captured by EVA System users and research partners and has been built to date on the Keras framework, using TensorFlow to make it more widely available.
How was AVE validated in the NCI study?
The creation of augmented intelligence (AI) or deep-learning algorithms requires a reliable data source to serve as the basis for analysis. The potential value and reliability of a computer-based medical system are only as great as the quality of its source data. The National Cancer Institute (NCI), part of the United States National Institutes of Health (NIH), was uniquely positioned to provide the data set needed to create automated visual evaluation. An NCI longitudinal study, which took place in Guanacaste, Costa Rica from 1993-2000, was chosen to provide the necessary training data.
The study collected pairs of cervical images at approximately 30,000 screening visits in that seven-year time span. A total of 9,406 women, aged 18 to 94, took part in the study. All participants received screening in the first year to create a baseline for the study. Women were then asked to return for periodic additional screening depending on their screening results and their likelihood of developing cervical cancer. All screening visits included multiple types of cytology (including traditional chemical Pap smear testing) and HPV testing. Each image of the cervix was sent to an expert in the USA for evaluation (in a process called cervicography), and any diagnosis and treatment were recorded.
For the purposes of developing the NCI automated visual evaluation algorithm, the data was then linked with data from the tumor registry for 18 years. This enabled researchers to see clearly whether the diagnosis given during the study was indeed accurate and how many women had developed cervical cancer despite having been screened. With the images generated in this study, data scientists received images of healthy and pre/cancerous cervixes, together with confirmation of which patients later developed cervical cancer.
To allow for external validation of the algorithm, not all of the study data was used in its initial development. The initial development used a random selection of 70% of the images, with 30% set aside for the later validation test. The National Library of Medicine (NLM), an NIH-affiliated institution, conducted the validation testing.
Although two images were taken at each patient visit, the algorithm produced such consistent results across each pair of images that the subsequent analysis required only one image.
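A random 70/30 split like the one described above can be sketched in a few lines of Python. This is a minimal illustration using the standard library, not the procedure the NCI team actually ran; the function name, the seed, and the integer image IDs are assumptions made for the example.

```python
import random


def split_images(image_ids, train_fraction=0.7, seed=0):
    """Randomly divide image IDs into a development set and a held-out set."""
    shuffled = list(image_ids)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical image IDs 0..9 stand in for the real study images.
development, held_out = split_images(range(10))
print(len(development), "images for development,", len(held_out), "held out")
```

Keeping the held-out images away from development is what allows the later validation to measure how the algorithm performs on images it has never seen.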
How accurate is cervical cancer screening with AVE?
The NCI proof-of-principle AVE algorithm is currently outperforming both Pap cytology and examination by expert colposcopists.
The dataset provided by the NCI included 241 confirmed cases of precancer and 38 cases of cancer. The algorithm was able to accurately predict cancer and precancer (dysplasia) in over 90% of cases. There were 29 cases in which the AVE algorithm produced results different from those of an expert human colposcopist. In 28 instances, biopsy results confirmed that the machine prediction was correct over human assessment. The research conducted by the NCI indicates that AVE is more effective at cancer detection than human experts or the traditional chemical Pap test.
How can this new AVE technology be used in a clinical setting?
Automated visual evaluation has the potential to change the face of cervical cancer screening. In settings where either Pap or HPV testing is in use, AVE has the potential to replace the frequently inaccurate Pap test, either as a cotest with HPV or as first-line screening. Patients would benefit from immediate results at the point of care, without the 2-6 week wait that is common for Pap test results. The immediate results provided by AVE could support rapid risk stratification, so that patients are referred immediately to secondary screening, to treatment, or to routine screening.
When used as a cotest with HPV, automated visual evaluation will help with resource allocation, reducing the over-referral found in healthcare systems that currently send all HPV 16 or 18 positive patients to secondary screening (colposcopy). AVE testing would ensure that only those patients who truly need further testing, those presenting with CIN2 or above, are referred.
In low-resource settings where VIA (visual inspection with acetic acid) is used as primary screening, AVE could provide a more accurate test without adding any visits or lab costs. Used on the portable EVA System, MobileODT’s AVE algorithm promises to bring accurate cervical cancer screening to women all around the world. The user-friendly interface on the EVA device will allow even non-expert clinicians to provide the AVE test, dramatically increasing access to care.
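The effect on referrals can be illustrated with a small Python sketch that compares the two referral policies described above. The patient records, field names, and function names are hypothetical, and the rule is a simplified reading of this article rather than clinical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningResult:
    patient_id: str
    hpv_16_18_positive: bool
    ave_positive: bool


def refer_all_hpv_positive(results):
    """Policy 1: every HPV 16 or 18 positive patient goes to colposcopy."""
    return [r for r in results if r.hpv_16_18_positive]


def refer_with_ave_cotest(results):
    """Policy 2: refer only HPV-positive patients who are also AVE positive."""
    return [r for r in results if r.hpv_16_18_positive and r.ave_positive]


# Made-up results for four patients, for illustration only.
results = [
    ScreeningResult("A", hpv_16_18_positive=True, ave_positive=True),
    ScreeningResult("B", hpv_16_18_positive=True, ave_positive=False),
    ScreeningResult("C", hpv_16_18_positive=True, ave_positive=False),
    ScreeningResult("D", hpv_16_18_positive=False, ave_positive=False),
]

print("HPV alone:", [r.patient_id for r in refer_all_hpv_positive(results)])
print("HPV + AVE:", [r.patient_id for r in refer_with_ave_cotest(results)])
```

In this toy example the cotest policy sends fewer patients to colposcopy, which is the resource-allocation benefit the paragraph above describes.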
The EVA System makes MobileODT uniquely positioned to bring AVE to commercial application.
The EVA System incorporates a digital colposcope, mobile app, and online portal. The synthesis of hardware and software offers a unique ability to integrate the AVE algorithm into the existing system.
Currently, healthcare providers use EVA in 50 healthcare systems throughout the USA and in 29 countries worldwide. When cleared by regulatory bodies, the AVE algorithm will be available to those EVA users as a simple software update.
The MobileODT Enhanced Visual Assessment (EVA) System captures image data and clinically relevant annotations added by clinicians as part of their standard-of-care examinations. In addition to this image and annotation data from each exam, follow-up records and secondary test information from laboratory tests are often added to the exam data.
Any machine learning algorithm needs to be trained on a dataset similar to the data that it will be evaluating. For AVE to work in a live clinical setting, the algorithm must therefore be trained and validated on data similar to what it will evaluate in real time. MobileODT has done this with data from the EVA System, and an alpha version of AVE is already available to select partners.
In collecting data through real-world use globally, MobileODT is able to create augmented intelligence (AI) tools based on evidence that is sufficiently randomized by provider as to address usability and scalability concerns. The resulting algorithmic models and classifiers developed are significantly more robust than those generated through isolated clinical trials with specially selected clinicians trained to utilize a device, and patient populations limited to a small number of sites. AVE is just the first of the assistive algorithms MobileODT will make available on the EVA System in the years to come.
What’s next for the AVE algorithm?
MobileODT is actively developing a clinical version of the automated visual evaluation algorithm that will run on the EVA System.
Early in January, MobileODT announced a large scale pilot validating a clinical application of the AVE algorithm in India. Other validation projects are taking place around the world with MobileODT partners. Many existing users have enthusiastically joined in contributing de-identified data to help bring the promise of AVE into practice.
MobileODT is currently in the process of obtaining FDA and CE approval for adding AVE as an additional service on the EVA System, and select users can already use an AVE-based Clinical Decision Support (CDS) system today on the EVA System web portal. MobileODT’s full AVE offering should become available to healthcare providers seeking to more immediately and accurately identify women at risk of cervical cancer towards the end of 2019.
Want to know more? Contact us at email@example.com with any questions about AVE and the EVA System.