A pinboard by
George Ng

Ph.D. in Biotechnology who has joined an eCommerce startup in Hong Kong


We review Machine Learning models that have shown promise in the early detection of breast cancer.

Breast Cancer And Machine Learning

Breast cancer is cancer that starts in breast tissue; 5-10% of cases are hereditary, due to mutations of the BRCA1 and BRCA2 genes. Artificial intelligence is when a computer exhibits intelligence by mimicking 'cognitive' functions that humans associate with learning and problem solving. Machine Learning, a sub-field of Artificial Intelligence, has shown promise in breast cancer detection. For example, feature extraction based on a new hybrid feature selection scheme, used as the evaluation criterion for classifying malignant breast tumors in ultrasound images, has shown a higher classification accuracy (96.6%) than a back-propagation artificial neural network.

Breast Cancer Cluster

You can't miss the ABC News Studio and its transmission tower as you drive from Brisbane city to the suburb of Toowong. The 'Armstrong review' found that the risk of breast cancer in women working at the ABC at Toowong from January 1, 1994, to June 30, 2006, was “six times higher than that in the general population of women in Queensland”. After 12 years, 18 breast cancer diagnoses, 1 death and 3 review reports, ABC management decided evacuation was the only option. In 2016 the Toowong ABC Studio, which had become synonymous with the world's first breast cancer cluster, closed for good.


Fully automated breast boundary and pectoral muscle segmentation in mammograms.

Abstract: Breast and pectoral muscle segmentation is an essential pre-processing step for the subsequent processes in computer-aided diagnosis (CAD) systems. Estimating the breast and pectoral boundaries is a difficult task, especially in mammograms, due to artifacts, homogeneity between the pectoral and breast regions, and low contrast along the skin-air boundary. In this paper, a breast boundary and pectoral muscle segmentation method in mammograms is proposed. For breast boundary estimation, we determine the initial breast boundary via thresholding and employ Active Contour Models without edges to search for the actual boundary. A post-processing technique is proposed to correct the overestimated boundary caused by artifacts. The pectoral muscle boundary is estimated using Canny edge detection and a pre-processing technique is proposed to remove noisy edges. Subsequently, we identify five edge features to find the edge that has the highest probability of being the initial pectoral contour and search for the actual boundary via contour growing. The segmentation results for the proposed method are compared with manual segmentations using 322, 208 and 100 mammograms from the Mammographic Image Analysis Society (MIAS), INBreast and Breast Cancer Digital Repository (BCDR) databases, respectively. Experimental results show that the breast boundary and pectoral muscle estimation methods achieved dice similarity coefficients of 98.8% and 97.8% (MIAS), 98.9% and 89.6% (INBreast) and 99.2% and 91.9% (BCDR), respectively.
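The abstract above reports its results as dice similarity coefficients between automatic and manual segmentations. As a rough illustration of what that metric measures, here is a minimal sketch in plain Python. It assumes the two segmentations are available as binary masks of equal shape, and it uses the standard definition, twice the overlap divided by the sum of the two mask sizes. The function name and the toy masks are made up for illustration and are not taken from the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks of equal shape.

    Masks are nested lists of 0/1 values (rows of pixels).
    """
    if len(mask_a) != len(mask_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    ):
        raise ValueError("masks must have the same shape")
    intersection = 0
    size_a = 0
    size_b = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0
            size_a += 1 if a else 0
            size_b += 1 if b else 0
    if size_a + size_b == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return 2.0 * intersection / (size_a + size_b)


# Toy 4x4 masks (made-up data): the automatic mask has one extra pixel.
manual = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
automatic = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
score = dice_coefficient(manual, automatic)
print(f"Dice = {score:.3f}")  # prints "Dice = 0.923" (12 / 13)
```

A score of 1.0 means the two masks coincide exactly, so the reported values of around 98-99% for the breast boundary indicate near-complete overlap with the manual outlines.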

Pub.: 14 Jun '17, Pinned: 15 Jun '17

Discriminating Solitary Cysts from Soft Tissue Lesions in Mammography using a Pretrained Deep Convolutional Neural Network.

Abstract: It is estimated that 7% of women in the western world will develop palpable breast cysts in their lifetime. Even though cysts have been correlated with risk of developing breast cancer, many of them are benign and do not require follow-up. We develop a method to discriminate benign solitary cysts from malignant masses in digital mammography. We think a system like this can have merit in the clinic as a decision aid or complementary to specialised modalities. We employ a deep Convolutional Neural Network (CNN) to classify cyst and mass patches. Deep CNNs have been shown to be powerful classifiers, but need a large amount of training data, which for medical problems is often difficult to come by. The key contribution of this paper is that we show good performance can be obtained on a small dataset by pretraining the network on a large dataset of a related task. We subsequently investigate the following: (1) When a mammographic exam is performed, two different views of the same breast are recorded. We investigate the merit of combining the output of the classifier from these two views. (2) We evaluate the importance of the resolution of the patches fed to the network. (3) A method dubbed tissue augmentation is subsequently employed, where we extract normal tissue from normal patches and superimpose this onto the actual samples, aiming for a classifier invariant to occluding tissue. (4) We combine the representation extracted using the deep CNN with our previously developed features. We show that using the proposed deep learning method, an Area Under the ROC Curve (AUC) value of 0.80 can be obtained on a set of benign solitary cysts and malignant mass findings recalled in screening. We find that it works significantly better than our previously developed approach by comparing the AUC of the ROC using bootstrapping. By combining views, the results can be further improved, though this difference was not found to be significant. We find no significant difference between using a resolution of 100 versus 200 micron. The proposed tissue augmentations give a small improvement in performance, but this improvement was also not found to be significant. The final system obtained an AUC of 0.80 with 95% confidence interval [0.78, 0.83], calculated using bootstrapping. The system works best for lesions larger than 27 mm, where it obtains an AUC value of 0.87. We have presented a Computer Aided Diagnosis (CADx) method to discriminate cysts from solid lesions in mammography using features from a deep Convolutional Neural Network (CNN) trained on a large set of mass candidates, obtaining an AUC of 0.80 on a set of diagnostic exams recalled from screening. We believe the system shows great potential and comes close to the performance of recently developed spectral mammography. We think the system can be further improved when more data and computational power become available.
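The abstract above reports an AUC together with a 95% confidence interval obtained by bootstrapping. A minimal sketch of how such numbers can be computed is shown below, in plain Python. It assumes a percentile bootstrap over cases, which is one common variant; the abstract does not say which bootstrap procedure the authors used. The function names, labels and scores are all hypothetical.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for positive (e.g. malignant mass), 0 for negative (e.g. cyst).
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (an assumed variant)."""
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if len(set(sample_labels)) < 2:
            continue  # resample lacks one class, AUC undefined: draw again
        estimates.append(auc(sample_labels, [scores[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up classifier outputs for 4 positive and 4 negative cases.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
print(auc(labels, scores))  # prints 0.8125 (13 of 16 positive/negative pairs ranked correctly)
low, high = bootstrap_auc_ci(labels, scores)
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```

With only eight cases the interval will be very wide; the narrow interval reported in the abstract reflects a much larger evaluation set.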

Pub.: 18 Jan '17, Pinned: 18 Apr '17

High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks

Abstract: Recent advances in deep learning for object recognition in natural images have prompted a surge of interest in applying a similar set of techniques to medical images. Most of the initial attempts largely focused on replacing the input to such a deep convolutional neural network from a natural image to a medical image. This, however, does not take into consideration the fundamental differences between these two types of data. More specifically, detection or recognition of an anomaly in medical images depends significantly on fine details, unlike object recognition in natural images where coarser, more global structures matter more. This difference makes it inadequate to use the existing deep convolutional neural network architectures, which were developed for natural images, because they rely on heavily downsampling an image to a much lower resolution to reduce the memory requirements. This hides details necessary to make accurate predictions for medical images. Furthermore, a single exam in medical imaging often comes with a set of different views which must be seamlessly fused in order to reach a correct conclusion. In our work, we propose to use a multi-view deep convolutional neural network that handles a set of more than one high-resolution medical image. We evaluate this network on large-scale mammography-based breast cancer screening (BI-RADS prediction) using 103 thousand images. We focus on investigating the impact of training set sizes and image sizes on the prediction accuracy. Our results highlight that performance clearly increases with the size of training set, and that the best performance can only be achieved using the images in the original resolution. This suggests the future direction of medical imaging research using deep neural networks is to utilize as much data as possible with the least amount of potentially harmful preprocessing.
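The abstract above argues that heavy downsampling hides the fine details that matter in medical images. The toy sketch below illustrates the effect in plain Python. It assumes average pooling as the downsampling step, which is an assumption made here for illustration, since the abstract does not name a specific method. The 8x8 patch and its single bright pixel are hypothetical stand-ins for a mammogram region containing a tiny finding.

```python
def average_pool(image, factor):
    """Downsample a 2-D image (list of rows) by averaging factor x factor blocks."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    pooled = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [
                image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            ]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled


# Hypothetical 8x8 patch: flat background with one bright pixel
# standing in for a fine detail.
image = [[0.0] * 8 for _ in range(8)]
image[2][5] = 1.0

small = average_pool(image, 4)  # 8x8 -> 2x2
peak_before = max(max(row) for row in image)
peak_after = max(max(row) for row in small)
print(peak_before, peak_after)  # prints "1.0 0.0625"
```

After a 4x reduction the detail's intensity drops to one sixteenth of its original value, which makes it much harder to tell apart from background variation. This is the kind of information loss the authors avoid by working at the original resolution.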

Pub.: 21 Mar '17, Pinned: 18 Apr '17