The Horse Herd Optimization with Lion Optimization Algorithm (HHO-LOA) addresses the limitations of previous works by optimizing the LSTM classifier for lung cancer image classification. It enhances the training process by selecting the best parameters, reducing underfitting, and improving classification accuracy. The DCNN also extracts pathological features more effectively, mitigating issues such as feature redundancy and spurious patterns. This combined approach ensures robust and accurate detection of lung cancer across diverse datasets, overcoming the scalability, feature extraction, and generalization challenges seen in earlier studies. Table 1 compares existing research on lung cancer classification.
Classifying lung cancer is essential for assessing the disease and determining appropriate treatment based on its type. DL, a subfield of ML, has recently demonstrated exceptional performance, particularly in classification and segmentation tasks for CT image analysis. However, selecting suitable parameters and preprocessing methods to improve classifier performance remains challenging. A Hybrid Optimized DNN (HODNN) approach for optimal FS and accurate classification is presented, combining hybrid techniques for FS and classification. The overall workflow of the proposed approach is shown in Fig. 1.
Figure 1 illustrates the complete workflow of the LCC approach, encompassing four phases: pre-processing, segmentation, DCNN-based feature extraction, and classification. As a novel contribution, parameter tuning is performed with a hybrid algorithm combining HHO and LOA: the hyperparameters of the LSTM classifier are optimized to significantly improve model performance.
We opted for a single-slice DCNN model rather than a 3D CNN or volumetric transformer model primarily because of limitations in computational resources and pragmatic concerns regarding clinical deployment. Although 3D models can learn spatial continuity across slices, they are considerably more expensive in terms of GPU memory and compute, and are not viable in resource-constrained clinical environments. Additionally, excessive inter-slice variation in CT volumes and non-uniform slice thickness between cohorts add noise to 3D modeling. Our method leverages optimally chosen, diagnostically meaningful axial slices with feature enhancement driven by hybrid HHO-LOA optimization, enabling effective nodule-level classification without sacrificing computational efficiency. We acknowledge the potential of volumetric and transformer-based models and defer them to future work once the computational and annotation hurdles are resolved.
Dataset link: https://www.kaggle.com/datasets/hgunraj/cancer-net-pca-data.
This study used the SPIE-AAPM-NCI Lung CT Challenge dataset, which is publicly available and contains chest CT scans with lung nodule annotations by expert radiologists. The dataset contains volumetric CT images of patients with suspected or confirmed lung cancer, annotated with nodule boundaries and malignancy labels. For our classification task, we limited the analysis to nodules with well-defined labels, i.e., nodules determined to be either benign or malignant by the experts using histopathological and radiological criteria. We performed additional curation of the publicly available dataset to exclude samples with indeterminate or unknown lesion labels. The images were denoised using adaptive denoising filters, then resampled and normalized. The relevant axial slices centered on the nodule area were extracted and cropped to 224 × 224 pixels. After this preparation stage, the dataset was divided into a 70% training split and 15% each for the validation and test splits, keeping a representative mix of benign and malignant cases across splits. Furthermore, we used standard data augmentation techniques, such as rotation, flipping, and contrast adjustment, to improve generalization and mitigate overfitting. In this method, 80% of the nodules are randomly chosen for the training dataset, and the remaining 20% are reserved for the test dataset (Fig. 2).
The lung cancer CT image dataset includes both benign and malignant sections for classification. These images undergo preprocessing, where adaptive noise-removal filter techniques are applied. Adaptive filtering is designed to improve image contrast while enhancing overall image quality. This technique eliminates noise by suppressing low- or high-frequency pixels and highlighting image edges. As a non-linear filter, adaptive filtering effectively removes noise from lung images by replacing noisy pixels with the median value of the surrounding pixels, sorted by grey level. When the adaptive filter is applied to the input image $f(x,y)$, the filtered output $\hat{f}(x,y)$ is given by Eq. (1).
In Eq. (1), the original and adaptive-filtered images are denoted as $f(x,y)$ and $\hat{f}(x,y)$, respectively, and $H$ is a two-dimensional mask. The final preprocessed image $\hat{f}(x,y)$ is then subjected to lung segmentation.
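To make this preprocessing step concrete, the following is a minimal Python sketch of median-based adaptive filtering and edge detection, assuming a 2D grayscale CT slice stored as a NumPy array; the 3 × 3 mask size and the Sobel edge operator are illustrative assumptions rather than the exact settings used here.

```python
import numpy as np
from scipy import ndimage

def preprocess_slice(image: np.ndarray, mask_size: int = 3):
    """Denoise a CT slice with a median filter and detect edges.

    Each pixel is replaced by the median of its neighborhood (the
    2-D mask H of Eq. (1)), which suppresses impulse noise while
    preserving lesion boundaries.
    """
    # Median filtering: the non-linear noise-removal step.
    denoised = ndimage.median_filter(image, size=mask_size)

    # Edge detection (illustrative): gradient magnitude from Sobel filters.
    gx = ndimage.sobel(denoised, axis=0)
    gy = ndimage.sobel(denoised, axis=1)
    edges = np.hypot(gx, gy)

    return denoised, edges
```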
Figure 3 illustrates the original and preprocessed image results. It shows the original lung scan image, noise-filtered image result, and edge-detected image result. The preprocessed images are utilized to detect lung cancer.
Adaptive dual-thresholding is used for segmentation. Empirically determined intensity thresholds in the range of 90-140 HU are applied to pixel intensities to detect possible lesions. These values correspond to soft-tissue radiodensities across a wide variety of CT scanners and were tuned to identify nodule edges without over-segmentation. The method remains adequate across scanner types for two reasons: histogram equalization normalizes the intensity distribution prior to segmentation, and threshold calibration was performed on validation images acquired from CT machines of various vendors to ensure generalizability.
While more sophisticated techniques are available, adaptive thresholding is computationally inexpensive and precise enough for preliminary lesion-boundary localization in our pipeline; the subsequent DCNN layers further refine feature learning. Pixel grouping via thresholding is defined as follows:
Eq. (2) represents the colored segmented image $S(x,y)$, in which regions are formed by grouping pixels whose intensities fall within the matching threshold range.
Figure 4 demonstrates segmentation results for different pixel intensity ranges (50-200), showing how lesion regions are effectively separated from normal lung tissue. The segmented images serve as input for feature extraction and lesion classification models, aiding in lung cancer detection. This pixel thresholding-based segmentation technique ensures accurate lesion isolation, providing a crucial foundation for further diagnostic analysis.
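As a concrete illustration, here is a minimal Python sketch of the dual-threshold pixel grouping, assuming the slice is already converted to Hounsfield units; the simple equalization step and the connected-component grouping mirror the pipeline described above, while the 90-140 HU window comes from the text.

```python
import numpy as np
from scipy import ndimage

def segment_lesions(hu_slice: np.ndarray, lo: float = 90.0, hi: float = 140.0):
    """Dual-threshold segmentation of candidate lesion regions.

    1. Histogram-equalize intensities to normalize across scanners.
    2. Keep pixels inside the [lo, hi] HU window.
    3. Group the surviving pixels into connected regions (Eq. (2)).
    """
    # Simple histogram equalization via the empirical CDF.
    flat = hu_slice.ravel()
    sorted_vals = np.sort(flat)
    equalized = (np.searchsorted(sorted_vals, flat) / flat.size).reshape(hu_slice.shape)

    # Threshold the original HU slice so the window stays physically meaningful;
    # the equalized image is kept for downstream normalization.
    mask = (hu_slice >= lo) & (hu_slice <= hi)

    # Connected-component labeling groups matched pixels into regions.
    labels, n_regions = ndimage.label(mask)
    return equalized, labels, n_regions
```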
Deep Convolutional Neural Networks (DCNNs) have been applied extensively in lung cancer classification because of their strong capacity to automatically learn spatial feature hierarchies from medical images such as CT scans. DCNNs use stacks of convolutional filters to extract informative features such as nodules, textures, and patterns that indicate cancer. These features are then passed through pooling and fully connected layers for final classification. DCNNs outperform traditional methods by reducing manual feature engineering and improving diagnostic accuracy.
Hyperparameter tuning is a crucial step in optimizing DL models. Here, we use HHO-LOA to fine-tune the hyperparameters of an LSTM classifier for classifying lung cancer CT scan images. The workflow consists of two primary stages: a feature extraction phase using the DCNN and a hyperparameter tuning phase using HHO-LOA. In phase 1, the DCNN extracts discriminative features from lung CT scan images, which are passed to an LSTM classifier for final classification. Phase 2 performs parameter tuning: it optimizes key hyperparameters of the LSTM model, namely the number of LSTM units, learning rate, batch size, dropout rate, and weight decay. The optimization process aims to improve classification accuracy while reducing computational complexity.
The DCNN is essential for processing CT scan images and extracting meaningful features for lung cancer classification. It employs convolution operations to identify crucial patterns such as edges, textures, and structural details within the images.
The feature map $F^{(l)}$ at layer $l$ is given in Eq. (3) as $F^{(l)} = \sigma\left(W^{(l)} * F^{(l-1)} + b^{(l)}\right)$, where $W^{(l)}$ denotes the convolutional kernel, $F^{(l-1)}$ is the input image or the feature map from the previous layer, and $b^{(l)}$ is the bias term. The activation function $\sigma(\cdot)$, typically ReLU, introduces non-linearity to enhance feature extraction. Pooling layers are then applied to refine the extracted features and reduce the spatial dimensions, using either max pooling or average pooling.
The pooled feature map $P^{(l)}$ at layer $l$ is obtained via max pooling or average pooling using Eq. (4) and Eq. (5), respectively. Pooling retains the most important features while reducing computational complexity and preventing overfitting. Once the DCNN has extracted the significant features, they are passed to an LSTM classifier, which exploits sequential dependencies in the data to perform the final classification of lung cancer images. Table 2 shows the DCNN architecture overview.
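For illustration, a minimal PyTorch sketch of such a convolutional feature extractor is given below; the layer widths and two-block depth are assumptions for exposition, since Table 2 (not reproduced here) defines the actual architecture.

```python
import torch
import torch.nn as nn

class DCNNFeatureExtractor(nn.Module):
    """Convolutional feature extractor following Eqs. (3)-(5):
    conv + ReLU (Eq. 3) followed by max pooling (Eq. 4)."""

    def __init__(self, in_channels: int = 1):
        super().__init__()
        self.features = nn.Sequential(
            # Block 1: small 3x3 kernels with padding preserve spatial detail.
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),   # Eq. (4): max pooling
            # Block 2: deeper filters capture higher-level nodule patterns.
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Input: (batch, 1, 224, 224) CT slice -> (batch, 64, 56, 56) features.
        return self.features(x)
```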
The proposed DCNN model is highly effective for spatial feature extraction from CT nodules because its layered convolutional structure detects low- to high-level patterns at various resolutions. Small kernel sizes with padding maintain fine spatial detail, while the dimension-reducing pooling layers preserve useful region-based information. This enables the network to localize well and to differentiate malignant from benign nodules based on shape, texture, and boundary changes.
The LSTM classifier is well suited to handling the CT image features. The CNN-extracted features form a sequential pattern, which the LSTM learns effectively. The LSTM retains important features across multiple steps, ensuring that relevant patterns influence the classification decision. Unlike standard RNNs, the LSTM's gating mechanisms prevent information loss over long sequences. Table 3 shows the LSTM head overview.
The incorporation of the LSTM head enables the model to learn long-distance dependencies by propagating sequential information across the spatially separated features obtained by the DCNN. This temporal memory helps the model learn contextual associations between nodule features, enhancing classification accuracy. It can capture decision patterns spanning multiple areas of the CT volume that plain CNNs may fail to capture.
The extracted features at time step $t$ are processed sequentially by LSTM units, which update their states through gating mechanisms.
The forget gate is given in Eq. (6), $f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right)$; it determines which information from the previous cell state should be retained or discarded.
The input gate is given in Eq. (7), $i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right)$; it determines which new information should be stored in the cell state.
The candidate cell state is given in Eq. (8), $\tilde{C}_t = \tanh\left(W_C [h_{t-1}, x_t] + b_C\right)$; it computes new candidate values to update the cell state.
The cell state update is given in Eq. (9), $C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$; it combines the forget and input gates to update the memory.
The output gate is given in Eq. (10), $o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right)$; it determines the output at the current time step.
The hidden state update is given in Eq. (11), $h_t = o_t \odot \tanh(C_t)$. Here, $f_t$, $i_t$, and $o_t$ are the forget, input, and output gates, respectively; $C_t$ is the cell state at time $t$, $h_t$ the hidden state, and the $W$ and $b$ terms the weight matrices and biases.
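To make Eqs. (6)-(11) concrete, here is a minimal NumPy sketch of a single LSTM cell step; the weight shapes are illustrative, and the sigmoid/tanh gates follow the standard formulation above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step implementing Eqs. (6)-(11).

    W: dict of weight matrices {f, i, c, o}, each (hidden, hidden + input).
    b: dict of bias vectors {f, i, c, o}, each (hidden,).
    """
    z = np.concatenate([h_prev, x_t])            # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])           # Eq. (6): forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])           # Eq. (7): input gate
    c_hat = np.tanh(W["c"] @ z + b["c"])         # Eq. (8): candidate state
    c_t = f_t * c_prev + i_t * c_hat             # Eq. (9): cell state update
    o_t = sigmoid(W["o"] @ z + b["o"])           # Eq. (10): output gate
    h_t = o_t * np.tanh(c_t)                     # Eq. (11): hidden state
    return h_t, c_t
```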
After the entire sequence of extracted features has been processed, the final hidden state is passed through a softmax layer to determine the probability of the image being normal or cancerous. The softmax function (Eq. (12)), $P(y = c \mid x) = e^{z_c} / \sum_{j} e^{z_j}$, ensures that the output probabilities sum to 1, allowing the model to classify the CT scan image as either normal or cancerous.
By combining CNN for feature extraction and LSTM for classification, the model efficiently distinguishes between normal and cancerous lung CT scans. CNN extracts spatial features, while LSTM captures the temporal dependencies within them, leading to an accurate and robust lung cancer detection system.
Although LSTMs are traditionally used for sequential data, their use in image classification is justified when spatial or structural relationships are reformulated as sequences. In our case, once high-level spatial features are obtained from the convolutional layers, they are flattened into a sequence of vectors; the sequence naturally represents the spatial progression across rows or patches of the image. The LSTM then learns long-range dependencies along this spatial sequence, so the model can capture patterns linking distant areas of the image that simple CNN classifiers may miss. This is especially effective for lung CT scans, in which lesions may have weak spatial distinction or lie in non-local areas.
Furthermore, the LSTM enhances feature interpretability by recalling past patterns while focusing on the current region, which helps distinguish benign from malignant structures, especially in noisy or partially segmented regions. In short, the LSTM is not used to process raw images but to perform sequence-aware interpretation of the CNN-extracted features, adding another layer of contextual insight.
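The following PyTorch sketch shows how this reformulation can look in practice, assuming the feature extractor sketched earlier; treating each spatial row of the feature map as one sequence element is our illustrative choice, and the hidden size of 5 echoes the LSTM size mentioned for Fig. 5.

```python
import torch
import torch.nn as nn

class CNNLSTMClassifier(nn.Module):
    """CNN features reinterpreted as a spatial sequence for an LSTM head."""

    def __init__(self, feat_channels: int = 64, feat_size: int = 56,
                 hidden: int = 5, n_classes: int = 2):
        super().__init__()
        self.extractor = DCNNFeatureExtractor()    # sketched earlier
        # Each row of the feature map becomes one sequence element.
        self.lstm = nn.LSTM(input_size=feat_channels * feat_size,
                            hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        f = self.extractor(x)                              # (B, C, H, W)
        b, c, h, w = f.shape
        seq = f.permute(0, 2, 1, 3).reshape(b, h, c * w)   # rows -> sequence
        _, (h_n, _) = self.lstm(seq)                       # final hidden state
        logits = self.head(h_n[-1])
        # Eq. (12): softmax turns logits into class probabilities.
        # (For training with CrossEntropyLoss, return logits instead.)
        return torch.softmax(logits, dim=-1)
```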
The HHO algorithm is inspired by horse herding behavior and is used to explore the search space efficiently. It consists of exploration and exploitation phases.
In the exploration phase, the horses move randomly within the hyperparameter space according to Eq. (13). Here, $X_i^{(t)}$ is the position of the $i$-th horse at iteration $t$, $X_{best}^{(t)}$ is the best horse (the best hyperparameter set found so far), $X_{rand}^{(t)}$ is a randomly selected horse, and $r_1$ and $r_2$ are control parameters.
In the exploitation phase, the best horses refine their positions using Eq. (14), in which a local search factor controls the step size of the refinement.
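A simplified NumPy sketch of these two HHO update rules is shown below; since Eqs. (13)-(14) are not reproduced in this section, the exact forms used here, with control parameters `r1`, `r2` and local search factor `delta`, are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def hho_explore(X, X_best, r1=0.5, r2=0.5):
    """Exploration (cf. Eq. (13)): move each horse toward the best horse
    and a randomly chosen herd member, with random perturbation."""
    n, d = X.shape
    X_rand = X[rng.integers(n, size=n)]          # randomly selected horses
    step = r1 * (X_best - X) + r2 * (X_rand - X)
    return X + rng.random((n, d)) * step

def hho_exploit(X_elite, X_best, delta=0.1):
    """Exploitation (cf. Eq. (14)): best horses take small local steps
    around the global best, scaled by the local search factor delta."""
    noise = rng.normal(size=X_elite.shape)
    return X_elite + delta * noise * (X_best - X_elite)
```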
The LOA is inspired by lion social behavior and uses two primary strategies: roaring (exploration) and hunting (exploitation).
The roaring strategy (exploration) is expressed in Eq. (15), in which a roaring coefficient controls the intensity of exploration. The hunting strategy (exploitation) is expressed in Eq. (16), in which a hunting coefficient adjusts the convergence rate.
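In the same spirit, a minimal NumPy sketch of the two LOA moves follows; since Eqs. (15)-(16) are not reproduced here, the roaring coefficient `rho` and hunting coefficient `kappa` below are illustrative names and forms.

```python
import numpy as np

rng = np.random.default_rng(1)

def loa_roar(X, bounds, rho=0.3):
    """Roaring (cf. Eq. (15)): broad random jumps scaled by the roaring
    coefficient rho to explore new regions of the search space."""
    lo, hi = bounds
    jump = rho * (hi - lo) * rng.uniform(-1, 1, size=X.shape)
    return np.clip(X + jump, lo, hi)

def loa_hunt(X, X_prey, kappa=0.7):
    """Hunting (cf. Eq. (16)): lions close in on the prey (the best
    solution found so far); kappa adjusts the convergence rate."""
    return X + kappa * rng.random(X.shape) * (X_prey - X)
```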
The fitness function plays a crucial role in optimizing the hyperparameters of the LSTM classifier by evaluating the performance of each candidate hyperparameter set. The hybrid HHO-LOA employs a single fitness function to guide both algorithms. By combining HHO (for diverse exploration) with LOA (for effective exploitation), the hybrid HHO-LOA balances global and local search during hyperparameter tuning.
The fitness function is defined in Eq. (17) as $F(X) = \mathrm{Accuracy}(X) - \lambda \cdot \mathrm{Cost}(X)$, where $\lambda$ is a trade-off parameter and $F(X)$ is the fitness score of a candidate hyperparameter set $X$. $\mathrm{Accuracy}(X)$ represents the classification accuracy of the LSTM classifier with hyperparameters $X$, and $\mathrm{Cost}(X)$ measures the computational burden, including time and memory usage. The trade-off parameter $\lambda$ adjusts the importance of reducing computational cost relative to maximizing accuracy.
The parameter optimization focuses on three factors: accuracy maximization, computational cost control, and the balance between the two set by $\lambda$. HHO (exploration phase) generates a variety of hyperparameter sets, evaluates their performance using $F(X)$, and filters out low-accuracy or high-cost solutions. LOA (exploitation phase) fine-tunes the best-performing hyperparameter sets, adjusting parameters to enhance $\mathrm{Accuracy}(X)$ while keeping computational cost under control; this phase yields a locally optimized set of hyperparameters. The hybrid algorithm iterates until it discovers a hyperparameter configuration that maximizes $F(X)$, so the final selection achieves high classification accuracy with minimal resource usage.
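Putting the two phases together, the following sketch (building on the hedged HHO/LOA helpers above) outlines one plausible form of the hybrid loop; the population size, iteration count, and top-k elite split are assumptions.

```python
import numpy as np

def hho_loa_search(fitness, dim, bounds, pop=20, iters=50, top_k=5, seed=2):
    """Hybrid HHO-LOA loop: HHO explores the herd, LOA exploits the elite."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, size=(pop, dim))          # initial population
    for _ in range(iters):
        scores = np.array([fitness(x) for x in X])
        best = X[scores.argmax()].copy()              # best hyperparameter set

        # HHO exploration over the whole population.
        X = np.clip(hho_explore(X, best), lo, hi)

        # LOA exploitation on the top-k candidates (elite refinement).
        elite = scores.argsort()[-top_k:]
        X[elite] = np.clip(loa_hunt(X[elite], best), lo, hi)

    scores = np.array([fitness(x) for x in X])
    return X[scores.argmax()], scores.max()
```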
The first term in Eq. (17), $\mathrm{Accuracy}(X)$, ensures that the primary goal of hyperparameter tuning is to maximize classification accuracy on lung cancer images. The HHO phase explores different hyperparameter combinations to find the best candidates for high accuracy, and the LOA phase refines these candidates to further boost accuracy while avoiding overfitting. The second term, $\lambda \cdot \mathrm{Cost}(X)$, penalizes computationally expensive models. Computational cost is driven by the number of LSTM units (more units increase memory usage), the batch size (larger batches require more computation), the learning rate (which affects the number of training iterations), the dropout rate (which affects model complexity), and the weight decay (a regularization term controlling overfitting). By subtracting computational cost from accuracy, the fitness function favors efficient models that achieve high accuracy with lower resource consumption. The value of $\lambda$ determines the balance between accuracy and computational efficiency: if $\lambda$ is too small, the algorithm prioritizes accuracy and may select very complex models; if $\lambda$ is too large, it favors low-cost models, possibly at the expense of accuracy. Tuning $\lambda$ therefore ensures an optimal balance, allowing HHO-LOA to find a hyperparameter set that performs well without excessive computational overhead. In sum, the fitness function ensures that hyperparameter tuning is not just about maximizing accuracy but also about keeping computational cost manageable; the trade-off parameter $\lambda$ lets the algorithm strike a balance between performance and efficiency, leading to an optimal LSTM classifier for lung cancer classification with minimal resource consumption.
To assess classification performance at the nodule level, ROC and precision-recall curves were generated from the per-nodule predicted probabilities. AUC values were then computed to measure the model's ability to discriminate malignant from benign nodules, and 95% confidence intervals for the AUC were estimated via bootstrapping.
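For reproducibility, a short Python sketch of this evaluation using scikit-learn is given below; the 1,000 bootstrap resamples and the percentile interval are assumptions, as they are not stated above.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, precision_recall_curve

def auc_with_ci(y_true, y_prob, n_boot=1000, seed=0):
    """Nodule-level AUC with a bootstrapped 95% percentile CI."""
    rng = np.random.default_rng(seed)
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    auc = roc_auc_score(y_true, y_prob)
    boot = []
    n = len(y_true)
    while len(boot) < n_boot:
        idx = rng.integers(n, size=n)          # resample nodules with replacement
        if len(np.unique(y_true[idx])) < 2:
            continue                           # AUC needs both classes present
        boot.append(roc_auc_score(y_true[idx], y_prob[idx]))
    lo, hi = np.percentile(boot, [2.5, 97.5])
    prec, rec, _ = precision_recall_curve(y_true, y_prob)
    return auc, (lo, hi), (prec, rec)
```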
To jointly maximize classification accuracy and computational efficiency, we formulated a multi-objective fitness function that balances these competing objectives through a trade-off parameter λ. We define the fitness function as $F(X) = \mathrm{Accuracy}(X) - \lambda \cdot \mathrm{Cost}(X)$ (Eq. (17)).
Here, $\mathrm{Accuracy}(X)$ is derived from validation performance, and $\mathrm{Cost}(X)$ is derived from the estimated number of floating-point operations (FLOPs) per forward pass, normalized to the range [0, 1]. In the experiments described in this paper, λ was selected empirically by cross-validation: a grid search over the values {0.1, 0.3, 0.5, 0.7, 0.9} indicated that λ = 0.7 yielded the best trade-off, allowing high accuracy while limiting computational cost. Having explored the trade-off, we fixed λ during optimization to keep runs consistent. We also experimented with an adaptive λ but ultimately did not use it, as it produced instability early in the hybrid optimization step; future work may update λ based on learning-rate scheduling or model confidence.
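A minimal sketch of this fitness evaluation follows; `validation_accuracy` and `estimate_flops` are hypothetical helpers standing in for the model-specific measurement code.

```python
LAMBDA = 0.7  # trade-off chosen by grid search over {0.1, 0.3, 0.5, 0.7, 0.9}

def fitness(hparams, validation_accuracy, estimate_flops, max_flops):
    """Eq. (17): F(X) = Accuracy(X) - lambda * Cost(X).

    hparams: candidate hyperparameter set X (units, lr, batch, dropout, decay).
    Cost is FLOPs per forward pass, normalized to [0, 1] by max_flops.
    """
    acc = validation_accuracy(hparams)           # in [0, 1]
    cost = estimate_flops(hparams) / max_flops   # normalized to [0, 1]
    return acc - LAMBDA * cost
```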
To guard against overfitting despite the high training accuracy, we used stratified k-fold cross-validation (k = 5) together with regularization methods, namely dropout (0.2), batch normalization, and early stopping, ending training at convergence. Performance was consistent across folds, with little variation (σ < 0.3%), supporting the generalizability of the model.
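The cross-validation regime can be sketched with scikit-learn as follows; `train_and_evaluate` is a hypothetical stand-in for the full training loop with dropout, batch normalization, and early stopping.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cross_validate(X, y, train_and_evaluate, k=5, seed=42):
    """Stratified k-fold CV: each fold keeps the benign/malignant ratio."""
    skf = StratifiedKFold(n_splits=k, shuffle=True, random_state=seed)
    fold_acc = []
    for train_idx, val_idx in skf.split(X, y):
        acc = train_and_evaluate(X[train_idx], y[train_idx],
                                 X[val_idx], y[val_idx])
        fold_acc.append(acc)
    return np.mean(fold_acc), np.std(fold_acc)   # report mean and sigma
```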
Figure 5 shows the DL scheme that combines the Horse Herd Optimization Algorithm (HHO) with the Lion Optimization Algorithm (LOA) to achieve better classification results. Input images pass through the Convolutional Neural Network (CNN), whose convolutional and dense layers perform feature extraction. The extracted features feed through a max pooling layer and then into an LSTM (Long Short-Term Memory) layer of size 5, which helps identify sequential patterns. The LSTM output is then flattened and fed through a softmax layer to classify the input as normal or abnormal. The HHO-LOA optimization adjusts the model's hyperparameters to improve the accuracy and efficiency of the classification task.
The pseudo-code for the optimized deep model above describes the step-by-step procedure of the lung cancer detection approach. The performance analysis of the proposed method is given in the next section.
Figure 6 presents the hybrid DL and optimization framework for classifying CT images as normal or cancerous. The framework starts with feature extraction by a DCNN (Deep Convolutional Neural Network), followed by initialization of the DCNN and LSTM models and hybrid hyperparameter optimization with the Horse Herd Optimization (HHO) and Lion Optimization Algorithm (LOA) (HHO-LOA) for model tuning. During optimization, HHO and LOA proceed through their exploration and exploitation phases, evaluating candidate solutions and updating them iteratively until the convergence criteria are met. Once the model is tuned, feature extraction is performed and the LSTM classifier uses the optimal parameters to classify the features. After classification, performance is evaluated and the final classified image is returned.