Improving Crohn’s disease lesion detection in capsule endoscopy with an advanced feature pyramid network

Petagna L, Antonelli A, Ganini C et al (2020) Pathophysiology of Crohn’s disease inflammation and recurrence. Biol Direct 15(1):23. https://doi.org/10.1186/s13062-020-00280-5

Marin-Santos D, Contreras-Fernandez J, Perez-Borrero I et al (2023) Automatic detection of Crohn disease in wireless capsule endoscopic images using a deep convolutional neural network. Appl Intell 53(10):12632-12646. https://doi.org/10.1007/s10489-022-04146-3

Ren S, He K, Girshick R, et al (2015) Faster R-CNN: Towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst 28

Zhu X, Su W, Lu L, et al (2020) Deformable detr: deformable transformers for end-to-end object detection. arXiv preprint arXiv:2010.04159

Lin TY, Dollar P, Girshick R, et al (2017) Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 2117–2125

Liu S, Qi L, Qin H, et al (2018) Path aggregation network for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 8759–8768

Ma X, Dai X, Yang J, et al (2024) Efficient modulation for vision networks. In: The twelfth international conference on learning representations. https://openreview.net/forum?id=ip5LHJs6QX

Ma J, Chen B (2020) Dual refinement feature pyramid networks for object detection. arXiv preprint arXiv:2012.01733

Xu W, Wan Y (2024) Ela: Efficient local attention for deep convolutional neural networks. arXiv preprint arXiv:2403.01123

Liu S, Huang D, Wang Y (2019) Learning spatial fusion for single-shot object detection. arXiv preprint arXiv:1911.09516

Ghiasi G, Lin TY, Le QV (2019) Nas-fpn: Learning scalable feature pyramid architecture for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp 7036–7045

Yang G, Lei J, Zhu Z, et al (2023) Afpn: asymptotic feature pyramid network for object detection. In: 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp 2184–2189

Ouyang D, He S, Zhang G et al (2023) Efficient multi-scale attention module with cross-spatial learning. ICASSP 2023–2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, pp 1–5

He K, Zhang X, Ren S, et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

Huang G, Liu Z, Van Der Maaten L, et al (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708

Liu Z, Mao H, Wu CY, et al (2022) A convnet for the 2020s. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 11976–11986

Guo MH, Lu CZ, Liu ZN et al (2023) Visual attention network. Comput Vis Media 9(4):733–752

Liu Z, Lin Y, Cao Y, et al (2021) Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 10012–10022

Ding X, Zhang X, Han J et al (2022) Scaling up your kernels to 31x31: Revisiting large kernel design in CNNs. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp 11963–11975

Liu S, Chen T, Chen X, et al (2023) More convnets in the 2020s: scaling up kernels beyond 51 × 51 using sparsity. In: Proceedings of the International Conference on Learning Representations (ICLR)

Lou M, Zhou HY, Yang S, et al (2023) Transxnet: learning both global and local dynamics with a dual dynamic token mixer for visual recognition. arXiv preprint arXiv:2310.19380

Ma X, Dai X, Bai Y, et al (2024) Rewrite the stars. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp 5694–5703

Hou Q, Zhou D, Feng J (2021) Coordinate attention for efficient mobile network design. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 13713–13722

Wang CY, Liao HYM, Wu YH, et al (2020) Cspnet: a new backbone that can enhance learning capability of CNN. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops, pp 390–391

Chollet F (2017) Xception: deep learning with depthwise separable convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1251–1258

Chen J, Kao S, He H et al (2023) Run, don’t walk: Chasing higher flops for faster neural networks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, pp 12021–12031

Szegedy C, Ioffe S, Vanhoucke V, et al (2017) Inception-v4, inception-ResNet and the impact of residual connections on learning. In: Proceedings of the AAAI conference on artificial intelligence

Lin TY, Maire M, Belongie S, et al (2014) Microsoft coco: common objects in context. In: Computer vision-ECCV 2014: 13th European conference, Zurich, Sept 6–12, 2014, Proceedings, Part V 13, Springer, pp 740–755

Paszke A, Gross S, Massa F, et al (2019) Pytorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems (NeurIPS)

Chen K, Wang J, Pang J, et al (2019) MMDetection: open mmlab detection toolbox and benchmark. arXiv preprint arXiv:1906.07155

Loshchilov I, Hutter F (2017) Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101

Deng J, Dong W, Socher R et al (2009) Imagenet: a large-scale hierarchical image database. 2009 IEEE conference on computer vision and pattern recognition. IEEE, pp 248–255

He K, Zhang X, Ren S, et al (2015) Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE international conference on computer vision, pp 1026–1034

Sun P, Zhang R, Jiang Y, et al (2021) Sparse R-CNN: end-to-end object detection with learnable proposals. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 14454–14463

Peng Y, Zhang Y, Tu B et al (2022) Spatial-spectral transformer with cross-attention for hyperspectral image classification. IEEE Trans Geosci Remote Sens 60:1–15

Zhang H, Chang H, Ma B, et al (2020) Dynamic R-CNN: Towards high quality object detection via dynamic training. In: Computer vision-ECCV 2020: 16th European conference, Glasgow, Aug 23–28, 2020, Proceedings, Part XV 16, Springer, pp 260–275

Liu S, Li F, Zhang H, et al (2022) Dab-detr: dynamic anchor boxes are better queries for detr. arXiv preprint arXiv:2201.12329

Meng D, Chen X, Fan Z, et al (2021) Conditional detr for fast training convergence. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 3651–3660

Zhang H, Li F, Liu S, et al (2022) Dino: Detr with improved denoising anchor boxes for end-to-end object detection. arXiv preprint arXiv:2203.03605

Zhang S, Wang X, Wang J, et al (2023) Dense distinct query for end-to-end object detection. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 7329–7338

Pang J, Chen K, Shi J, et al (2019) Libra R-CNN: towards balanced learning for object detection. In: Proceedings of IEEE conference on computer vision and pattern recognition, pp 821–830

Wang J, Chen K, Xu R, et al (2019) Carafe: content-aware reassembly of features. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 3007–3016

Chen K, Cao Y, Loy CC, et al (2020) Feature pyramid grids. arXiv preprint arXiv:2004.03580
