
dc.contributor.author	Gupta, Deepak Kumar
dc.contributor.author	Mago, Gowreesh
dc.contributor.author	Chavan, Arnav
dc.contributor.author	Prasad, Dilip K.
dc.contributor.author	Thomas, Rajat Mani
dc.date.accessioned	2025-03-20T09:57:52Z
dc.date.available	2025-03-20T09:57:52Z
dc.date.issued	2024
dc.description.abstract	Traditional deep learning models are trained and tested on relatively low-resolution images (< 300 px) and cannot be used directly on large-scale images due to compute and memory constraints. We propose Patch Gradient Descent (PatchGD), an effective learning strategy that allows existing CNN and transformer architectures (hereafter referred to as deep learning models) to be trained on large-scale images in an end-to-end manner. PatchGD is based on the hypothesis that, instead of performing gradient-based updates on an entire image at once, a good solution can be reached by performing model updates on only small parts of the image at a time, ensuring that the majority of the image is covered over the course of iterations. PatchGD thus offers substantially better memory and compute efficiency when training models on large-scale images. PatchGD is thoroughly evaluated on the PANDA, UltraMNIST, TCGA, and ImageNet datasets with ResNet50, MobileNetV2, ConvNeXtV2, and DeiT models under different memory constraints. Our evaluation clearly shows that PatchGD is much more stable and efficient than the standard gradient-descent method in handling large images, especially when compute memory is limited. Code is available at https://github.com/nyunAI/PatchGD.	en_US
dc.description	Source at https://www.jmlr.org/tmlr/index.html.	en_US
dc.identifier.citation	Gupta, Mago, Chavan, Prasad, Thomas. Pushing the Limits of Gradient Descent for Efficient Learning on Large Images. Transactions on Machine Learning Research (TMLR). 2024;2024	en_US
dc.identifier.cristinID	FRIDAID 2367408
dc.identifier.issn	2835-8856
dc.identifier.uri	https://hdl.handle.net/10037/36731
dc.language.iso	eng	en_US
dc.publisher	TMLR	en_US
dc.relation.journal	Transactions on Machine Learning Research (TMLR)
dc.relation.uri	https://openreview.net/pdf?id=6dS1jhdemD
dc.rights.accessRights	openAccess	en_US
dc.rights.holder	Copyright 2024 The Author(s)	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0	en_US
dc.rights	Attribution 4.0 International (CC BY 4.0)	en_US
dc.title	Pushing the Limits of Gradient Descent for Efficient Learning on Large Images	en_US
dc.type.version	publishedVersion	en_US
dc.type	Journal article	en_US
dc.type	Tidsskriftartikkel	en_US
dc.type	Peer reviewed	en_US
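
The abstract above describes the core PatchGD idea: rather than backpropagating through an entire large image at once, the model is updated from a small subset of patches per step while a running latent block retains an encoding of the full image. The following is a minimal PyTorch sketch of that idea under stated assumptions, not the authors' reference implementation (see https://github.com/nyunAI/PatchGD); the PatchEncoder, head, Z-block shape, and patchgd_step helper are illustrative names introduced here.

import torch
import torch.nn as nn

class PatchEncoder(nn.Module):
    # Hypothetical tiny patch encoder; the paper instead uses standard backbones
    # (ResNet50, MobileNetV2, ConvNeXtV2, DeiT).
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, x):               # x: (B, 3, p, p) -> (B, dim)
        return self.net(x).flatten(1)

def patchgd_step(image, label, encoder, head, z_block, optimizer,
                 patch_size=256, patches_per_step=4):
    # One inner update: encode a few randomly sampled patches, write them into
    # the running Z-block, and backpropagate only through those patches.
    _, _, H, W = image.shape
    rows, cols = H // patch_size, W // patch_size
    idx = torch.randperm(rows * cols)[:patches_per_step]
    z = z_block.detach().clone()        # entries filled in earlier steps carry no gradient
    for k in idx.tolist():
        r, c = divmod(k, cols)
        patch = image[:, :, r * patch_size:(r + 1) * patch_size,
                            c * patch_size:(c + 1) * patch_size]
        z[:, :, r, c] = encoder(patch)  # only these entries are differentiable
    loss = nn.functional.cross_entropy(head(z.flatten(1)), label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item(), z.detach()

# Toy usage on a 2048 x 2048 "large" image batch.
dim, patch_size, num_classes = 64, 256, 10
rows = cols = 2048 // patch_size
encoder, head = PatchEncoder(dim), nn.Linear(dim * rows * cols, num_classes)
optimizer = torch.optim.SGD(list(encoder.parameters()) + list(head.parameters()), lr=0.01)
image = torch.randn(2, 3, 2048, 2048)
label = torch.randint(0, num_classes, (2,))
z_block = torch.zeros(2, dim, rows, cols)
for _ in range(4):                      # inner iterations gradually cover more of the image
    loss, z_block = patchgd_step(image, label, encoder, head, z_block, optimizer)

In the published method the classification head and the number of inner iterations are chosen so that most of the image is covered before the Z-block is refreshed; see the linked repository and the OpenReview PDF above for the exact procedure and hyperparameters.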

