Institutional Repository

Few-shot learning for image classification and object detection

dc.contributor.advisor Sanders, I. D.
dc.contributor.author Zimudzi, Edward
dc.date.accessioned 2022-12-06T11:10:34Z
dc.date.available 2022-12-06T11:10:34Z
dc.date.issued 2021-11
dc.identifier.uri https://hdl.handle.net/10500/29677
dc.description.abstract Deep learning has been applied successfully in computer vision, including image classification, object recognition and detection, and image segmentation, in applications such as remote sensing, scene understanding, autonomous driving, medical image analysis, robotics and video surveillance. The drawback of most current approaches is that they demand huge quantities of annotated training data to produce good results, and they require expensive computing resources. Data annotation is usually an expensive and tedious task. Moreover, data can be rare or difficult to gather for reasons that include safety and ethical concerns. Furthermore, a deep learning model trained successfully for a specific task cannot be directly deployed for another task in another domain. It is therefore essential to develop models that can learn from few annotated training samples, as humans do. Few-shot learning aims to close the gap between deep learning models that learn from huge annotated datasets and humans in the challenging task of learning from few examples. The aim of this thesis is to propose novel deep learning methods for image processing that optimise the model’s ability to detect and recognise new object instances from few labelled examples. We present several novel methods that tackle the problems of image classification, object detection, self-supervised knowledge distillation, and panoptic segmentation in few-shot learning settings. Even though multiple computer vision themes can be identified throughout this work, the most important is the limited data regime taken into account. We consider the few-shot learning setting in which tasks, each with its support and query data, are received and trained on in episodes.
First, we introduce a novel few-shot meta-learning classification model that consists of multiple learners supervised by a central controller, which controls feature extraction and meta-learning for integrated inference and generalisation. Secondly, we introduce an approach for few-shot object detection that meta-learns object localisation and classification by eliminating region-wise prediction and encoding support images and query images simultaneously into class-specific feature representations, which are fed automatically into a class-agnostic decoder to generate output predictions for the categories known beforehand. We also introduce a fully convolutional model for panoptic segmentation in few-shot settings that encodes each instance into a specific kernel and generates a prediction directly by convolution, thereby predicting both instance objects and background stuff together. In this way, the instance-aware and semantically consistent properties of object instances and their background, respectively, can both be satisfied in a unified workflow. Finally, we introduce a two-stage knowledge distillation model that maximises the entropy of the feature embeddings of images using a self-supervised auxiliary loss. Experiments on public few-shot learning benchmark datasets such as miniImageNet, Omniglot, COCO-20i and Mapillary Vistas demonstrate the effectiveness of the proposed methods for few-shot learning in computer vision. en
dc.description.abstract (Translated from isiZulu) This thesis presents and develops several novel models that tackle the computer vision problems of image classification, object detection, self-supervised knowledge distillation and panoptic segmentation in few-shot learning settings. Few-shot learning aims to close the gap between deep learning models that learn from large annotated datasets and humans in the challenging task of learning from few annotated examples. Even though multiple computer vision themes can be identified throughout this work, the most important is the limited data regime taken into account. We consider the few-shot learning setting in which computer vision tasks with limited data, each with its support and query test data, are received and trained on in episodes. First, we introduce a novel few-shot meta-learning classification model that consists of multiple learners supervised by a central controller, which controls feature extraction and meta-learning for integrated inference and generalisation. Secondly, we introduce a few-shot object detection approach that detects and recognises new object instances by meta-learning object localisation and classification in a unified manner, eliminating region-wise prediction and encoding both support images and query images into class-specific features that then enter a class-agnostic decoder to generate predictions for the specific categories.
We also introduce a fully convolutional model for panoptic segmentation in few-shot learning settings that encodes each instance into a specific kernel and generates a prediction directly by convolution, thereby predicting both instance objects and background stuff together. In this way, the instance-aware and semantically consistent properties of object instances and their background, respectively, can both be satisfied in a unified workflow. Finally, we introduce a two-stage knowledge distillation model that maximises the entropy of the feature embeddings of images using a self-supervised auxiliary loss. Experiments on public few-shot learning benchmark datasets, such as miniImageNet, Omniglot, CIFAR-FS and Oxford Flowers102 for image classification, Pascal 5i and COCO-20i for object detection, and Mapillary Vistas for panoptic segmentation, demonstrate the performance limits and the effectiveness of the proposed few-shot learning methods. This thesis aims to close the gap between conventional deep learning and human learning by creating computer vision systems that learn from few examples of image data. zu
dc.description.abstract (Translated from Sesotho sa Leboa) This thesis presents and applies several novel methods that address the computer vision problems of image classification, object detection, self-supervised knowledge distillation and panoptic segmentation in few-shot learning settings. The aim of few-shot learning is to close the gap between deep learning models that acquire knowledge from annotated datasets and humans in the challenging task of learning from few annotated examples. Even though multiple computer vision themes can be identified throughout this work, the most important is the limited data regime taken into account. We consider the few-shot learning setting in which computer vision tasks, each with its support data and query test data, are received in episodes. We begin by introducing a few-shot meta-learning classification model with multiple learners supervised by a central controller, which controls feature extraction and meta-learning for integrated inference and generalisation. Secondly, we introduce a few-shot object detection method that detects and recognises new object instances through meta-learning and classification in a unified manner, eliminating region-wise prediction and encoding support images and query images into class-specific features that pass to a class-agnostic decoder to generate predictions for specific categories. We also introduce a complete model for panoptic segmentation using few-shot learning that encodes instances into a specific kernel and generates predictions directly, thereby predicting both kinds of object together. In this way, the instance-aware and semantically consistent properties of object instances and their background can both be satisfied in a unified workflow.
Finally, we introduce a two-stage knowledge distillation model that maximises the entropy of the feature embeddings of images using a self-supervised auxiliary loss. Experiments on public few-shot learning benchmark datasets, such as miniImageNet, Omniglot, CIFAR-FS and Oxford Flowers102, Pascal 5i and COCO-20i for object detection, and Mapillary Vistas for panoptic segmentation, demonstrate the performance limits and the effectiveness of the proposed few-shot learning methods. The aim of this thesis is to close the gap between conventional deep learning and human learning by creating computer vision systems that learn from few examples of image data. nso
dc.format.extent 1 online resource (xxxiv, 214 leaves) : color illustrations, color graphs, color images
dc.language.iso en en
dc.subject Few-shot learning en
dc.subject Image classification en
dc.subject Object detection en
dc.subject Knowledge distillation en
dc.subject Panoptic segmentation en
dc.subject Deep neural networks en
dc.subject Meta-learning en
dc.subject Metric learning en
dc.subject Image processing en
dc.subject Indlela yokufunda yomshini zu
dc.subject Ukuhlukaniswa kwemifanekiso zu
dc.subject Ukutholwa kwento zu
dc.subject Ukuhluzwa kolwazi zu
dc.subject Indlela yokuhlukanisa umfanekiso osetshenziselwa imisebenzi yokubona yekhompyutha zu
dc.subject Ikilasi lokufunda ngomshini zu
dc.subject Ukufunda ukufunda zu
dc.subject Ukufunda umsebenzi webanga phezu kwezinto zu
dc.subject Inqubo yomfanekiso zu
dc.subject Tlhopho ya diswantsho nso
dc.subject Temogo ya dilo nso
dc.subject Phetiso ya tsebo nso
dc.subject Karogantsho ya dilo nso
dc.subject Dineteweke tsa nyurale ye e tseneletsego nso
dc.subject Thuto ya metriki nso
dc.subject Peakanyo ya diswantsho nso
dc.subject.ddc 006.37
dc.subject.lcsh Computer vision en
dc.subject.lcsh Machine learning en
dc.subject.lcsh Neural networks (Computer science) en
dc.subject.lcsh Image processing -- Digital techniques en
dc.subject.lcsh Image analysis -- Data processing en
dc.subject.lcsh Detectors en
dc.title Few-shot learning for image classification and object detection en
dc.title.alternative Indlela yokufunda yomshini yokuhlukaniswa kwemifanekiso nokutholwa kwento zu
dc.title.alternative Few-shot learning ya tlhopho ya diswantsho le temogo ya dilo nso
dc.type Thesis en
dc.description.department School of Computing en
dc.description.degree Ph. D. (Computer Science)

