Deep learning has been applied successfully in computer vision, including image classification,
object recognition and detection, and image segmentation, in applications
such as remote sensing, scene understanding, autonomous driving, medical image analysis,
robotics and video surveillance. The drawback of most current approaches
is that they demand large quantities of annotated training data to produce good results, and they
require expensive computing resources. Data annotation is usually an expensive and
tedious task. Moreover, data can be scarce or difficult to gather for reasons
that include safety and ethical concerns. In addition, a deep learning model trained successfully
for a specific task cannot be directly deployed for another task in another domain.
It is therefore essential to develop models that can learn from few annotated
training samples, as humans do. Few-shot learning addresses the problem of closing the
gap between deep learning models that learn from huge annotated datasets and humans in the
challenging task of learning from few examples. The aim of this thesis is to propose novel
deep learning methods for image processing that optimise the model's ability to detect and
recognise new object instances from few labelled examples.
We present several novel methods that tackle the problems of image classification, object
detection, self-supervised knowledge distillation, and panoptic segmentation in few-shot
learning settings. Even though multiple computer vision themes can be identified
throughout this work, the most important is the limited data regime taken into account.
We consider the few-shot learning setting in which tasks, each with its own support and
query data, are received and trained on in episodes. First, we introduce a novel few-shot
meta-learning classification model that consists of multiple learners supervised by a central
controller, which coordinates feature extraction and meta-learning for integrated inference and
generalisation. Secondly, we introduce an approach for few-shot object detection that
meta-learns object localisation and classification by eliminating region-wise prediction
and encoding support images and query images simultaneously into class-specific feature
representations, which are passed directly into a class-agnostic decoder to generate
output predictions for the categories known beforehand. We also introduce a fully convolutional
model for panoptic segmentation in few-shot settings that encodes each instance
into a specific kernel and generates a prediction directly by convolution, thereby predicting
both instance objects and background stuff together. In this way, the instance-aware
property required for object instances and the semantic consistency required for their
background are both satisfied in a unified workflow. Finally, we introduce a two-stage knowledge
distillation model that maximises the entropy of the feature embeddings of images using a
self-supervised auxiliary loss. Experiments on public few-shot learning benchmark
datasets such as miniImageNet, Omniglot, COCO-20i and Mapillary Vistas demonstrate
the effectiveness of the proposed methods for few-shot learning in computer vision.
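
The episodic setting described above, in which each task arrives with its own support and query data, can be illustrated with a minimal sketch of N-way K-shot episode sampling. This is an illustration under stated assumptions and not the sampling code used in this thesis: the function name `sample_episode`, the `pool` mapping from class names to sample identifiers, and the parameters `n_way`, `k_shot` and `n_query` are all hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple

# (sample identifier, episode-local label in 0..n_way-1)
Example = Tuple[str, int]


def sample_episode(
    pool: Dict[str, Sequence[str]],
    n_way: int,
    k_shot: int,
    n_query: int,
    rng: random.Random,
) -> Tuple[List[Example], List[Example]]:
    """Draw one few-shot task: a support set and a disjoint query set.

    `pool` is a hypothetical mapping from class name to the identifiers
    of the samples available for that class.
    """
    need = k_shot + n_query
    # Only classes with enough samples can fill both support and query sets.
    eligible = sorted(c for c, items in pool.items() if len(items) >= need)
    if len(eligible) < n_way:
        raise ValueError("not enough classes with sufficient samples")

    classes = rng.sample(eligible, n_way)
    support: List[Example] = []
    query: List[Example] = []
    for label, cls in enumerate(classes):
        # Sampling without replacement keeps support and query disjoint.
        picked = rng.sample(list(pool[cls]), need)
        support += [(s, label) for s in picked[:k_shot]]
        query += [(s, label) for s in picked[k_shot:]]
    return support, query


if __name__ == "__main__":
    # Toy pool: 10 classes with 20 sample identifiers each (made-up data).
    toy_pool = {f"class_{c}": [f"class_{c}/img_{i}" for i in range(20)]
                for c in range(10)}
    support_set, query_set = sample_episode(
        toy_pool, n_way=5, k_shot=1, n_query=3, rng=random.Random(0))
    print(len(support_set), "support samples,", len(query_set), "query samples")
```

In such a scheme a model would adapt on the support set and be evaluated on the query set of the same episode, so that training repeatedly mimics the low-data condition encountered at test time.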

Le ndaba ende yethula futhi ithuthukise izifanekiso zamanoveli amaningana abhekana
nezinkinga zokubono zekhompyutha zokuhlukaniswa kwemifanekiso, ukutholwa kwezinto,
ukuhluzwa kolwazi oluzigadile kanye nendlela yokuhlukanisa umfanekiso osetshenziselwa
imisebenzi yokubona yekhompyutha eqoqweni lendlela yokufunda yomshini. Indlela
yokufunda yomshini ihlose ukuvala igebe phakathi kwezifanekiso zokufunda ezijulile
ezifunda eqoqweni elikhulu lemininingwane yolwazi ehlobene, ezinezichasiselo
nakubantu emsebenzini oyinselele wokufunda ezibonelweni ezimbalwa ezinezichasiselo.
Ngisho noma izindikimba eziningi zokubono zekhompyutha zingabonakala kuwo wonke
lo msebenzi, okubaluleke kakhulu uhlelo lwemininingwane olulinganiselwe olucatshangelwayo.
Sicabangela iqoqo lendlela yokufunda yomshini lapho imisebenzi yokubona yekhompuyutha
inemininingwane elinganiselwe ehlotshaniswa nokusekelwa kwayo kanye neminingwane
yokuhlola imibuzo iyatholwa futhi iqeqeshwe ngeziqephu. Okokuqala sethula
isifanekiso sokuhlukaniswa kokufunda ukufunda kwenoveli okumbalwa okuhlanganisa
abafundi abaningi abagadwe yisilawuli esimaphakathi ukuze kulawulwe ukukhishwa kwesici
kanye nokufunda ukufunda ukuze kufinyelelwe ekucabangeni okudidiyelwe kanye nokujwayelekile.
Okwesibili, sethula indlela yokuthola izinto zendlela yokufunda yomshini
ethola futhi ebona izimo zento entsha ngokufunda ukufunda into ibe yasendaweni kanye
nokuhlukaniswa ngendlela ebumbene, ngokususa ukuqagela okuhlakaniphile kwesifunda kanye
nokubhala ngekhodi kokubili imifanekiso esekelayo nemibuzo yemifanekiso kube
isigaba esithize sezici ezibese zingena kudivayisi ejwayelekile yesigaba ukuze sikhiqize
izibikezelo zezigaba ezithile. Siphinde sethula isifanekiso somphumela wokuhlunga inhloso
evamile yemifanekiso ngokugcwele ngendlela yokuhlukanisa umfanekiso osetshenziselwa
imisebenzi yokubona yekhompyutha eqoqweni lendlela yokufunda yomshini elihlanganisa
isenzakalo ngasinye sibe uhlamvu oluthile futhi sikhiqize ukubikezela ngokuhlunga
inhloso evamile yemifanekiso ngokuqondile, ngaleyo ndlela ibikezele kokubili izinto
eziyisibonelo nezinto zasemuva ndawonye. Ngale ndlela, izakhiwo eziqaphelayo nezingaguquguquki
ngokwezibalo zezenzakalo zento kanye nengemuva lazo zinganeliswa ngokulandelana
kwazo ekuhambeni komsebenzi okuhlangene. Ekugcineni, sethula isifanekiso
sezigaba ezimbili zolwazi oluhluziwe esenza isimo sokuphazamiseka sibe sikhulu sesici
esishumekiwe semifanekiso kusetshenziswa ukulahlekelwa komsizi ozigadile. Izivivinyo
kwamanye amaqoqo emininingwane endinganiso yendlela yokufunda yomshini ezinjengeminiImageNet,
i-Omniglot, i-CIFAR-FS ne-Oxford Flowers102 yokuhlukaniswa kwemifanekiso,
i-Pascal 5i ne-COCO-20i yokuthola into, kanye ne-Mapillary Vistas yokuhlukaniswa
kwendlela yomfanekiso osetshenziselwa imisebenzi yokubona yekhompyutha
ibonisa imingcele yokusebenza kanye nempumelelo yezindlela ezihlongozwayo zendlela
yokufunda yomshini. Le ndaba ende ihlose ukuvala igebe phakathi kokufunda okujulile
okujwayelekile nokufunda komuntu ngokudala izinhlelo zokubona zekhompyutha ezifunda
ezibonelweni ezimbalwa zemininingwane yemifanekiso.

Thesese ye e laetsa le go somisa mekgwa e meswa e mmalwa yeo e somanago le
mathata a pono ya khomphutha a tlhopho ya diswantsho, temogo ya dilo, phetiso ya tsebo
ya boitekolo le karogantsho ya dilo ka maemong a few-shot learning. Maikemisetso a
few-shot learning ke go tswalela sekgoba magareng ga mehuta ya go tsenelela ya go ithuta
yeo e hwetsago tsebo go tswa ditlhalosong tse di filwego tsa dihlopha tsa tshedimoso
le batho ka mosomo o boima wa go ithuta ka mehlala ye mmalwa ye e hlalositswego.
Le ge dikgwekgwe tsa pono ya dikhomphutha tse ntsi di ka bonwa mosomong wo ka
moka, se bohlokwa kudu ke mokgwa wo o lekaneditswego wa datha wo o etswego hloko.
Re ela hloko maemo a few-shot learning moo mesongwana ya pono ya khomphutha ya
go ba le datha ya thekgo ya yona le datha ya teko ya dipotsiso di amogetswe gape ka
ditiragalo. Re thoma ka go tsebagatsa mmotlolo wa tlhopho ya few-shot learning le meta-learning
tseo di nago le baithuti ba bantsi bao ba hlokomelwago ke molaodi wa bogare
go laola go tloswa ga dilo le meta-learning sephetho se se kopantswego le kakaretso.
Sa bobedi, re tsebisa mokgwa wa temogo ya dilo wa few-shot wo o lemogang le go
amogela mehlala ya dilo tse diswa ka meta-learning le tlhopho ka mokgwa wa botee, ka
go tlosa kakanyo ya kgakanego le go fetolela diswantsho tseo di thekgang le diswantsho
tsa potsiso go dibopego tsa karolo ye e ikgethileng go karolo ya tlhathollo ya go se kgodise
go tsweletsa dikakanyo tsa dikarolo tse itsego. Re tsebisitse gape mokgwa wo o feletsego
wa go latela dilo tsa go raragana ka go somisa few-shot learning go romela dilo lifelong le itsego gomme tsa tsweletsa kakanyo ya dilo thwii, gomme ya fa kakanyo ya ditiragalo
tsa dilo tse pedi mmogo. Ka tsela ye, mehlala ya mokgwa wo le dilo tseo di latelanago tsa semanthiki tsa mesomo ya dilo le botso bja tsona di ka latelana gabotse go mesomo ye e kopantswego. Mafelelong re tsebisa mmotlolo wa phetiso ya tsebo wa dikgato tse pedi woo o kaonafatsago maemo a go raragana ga dilo tse di lokelwago diswantshong ka go somisa tahlegelo ya thekgo ya boihlokomelo. Boitekelo godimo ga dihlopha tsa datha tse dingwe tsa bohle tsa tekanetso ya few-shot learning bjale ka miniImageNet, Omniglot, CIFAR-FS le Oxford Flowers102, Pascal 5i le COCO-20i ya temogo ya dilo, le Mapillary Vistas ya karogantsho ya dilo go laetsa magomo a tshepediso le go soma gabotse ga mekgwa ye e sisintswego ya few-shot learning. Maikemisetso a thesese ye ke go tswalela sekgoba magareng ga thuto ye e tseneletsego ya tlwaelo le go ithuta ga batho ka go hlama mananeo a pono a khomphutha ao a ithutago go tswa mehlaleng ye mmalwa ya datha ya diswantsho.