Training strategies for architecture-specific recurrent neural networks

dc.contributor.author Ludik, J
dc.contributor.author Cloete, I
dc.date.accessioned 2018-05-30T13:37:56Z
dc.date.available 2018-05-30T13:37:56Z
dc.date.issued 1995
dc.identifier.citation Ludik J & Cloete I (1995) Training strategies for architecture-specific recurrent neural networks. South African Computer Journal, Number 14, 1995 en
dc.identifier.issn 2313-7835
dc.identifier.uri http://hdl.handle.net/10500/24174
dc.description.abstract The typical approach taken by other researchers to address the defects of standard backpropagation is to investigate alternative methods for selecting initial parameter values, or for adjusting parameter values during BP training. We have suggested that the training strategy, that is, the method of presenting examples to the network during learning according to some performance criterion, is a viable alternative method to produce an effective solution within a feasible time. The dual purpose was to evaluate training strategies and architecture-specific recurrent neural networks (ASRNNs) simultaneously for different applications, varying in complexity and ranging from sequence recognition to sequence generation. It was demonstrated with six different ASRNNs and feedforward networks that for several applications, Increased Complexity Training (ICT) outperformed Combined Subset Training (CST) and Fixed Set Training (FST). We have also compared several ASRNNs for the Counting and Addition tasks, and found the Output-to-Hidden Hidden-to-Hidden ASRNN, one of our proposed ASRNNs, to be the best-performing architecture, outperforming the conventional ASRNNs by between 44% and 87% across the training strategies. For both ICT and CST the method of determining the required RMS termination criterion per subset was unsatisfactory, since it required experimentation instead of being performed algorithmically. To address this need, we have proposed incremental training strategies, Incremental Subset Training (IST) and Incremental Increased Complexity Training (IICT), which improved the convergence rate compared to CST, FST, and even ICT. We have also proposed six Delta training strategies by first employing the Delta Ranking Method, which determines the complexity relation between the input patterns by obtaining their inter-pattern distances and then ranking them according to some scheme. We have introduced three basic ranking schemes, which led to Smallest Delta Subset Training (SDST), Largest Delta Subset Training (LDST) and Alternating Delta Subset Training (ADST), their incremental versions, and also their epoch versions. All the Delta training strategies proved to be very effective (evaluated with different applications) in reducing the training time when compared to the conventional strategies. Incremental Delta training strategies performed the best overall, signalling that ordering the training patterns according to our proposed delta ranking schemes, especially when presented in an incremental fashion, forces the network to discriminate between classes early in the training process, leading to reduced training time. The training strategies should be regarded as different tools in a training strategy toolbox, each one suited to its particular purpose. en
dc.language.iso en en
dc.publisher South African Computer Society (SAICSIT) en
dc.subject Training strategies en
dc.subject Architecture-specific recurrent neural networks en
dc.subject Incremental en
dc.subject Increased complexity en
dc.title Training strategies for architecture-specific recurrent neural networks en
dc.type Article en
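
The Delta Ranking Method described in the abstract (ranking input patterns by their inter-pattern distances and presenting them in incrementally growing subsets) can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the Euclidean distance, the mean-delta ranking, the alternating-order interpretation, and the fixed subset-growth step are all assumptions made for the example.

# Minimal sketch of the Delta Ranking idea described in the abstract (not the
# authors' code). Patterns are ranked by inter-pattern distance and then
# presented to the learner in incrementally growing subsets.
import numpy as np

def delta_rank(patterns, scheme="smallest"):
    """Order pattern indices by mean Euclidean distance to the other patterns
    (an assumed 'delta'); scheme selects smallest-, largest- or alternating-first."""
    X = np.asarray(patterns, dtype=float)
    diffs = X[:, None, :] - X[None, :, :]            # pairwise differences
    dists = np.sqrt((diffs ** 2).sum(axis=-1))       # N x N distance matrix
    mean_delta = dists.sum(axis=1) / (len(X) - 1)    # mean distance per pattern
    order = np.argsort(mean_delta)                   # smallest delta first (SDST)
    if scheme == "largest":                          # LDST: largest delta first
        order = order[::-1]
    elif scheme == "alternating":                    # ADST: assumed interpretation,
        lo, hi = 0, len(order) - 1                   # alternating small and large deltas
        alt = []
        while lo <= hi:
            alt.append(order[lo]); lo += 1
            if lo <= hi:
                alt.append(order[hi]); hi -= 1
        order = np.array(alt)
    return order

def incremental_subsets(order, step):
    """Yield incrementally growing index subsets along the ranked order,
    mimicking the incremental presentation of training patterns."""
    for end in range(step, len(order) + step, step):
        yield order[:end]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patterns = rng.normal(size=(20, 4))              # 20 toy input patterns
    for subset in incremental_subsets(delta_rank(patterns, "smallest"), step=5):
        # Train the network on patterns[subset] until a per-subset criterion is met.
        print(len(subset), "patterns in current training subset")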

